## Clustering with the Knowm API

This article outlines a method for clustering spike patterns on kT-RAM using our AHaH instruction set. Clustering on a neuromorphic chip offers a computing system several advantages. First, algorithms that can be realized with AHaH computing will likely benefit from speed and power-efficiency improvements, making them well suited to large datasets or to running alongside iteratively adapting algorithms. Second, the low energy cost of clustering on kT-RAM lets us implement a learning procedure in energy-constrained environments such as onboard satellite systems, remote sensors, and large-scale data centers.

As in the other tutorials, this article series uses our chip emulator, the Knowm API, to introduce the method in code, which we hope helps solidify understanding. Those interested can access the full code base by signing up for the Knowm Developer Community.

## Spike Representation

The first step in clustering generic data sets on kT-RAM is to convert all values into a representation called a spike pattern.

[latex isblock=true] x \rightarrow \{ 1, 0, 0, 1, 1, 1, \ldots 0, 0, 1 \} [/latex]

Spike conversion is a translation into a discrete representation. Clustering depends heavily on our choice of representation, so it is important to understand this translation: what data loss occurs, and what structure does the conversion generate? In the following examples, we explore the effect of encoding schemes on clustering.
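To make the data-loss question concrete, here is a minimal sketch of one possible encoding, assuming a simple fixed-width binning scheme. `BinSpikeEncoder` and its methods are hypothetical names for illustration and are not part of the Knowm API; the encoders used in this series may work differently.

```java
/** Hypothetical binning encoder for illustration; not part of the Knowm API. */
final class BinSpikeEncoder {

  private final double min;
  private final double max;
  private final int bins;

  BinSpikeEncoder(double min, double max, int bins) {
    if (!(max > min) || bins < 1) {
      throw new IllegalArgumentException("need max > min and bins >= 1");
    }
    this.min = min;
    this.max = max;
    this.bins = bins;
  }

  /** Index of the bin (spike) a value falls into; out-of-range values are clamped. */
  int spikeIndex(double value) {
    double clamped = Math.max(min, Math.min(max, value));
    int bin = (int) ((clamped - min) / (max - min) * bins);
    return Math.min(bin, bins - 1); // value == max belongs to the last bin
  }

  /** Dense 0/1 spike pattern of length `bins` with exactly one active spike. */
  int[] encode(double value) {
    int[] pattern = new int[bins];
    pattern[spikeIndex(value)] = 1;
    return pattern;
  }
}
```

With four bins over the range 0 to 1, the values 0.26 and 0.30 both land in bin 1 and become indistinguishable after encoding. That is the kind of data loss, and the kind of imposed structure, that an encoding choice brings with it.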

## Spike Clustering

kT-RAM clustering is performed by spiking a set of AHaH nodes, each accepting the same pattern, and reading their output as a cluster label.

[latex isblock=true] \text{Clusterer} = \{ A_1, A_2, A_3, \ldots, A_{N-1}, A_N \} [/latex]

While iterating through each input spike pattern from a training set or continuous stream, these AHaH nodes are adapted with an unsupervised Hebbian instruction pair, `FF-RU`.

The `FF-RU` instruction pairing forces each AHaH node into a randomly determined attractor state, which represents a decision boundary in the spike space.

We showed in our PLOS paper that these attractor states were maximum-margin decision boundaries and that they were formed between the Independent Components of our spike patterns.

Once trained, kT-RAM clusters spike patterns by linking those patterns that activate the same set of AHaH nodes during a Feed-Forward (`FF`) operation. A unique cluster label can then be associated with each set.

Suppose we have N AHaH nodes on kT-RAM:

[latex isblock=true] \text{kT-RAM} = \{ A_1, A_2, A_3, \ldots, A_{N-1}, A_N \} [/latex]

Each AHaH node has created a linear decision boundary in the spike space. Thus, when a Feed-Forward operation is executed on a particular pattern, the signed outputs of these AHaH nodes represent a cluster.

For instance, the activations

[latex isblock=true] \text{activations} = \{ 0.0, -0.1, 0.2, \ldots, 0.1, 0.3 \} [/latex]

are thresholded with a Heaviside step function H() to obtain a new spike pattern P:

[latex isblock=true] P = H(\text{activations}) = \{ 0, 0, 1, \ldots, 1, 1 \} [/latex]

`P` can be read in binary to obtain a single integer cluster label.
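The thresholding and label read-out can be sketched in a few lines of plain Java. This is an illustrative sketch only: `SpikeLabel` is a hypothetical class, the step function uses H(0) = 0 to match the example above, and treating the first element as the most significant bit is an assumption, since the bit order is not specified here.

```java
/** Hypothetical helper for illustration; not part of the Knowm API. */
final class SpikeLabel {

  /** Heaviside step applied element-wise, with H(0) = 0 as in the example above. */
  static int[] threshold(double[] activations) {
    int[] pattern = new int[activations.length];
    for (int i = 0; i < activations.length; i++) {
      pattern[i] = activations[i] > 0 ? 1 : 0;
    }
    return pattern;
  }

  /** Reads a 0/1 pattern as a binary integer, first element most significant (assumed). */
  static int label(int[] bits) {
    if (bits.length > 31) {
      throw new IllegalArgumentException("at most 31 bits fit in a non-negative int");
    }
    int label = 0;
    for (int bit : bits) {
      label = (label << 1) | bit;
    }
    return label;
  }

  public static void main(String[] args) {
    double[] activations = {0.0, -0.1, 0.2, 0.1, 0.3};
    int[] p = threshold(activations);   // {0, 0, 1, 1, 1}
    System.out.println(label(p));       // prints 7
  }
}
```

Any fixed bit order works, as long as the same order is used for every pattern, because the label only needs to be unique per set of active nodes.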

## Discussion

The number of clusters is not predefined in this clustering method. Instead, the total number of possible output labels from the AHaH collective is `2^N`, where `N` is the number of AHaH nodes in the collective. This does not guarantee that the collective outputs `2^N` unique labels, since some AHaH nodes are likely to pick up the same independent components and find the same decision boundary. This can occur if the number of components, `N`, is small and/or the number of patterns, `F`, is high. However, as the number of AHaH nodes increases, the probability of this occurring drops exponentially.
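As a small illustration of the gap between possible and observed labels, the sketch below groups patterns by their integer label and compares the number of clusters actually used with the `2^N` upper bound. `LabelCount` and its methods are hypothetical names, and the labels in `main` are made-up inputs rather than results from kT-RAM.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for illustration; not part of the Knowm API. */
final class LabelCount {

  /** Upper bound on the number of distinct labels for n AHaH nodes: 2^n. */
  static long maxLabels(int n) {
    if (n < 0 || n > 62) {
      throw new IllegalArgumentException("n must be in [0, 62]");
    }
    return 1L << n;
  }

  /** Maps each label to the indices of the patterns that received it. */
  static Map<Integer, List<Integer>> groupByLabel(int[] labels) {
    Map<Integer, List<Integer>> clusters = new TreeMap<>();
    for (int i = 0; i < labels.length; i++) {
      clusters.computeIfAbsent(labels[i], k -> new ArrayList<>()).add(i);
    }
    return clusters;
  }

  public static void main(String[] args) {
    int[] labels = {7, 2, 7, 7, 2}; // made-up labels for five patterns
    Map<Integer, List<Integer>> clusters = groupByLabel(labels);
    System.out.println(clusters.size() + " of " + maxLabels(10) + " possible labels used");
  }
}
```

Patterns that share a label form one cluster, so the number of keys in the map is the number of clusters the collective actually produced.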

We’ve found this method to be quite malleable. By separating our clustering approach and our encoding schemes we separate the notion of “similarity” from our clustering method. Instead, a practitioner can focus on creating his or her descriptive spike encodings before applying clustering on kT-RAM.

A KNN encoding method in conjunction with the above clustering method produces results similar to the K-Means algorithm. Likewise, a density-based encoding can be used to create clusterings similar to those of a density-based method like DBSCAN.

## The Partitioner

A simple implementation of the above method can be achieved with the Knowm API. Our implementation, contained in `Partitioner.java`, includes the bare-bones creation and training operations we need for kT-RAM clustering.

If you have the code you can use it to cluster spike patterns like this:

```java
int numAHaHNodes = 10;        // Number of AHaH nodes in our collective
int spikeSpace = 100;         // Size of our spike patterns
String id = "partitioner1";   // Unique id for our partitioner
KTRAM ktram = new KTRAM21D(DigitalCoreType.BYTE); // kT-RAM object

Partitioner spikeClusterer = new Partitioner(spikeSpace, numAHaHNodes, id, ktram);

int[] exampleSpikePattern = new int[] { 1, 3, /* ... */ 88 };
int[] activeNodes = spikeClusterer.encode(exampleSpikePattern);
int label = BinaryEncoder.encode(activeNodes); // extract integer label
```

The class is available here: `org.knowm.knowmj.module.encoder._spikes`.

If you look at the class declaration, you’ll notice that the partitioner implements our generic encoder interface. It just so happens that the spikes it emits represent our clusters.

```java
public class Partitioner implements Encoder
```

One way to read this output is as group labels in binary format. We could also feed the outputs of the Partitioner into another machine learning module or into another partitioner.

We instantiate our partitioner as follows:

```java
/**
 * @param spikeSpace : size of each AHaH node
 * @param numAHaHNodes : number of AHaH nodes
 * @param id : unique id for this partitioner
 * @param ktram : kT-RAM object the AHaH nodes live on
 */
public Partitioner(int spikeSpace, int numAHaHNodes, String id, KTRAM ktram) {

  this.numAHaHNodes = numAHaHNodes;
  this.streamLength = spikeSpace; // size of the spike patterns each node accepts
  this.pId = id;
  this.ktram = ktram;
  this.joiner = new SpikeStreamJoiner(numAHaHNodes, 1);
  createAHaHNodes();
}
```

As we can see, our partitioner is composed of a number of AHaH nodes. During clustering, each of these nodes accepts the same spike patterns and performs an `FF-RU` instruction. The partitioner thresholds the outputs and joins them together.

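```java
@Override
public int[] encode(int[] spikes) {

  TIntHashSet active = new TIntHashSet();
  for (int i = 0; i < numAHaHNodes; i++) {
    // NOTE: this listing is truncated at this point in the source. The missing
    // statements execute the FF-RU instruction pair on AHaH node i for `spikes`
    // and read its output, referred to here by the placeholder name `y`.
    if (y > 0) {
      active.add(i);
    }
  }
  // Return all active AHaH nodes
  return active.toArray();
}
```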
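For readers without access to the code base, the control flow of `encode` can be mimicked in plain Java. The sketch below is a stand-in under loud assumptions: `ToyPartitioner` is a hypothetical class, each "node" is just a fixed random weight vector summed over the active spike indices, and no kT-RAM, `FF-RU` adaptation, or Knowm API call is involved. It only shows the loop, the threshold, and the collection of active node indices.

```java
import java.util.Arrays;
import java.util.Random;

/** Stand-in for illustration only; not Knowm's Partitioner and not trained. */
final class ToyPartitioner {

  private final double[][] weights; // weights[node][spike index]

  ToyPartitioner(int spikeSpace, int numNodes, long seed) {
    Random rng = new Random(seed);
    weights = new double[numNodes][spikeSpace];
    for (double[] nodeWeights : weights) {
      for (int j = 0; j < nodeWeights.length; j++) {
        nodeWeights[j] = rng.nextGaussian(); // fixed random weights, no learning
      }
    }
  }

  /** Sum of one node's weights over the active spike indices. */
  double activation(int node, int[] spikes) {
    double y = 0.0;
    for (int s : spikes) {
      y += weights[node][s];
    }
    return y;
  }

  /** Indices (ascending) of all nodes whose activation is positive. */
  int[] encode(int[] spikes) {
    int[] buffer = new int[weights.length];
    int count = 0;
    for (int i = 0; i < weights.length; i++) {
      if (activation(i, spikes) > 0) {
        buffer[count++] = i;
      }
    }
    return Arrays.copyOf(buffer, count);
  }
}
```

Because every node sees the same spike pattern and the weights are fixed, the same pattern always yields the same set of active nodes, which is what lets that set serve as a cluster label.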

## Further Reading

We’ve gone over kT-RAM clustering at a high level and introduced the Knowm API code. The next article uses this class and a K-nearest neighbors encoder to demonstrate clustering on a 2D distribution.

