Frequently Asked Questions and Comments

“What you can imagine depends on what you know.” –Daniel C. Dennett

Additional Educational Resources

Reddit Forum

I’ve read your PLOS Paper but I’m lost. Could you distill AHaH Computing into a sentence or two?

There exists a computational building block or primitive that is formed of ‘energy dissipating pathways competing for conduction resources’. It manifests at all scales of Nature, from neurons to river basins. You can realize this building block efficiently in electronic systems with memristors. Unlike traditional computing, this building block combines or ‘mixes’ memory and processing. We call the building block an “AHaH Node”. Just as the transistor is the basis of many circuits, AHaH Nodes are the basis of many circuits. The paper shows how AHaH Nodes can be used as the basis of general-purpose computing and machine learning. Since AHaH Nodes mix memory and processing, they are very efficient for memory-processing intensive operations like learning.

You can obviously call your project whatever you want but please don’t use the term ‘knowm’ to refer to the same phenomenon occurring in Nature. Please just use some descriptive term. I don’t think you guys deserve naming rights for this phenomenon.

We are truly sorry if how we have named things is upsetting. There are two issues. First, there is the name for the “thing in Nature”. Second, there is the name of our organization.

In regards to the ‘thing in Nature’, there are two problems with using a descriptive term. First, it is cumbersome in dialog. This leads to acronyms, which are also cumbersome and very confusing to newcomers. Second, when you are trying to build a technology/science around something like ‘Knowm’, descriptive terms are usually not quite right. We mean something specific with the word “Knowm”, and we believe ‘it’ needs a new name to distinguish ‘it’ from all the other terms people throw around which describe part, but not all, of what we are talking about. ‘It’ is not just a fractal, or something that resembles a tree or a neuron or X, Y or Z. A quick definition could perhaps be something like “the adaptive energy dissipation (living) structure found throughout Nature that is the result of vascularization”, but even this is not quite right. As we gain more knowledge of Knowm, we will ascribe more meaning to it. However, for this to happen, ‘it’ needs a name. Ideally a name with little to no scientific baggage.

We chose the word “Knowm” for the following reasons: (1) It’s short, new and easy to remember with no pre-existing scientific associations. (2) It denotes ‘a thing’. Since we believe that Knowms are alive (yes, even rivers and lightning), it makes sense to talk about it like that. (3) The word rhymes with Ohm, which is the unit of resistance. Knowm is a flow-structure, and our work deals with utilizing memristors to mimic Knowm’s building block, something we call Knowm’s Synapse or Nature’s Transistor. (4) ‘Know’ denotes knowledge. Our work primarily deals with building adaptive learning machines, so this fits. We also believe there is an intrinsic physical intelligence in Knowms, as they actively search to find energy-dissipating pathways. We believe knowledge is the product of intelligence and related to this process. (5) Knowm sounds like the friendly fairy-tale characters called ‘Gnomes’, which are the protectors of Nature that live in trees. (6) It rhymes with the mantra ‘Om’, which is said to be “the sound of the universe” and has a significant spiritual meaning to many people. We have also found that many ascribe spiritual significance to Knowms. Since we see Knowms everywhere in Nature, to the point where it appears as if Nature may itself be built of them, we feel the name honors this spiritual or ‘holistic’ sense. To be clear, we are not in search of or proponents of spirituality. We are scientists and engineers in search of knowledge and technology who have a deep respect for both Nature and people.

As for naming our company after it, we felt that it made sense. We coined the name ‘Knowm’ long before we formed the company. Our affiliated company “KnowmTech” was formed in 2001, for example. Since our mission is to facilitate a greater understanding of Knowm (or whatever you want to call it!) and its applications to technology, particularly in neuromemristive processors, we feel the name is appropriate and we do not mean any disrespect.

Most of us at Knowm Inc agree with Nobel laureate physicist Richard Feynman: “I learned very early the difference between knowing the name of something and knowing something.” That is, a name has no real meaning in itself and ultimately does not matter much to understanding. It’s the meaning we ascribe to the name that matters. What really matters to us is how things work, not what things are called. We would very much like “it” (Knowm) to have a unique name so that we can ascribe meaning to it because “it”, in our opinion, is its own thing. Until ‘it’ has a unique name, it’s going to be hard to study ‘it’ because folks will not understand specifically what we are talking about. Hence we made up a name so we could get to work.

KT-RAM is in essence a memory storage system utilizing memristors to vary the signal response from a spike encoder. In what ways is KT-RAM different than normal RAM? More specifically, what makes KT-RAM so special?

It’s an analog synaptic processor. It reduces synaptic integration and adaptation to analog operations on memristors, thus saving the considerable amount of energy required to shuttle multiple bits back and forth between memory and processing. Each “bit” in kT-RAM is a multi-bit analog synaptic weight thanks to the differential pair of memristors.

For any spike pattern, why are both forward and backward instructions necessary? (Over-saturation? Does this mean excess voltage is left in the network?)

It has to do with saturation of the differential memristors. The synapse is encoded as the difference in conductance between the two memristors that form the pair: Gs=Ga-Gb. If you only ever apply a positive bias, both memristors will saturate and your state is lost. Same thing if you only apply negative voltage. So pairing the instructions keeps things working. You could also utilize natural decay if the memristors are ‘volatile’. That is, you could drive the conductance higher and then wait while their conductance comes back down or normalizes. One way or another, you have to prevent saturation in the differential pair.
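The saturation argument can be illustrated with a toy numerical sketch. Everything in it is an assumption made for illustration: the normalized conductance limits, the fixed step per pulse, and the rule that a pulse moves both memristors in the same direction. It is not a model of any real device; it only shows why one-directional drive erases the difference Gs=Ga-Gb while paired drive preserves it.

```python
G_MIN, G_MAX = 0.0, 1.0   # hypothetical normalized conductance limits
STEP = 0.05               # hypothetical conductance change per pulse

def clip(g):
    """Keep a conductance inside its physical limits."""
    return max(G_MIN, min(G_MAX, g))

def pulse(ga, gb, direction):
    """Apply one bias pulse: +1 raises both conductances, -1 lowers both."""
    return clip(ga + direction * STEP), clip(gb + direction * STEP)

def synapse(ga, gb):
    """Synaptic weight encoded as the conductance difference Gs = Ga - Gb."""
    return ga - gb

def run(ga, gb, directions):
    """Apply a sequence of pulses and return the final conductances."""
    for d in directions:
        ga, gb = pulse(ga, gb, d)
    return ga, gb

start = (0.6, 0.4)                            # initial weight Gs = 0.2
forward_only = run(*start, [+1] * 40)         # only ever positive bias
paired = run(*start, [+1, -1] * 20)           # forward and reverse paired

print("forward only:", synapse(*forward_only))  # both rails hit, state lost
print("paired:      ", synapse(*paired))        # state is retained
```

In this sketch the forward-only run drives both conductances to the upper limit, so their difference collapses to zero, while the paired run ends where it started.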

For a set of classifiers, the output seems to be a confidence level of 0 to 100%. Can the output provide multiple classifiers with each having a confidence level? (i.e. blue-green color or a mixed breed of dog.)

Absolutely. Given some spike stream (coming from feature learners), you can spin up an AHaH node (equal in size to the spike stream space) for each label. You can do this serially or in parallel, depending on the size and quantity of the cores.
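As a rough sketch of the one-node-per-label idea, the snippet below evaluates several idealized nodes on the same spike pattern. The weights, label names, logistic squash to a 0 to 1 "confidence", and the 0.5 cut-off are all hypothetical choices for illustration; they are not taken from the KnowmAPI.

```python
import math

def activation(weights, spikes):
    """Sum the weights on the spiking input lines (all other lines float)."""
    return sum(weights[i] for i in spikes)

def confidences(nodes, spikes):
    """Evaluate every label's node on the same spike pattern.

    The logistic squash to (0, 1) is an illustrative assumption.
    """
    return {label: 1.0 / (1.0 + math.exp(-activation(w, spikes)))
            for label, w in nodes.items()}

# One hypothetical node per label, each as wide as the spike space (4 lines).
nodes = {
    "blue":  [2.0, 1.5, -0.5, -2.0],
    "green": [-1.0, 1.5, 2.0, -0.5],
    "red":   [-2.0, -2.0, -1.0, 3.0],
}

spikes = {1, 2}                      # indices of the lines that spiked
scores = confidences(nodes, spikes)
labels = [name for name, c in scores.items() if c > 0.5]

print(scores)
print(labels)   # more than one label can pass the cut-off at once
```

Because each label has its own node, nothing forces the outputs to be mutually exclusive, which is what allows an answer like "blue and green".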

I’ve been led to believe the process is read-write only. Does every memory retrieval cause a change in memristor conductivity? I thought a small enough voltage wouldn’t alter the resistance.

A small enough voltage will not alter resistance, depending on the physics of the specific memristors. The low-power solution to adaptive learning involves understanding how to build a system where the parts break, because if your voltage is very low (and hence you are consuming low power) and you want to adapt at the same voltage, then your synapses will become volatile because the barrier potential between states will be of the same order as random thermal energy. If you can repair this constant damage, you get the low-power adaptive learning solution. Like your brain right now, which is basically a big hunk of volatile pudding. Our current memristor technology does provide for a non-destructive read, but our methodology (AHaH Computing) solves for the more general case and gives us a scaling path to much higher levels of adaptive efficiency. (Nobody appears to understand this, BTW). So we could set the core voltage below the forward adaptation voltage of our BSAFW memristors, say 0.1 V, and execute FF instructions to read without having to worry.

What is the relationship between KT-RAM and neural networks?

Neural Networks (the algorithms) are collections of linear neurons with non-linear activation functions that take real-valued inputs and multiply by real-valued weights. kT-RAM is a generic synapse resource that takes spike inputs (x=0 or x=1), and multiplies by real-valued weights to produce a real-valued output. (The spike-code is a hardware constraint, but AHaH nodes can in principle work with non-spike inputs as well.)

I understand how spike encoding works with 1 as a positive pulse and 0 as a negative pulse…

That’s not how spike encoding works! There is no such thing as “0 as a negative pulse”. That’s a binary code. A spike input line either carries a spike (1) or no spike (z). z means the line is electrically floating, which cannot transmit a state. Note that AHaH nodes can work with binary codes, as well as continuous inputs. Spike codes are nice for a number of reasons, including that they map to address spaces perfectly and (so far as we know) are universal and can achieve the same results as other encodings.

I can understand how the universal logic can be created by NAND and XOR gates. I don’t know what memristors would do differently though.

The same AHaH circuit can come to represent any logic gate except XOR, since XOR is non-linear and an AHaH node is a linear discriminator. However, you can combine multiple other logic gates together to achieve XOR, and the same is true for AHaH attractor states. So while you would have had to hard-wire a logic gate in circuitry to be a NAND or XOR gate (or whatever else), you can make a memristor circuit that can become whatever gate you want via learning or programming. Unsupervised AHaH attractor states are universal logic functions and can thus be used as the basis of a computing fabric. If the logic function must be learned or later programmed, or if the input space is large (for example pattern recognition or inference), then AHaH nodes are a good option. If the logic function is set and never needs to be changed, then there is no reason for them: just use a dedicated circuit. The key features of AHaH nodes are reconfigurability and learning.

How can you create logic statements with AHaH Nodes? How does logic work?

Please see the “AHaH Attractors Support Universal Algorithms” section of the AHaH Computing paper. The procedure goes as follows.

Take some input state, let’s call it “zero” or “false” or “state 1”, and assign it a spike pattern. This could be whatever you want. Do the same for another input state that you call “one” or “true” or “state 2”. The example used in the PLOS paper:

Logic state –> spike pattern
true –> [1,z]
false –> [z,1]

Note we could have other options, like:

true –> [1,z]
false –> [1,1]

or:

true –> [1,1,z]
false –> [z,1,1]

etc.

Combine the resulting spike patterns so they can be processed by an AHaH node. In our first example, we would need four inputs since each logic input requires two lines. Giving the AHaH node the logic input “true-false” on logic input lines 1 and 2, respectively, would yield the spike pattern [1,z,z,1].

Measure the output state (voltage) of the AHaH node. If it is a positive voltage, call that “true”. If it is a negative voltage, call that “false”.

The AHaH Node is now a logic gate. It takes as input two logic states and returns a logic state. To find out what logic gate it is, you measure the output of the AHaH node for different input patterns. You can build up a truth table that shows the output for each input pattern. You will find that it obeys one of the 16 possible logic functions (for a binary two-input, one-output gate). You could also have a non-binary logic system, and construct spike-encoding for those states, make a truth table, and do everything the same. The attractor points of unsupervised AHaH plasticity are logic functions.
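The truth-table procedure can be sketched in a few lines. This is a simplification under stated assumptions: the node is idealized as a sum of the weights on the spiking lines followed by a sign test, there is no learning and no bias input, and the weight grid is a hypothetical set of values picked so that no sum is exactly zero. It uses the [1,z] / [z,1] encoding described above.

```python
from itertools import product

# Each logic input uses two lines.
# Input A: true -> line 0 spikes, false -> line 1 spikes.
# Input B: true -> line 2 spikes, false -> line 3 spikes.
def encode(a, b):
    """Return the set of spiking line indices for logic inputs (a, b)."""
    return {0 if a else 1, 2 if b else 3}

def output(weights, a, b):
    """Idealized node: sum weights on spiking lines, positive means 'true'."""
    return sum(weights[i] for i in encode(a, b)) > 0

def truth_table(weights):
    """Outputs for inputs TT, TF, FT, FF, in that order."""
    return tuple(output(weights, a, b)
                 for a, b in product([True, False], repeat=2))

AND = (True, False, False, False)
NAND = (False, True, True, True)
XOR = (False, True, True, False)
XNOR = (True, False, False, True)

# Two hypothetical weight settings and the gates they implement.
print(truth_table((1, -3, 2, -2)) == AND)
print(truth_table((-1, 3, -2, 2)) == NAND)

# Sweep a hypothetical weight grid and collect every gate that appears.
grid = product([-3, -1, 1, 3], [-3, -1, 1, 3], [-2, 2], [-2, 2])
reachable = {truth_table(w) for w in grid}
print(len(reachable), XOR in reachable, XNOR in reachable)
```

In this idealization a single node never lands on XOR or XNOR, because the four sums it can produce are not independent (the TT and FF sums add up to the same total as the TF and FT sums). Those two functions have to be composed from several nodes.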

How exactly do you spike a node? Specifically how do you target an individual memristor instead of causing a current through the whole circuit?

kT-RAM has a RAM (SRAM) interface. Each synapse (memristor pair) is coupled to the driver and read-out electrodes via pass-gates. These pass gates are controlled via the state of the RAM bits. Once the synapse (or synapses) is/are coupled to the driver electrodes, the AHaH Controller can read out the state and apply feedback. All the other cells remain decoupled.

Is it correct to state that kT-RAM is limited to linear regression? Since the calculations are being made by a passive linear network of conductances, this seems to me to be the case. Or am I missing something here?

kT-RAM has a spike interface, so depending on what you mean by linear regression the answer is either yes or no. kT-RAM is an AHaH Node substrate (with a spike interface) and an AHaH Node is similar to a linear neuron with a sigmoid/tanh activation function (but not exactly). You can threshold the activation voltage of an AHaH Node, compare it to other node activation voltages (i.e. sort) or digitize it (which is expensive). You can do many things with AHaH Nodes, including non-linear classification, if you string them together in multiple stages. This is no different than standard neural networks. See the PLOS Paper for a list of examples. An AHaH Node is a memory-processing primitive, and kT-RAM is a computing substrate. It is not an algorithm. Via the KnowmAPI we have been learning, and continue to learn, new ways to use it.

Power scaling in kT-RAM is worse than linear in the number of AHaH nodes and must be restricted to small configurations. How can this scale?

It really depends on what you mean by “large” and what you mean by “scale”. It is likely you are attempting to see kT-RAM as a whole solution rather than as a part of a solution. A 512 x 512 core could be used to emulate a single neuron with up to 262,144 synapses, which is about at the max end of biology if you look at Purkinje cells. However, that same core could be used to emulate 20 neurons with 13,107 synapses each or 100 neurons with 2,621 synapses each. You could have neurons with various sizes, or you could go all the way down to the individual synapse. This flexibility has a trade-off in energy due to capacitive losses in the H-Tree. Simple optimizations for larger cores, such as chokes, help to reduce this capacitive loss, but this is an engineering problem just like everything else. The purpose of kT-RAM is to provide for a more flexible adaptive synaptic resource at the hardware level, embedded into larger architectures of various types, just like SRAM or various types of logic blocks are embedded into chip designs of various types. As for scaling, there is usually a trade-off in terms of flexibility and power when it comes to computing architectures. The modern CPU is a great example. It is a “jack of all trades and a master of none”, and yet CPUs have come to dominate the computing world. It is important to keep in mind two things. First, kT-RAM is an “adaptive synaptic resource” intended to be used as a co-processor within a variety of large-scale architectures, like a mesh-grid of CPU cores. Second, kT-RAM is just one possible implementation of AHaH nodes! Crossbars, for example, are another. Each has its own advantages and disadvantages, and depending on what you want to achieve, you should go with the best solution. Remember, it’s not possible to beat physics. But it is possible to clearly define what you want and then explore the space of possibilities that give you the best solution.
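The core-partitioning numbers are simple integer arithmetic, sketched here for an evenly divided core. The function name is hypothetical and the even split is an assumption; as noted above, a real deployment could mix neuron sizes.

```python
CORE_ROWS, CORE_COLS = 512, 512
total_synapses = CORE_ROWS * CORE_COLS     # synapses available in one core

def synapses_per_neuron(neurons):
    """Synapses each neuron gets if the core is split evenly (rounded down)."""
    return total_synapses // neurons

for n in (1, 20, 100):
    print(n, "neurons ->", synapses_per_neuron(n), "synapses each")
```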

There’s no evidence that your single learning law combined with limited precision synapses and “spiking” neurons could get you anywhere close to state-of-the-art performance on benchmarks like Imagenet.

To correct the false assumption, a chip capable of AHaH plasticity is not necessarily a chip constrained to learning in only one way. We have only come to this understanding through our work with AHaH Computing. Learning algorithms become specific instruction set sequences or routines, where each operation results in Anti-Hebbian or Hebbian learning. At the lowest level, Anti-Hebbian just means “move the synapse toward zero” and Hebbian means “move it away from zero”. People have come to the misconception that we are only working with a ‘local’ or ‘fixed’ learning rule. On the contrary, we have defined an instruction set from which the local unsupervised rule (FF-RU for example) is just one possibility, which does not preclude global computations. We hold ML in very high regard, and it is the work of folks like Yann LeCun (and others) that is pushing the boundaries, using the tools available to them. They are the current undisputed champions in primary performance benchmarking. We think the field of neuromorphics should re-align with primary performance benchmarking like ML has. As far as evidence that spikes and low-resolution synapses can work, it does raise the question of how our brains do it, seeing that they are (ultra efficient) spike-based networks with low-resolution synapses. From a hardware perspective, spikes and limited precision synapses make the most sense. From the algorithmic perspective, work like this, which is motivated by attempts to map algorithms to more efficient hardware, demonstrates that limited precision synapses can work better than full precision. Our results with the KnowmAPI support this. Our goal is to achieve ‘primary performance parity’ with state-of-the-art machine learning. We do not want to re-invent the field of ML. We want to achieve what it has already achieved in a much more efficient learning substrate.
The massive strides being made in machine vision (and almost everything else) are wonderful, and we are watching closely what algorithms and approaches are working best. We will take what we can and port it over to AHaH Computing, and we will invent new approaches if needed. Rather than ignore the adaptive power problem, like others do, we are working on a solution.

How does kT-RAM compare to IBM’s SyNAPSE chip?

They are not actually the same thing. True-North is a mesh of programmable digital cores that pass spikes around. kT-RAM is a specification for an adaptive synaptic resource intended to be used as a co-processor within a variety of large-scale architectures. For example, the SRAM inside the True-North core could be replaced with a kT-RAM core (or cores), and the result would be on-chip learning, more synapses, lower power and more flexibility at the core level, constrained to the specific large-scale topology of True-North (grid of cores). IBM would have to ditch most of their software and methodologies, so it’s arguable whether it would be worth it instead of just building new large-scale architectures from scratch.

You talk about merging memory and processing, and that Nature does not do this. What about the summation/integration of the synaptically-scaled input spikes in a neuron? Doesn’t that have to be done in the dendrites and soma? Isn’t that a case of biological processing that’s spatially separated from memory?

Not really. That is a mathematical idealization of a neuron. It’s actually more like that in kT-RAM than it is in a real neuron. Although I would say that this separation you are referring to is not like the separation in a digital computer. Also, you left out learning in this idealization. Just zoom into a (real) dendrite a little, or a (real) neuron, or anything (real). At some point you will see that the idealization of memory and processing being separate does not hold. To describe anything you will need both state information (physical objects or ‘memory’) and transformations (laws of physics acting on those objects). A neuron does not separate memory and processing and shuttle bits back and forth. It is a merging of memory and processing. A synapse is not memory and it is not processing; it is a merging of the two. A soma is not memory and it is not processing. It’s a merging of the two. And so on.

Is there any method of saving and duplicating state with this technology other than retraining the network?

Our memristors are non-volatile, so state is already saved. In terms of state duplication or ‘programming’, there are multiple ways but two general categories. The first way is to use various techniques to program each memristor into a defined resistance state. This is fairly time consuming if done electrically but a recent breakthrough in optical interface technology may greatly speed this up. The second way is through supervised learning. A ‘master’ acts as a teacher to a ‘student’, the result being a copy of neural state function. This method is faster than one might expect and has side benefits. First, this method does not require any additional interfacing circuitry. Second, since the student is learning it can adapt around faults in the hardware.

How does this technology handle power loss?

Knowm Memristors are non-volatile and hold their state.

What courses/subjects would you recommend for an undergrad with an interest in neuromorphic computing and AHaH?

AHaH Computing “crosses the technology stack”, so my advice is to learn as much as you can about the whole process of computing, from silicon wafers to real-world application development. I would then focus on one or two levels of the stack.

Physics: Classical, including Electricity & Magnetism and Thermodynamics. If you want to go into memristor device fabrication you will need Quantum and Physical Chemistry. In terms of Thermodynamics, be careful here. Be sure to check out some “fringe” stuff like Constructal Theory. The goal is to understand how life works as a physical process, and modern physics educations have very little, if anything, to say about it. They can, however, give you solid foundations.

Electronics: You could specialize here or just get a good overview. Introduction to electronics is a must. After that you should know the basics of VLSI and how chips are made. Circuit design, both analog and digital.

CS: Foundations in computing. You need to really understand how modern processors work, and how code gets turned into instructions that execute on processors. If you specialize here, do not forget about what is occurring at a physical level! Get really good at Linux, and understand how the operating system connects to low-level peripherals and co-processors.

Machine Learning: Overview of existing methods (decision trees/forests, SVM, Neural Networks, Bayesian Models, etc) and domains (perception, planning, control). Unsupervised learning. Focus on methods that have been commercialized in the real world and try to understand why. If you specialize here, do not forget about what is occurring at a physical level in the computers! The machine learning community is notorious for simultaneously exploiting hardware accelerators while also deemphasizing their importance. They (for the most part) see hardware as something that should support their algorithms, not something that is intertwined with algorithms and physics.

Neuroscience/Computational Neuroscience: Be very careful here! Use (biological) neuroscience as inspiration, but try not to fall into the “let’s mimic” camp as a route to understanding learning algorithms. The brain is far too complex to focus on the minute details if your goal is useful technology. Many in the neuromorphic community have large blinders on in regards to real-world problem solving and the results are chips that are only useful in a narrow context. The ML community is better focused in this regard, but they in turn have a “hardware blindspot”.

Rhetoric & Debate: Learn how to communicate effectively with other people. Get good at recognizing logical fallacies. Technology is heavily intertwined with money and egos. For whatever reason, folks tend to get really stubborn and angry around ideas that threaten the status quo. They will react with arguments that on the surface may seem reasonable but after inspection are illogical. While technology is the foundation, our economy is driven by the interactions of people and human communications.

There have been dozens of attempts to produce “machine learning hardware” with FPGAs or ASICs, they always end up obsolete upon release thanks to Moore’s law.

Moore’s law does not resolve the memory-processing duality. It’s going to be us or somebody else, but the problem will be solved because a tremendous amount of energy is currently wasted shuttling information back and forth and this will not be resolved except through fundamental changes in computing architectures. At this point in time, much larger performance (speed, efficiency) increases can be achieved through code optimization and changes in processor architecture than through smaller transistors. Innovations are now being driven by alternative architectures. We are not aware of a more efficient architecture for synaptic integration and learning than one that eliminates the memory-processing duality and enables low adaptation voltages. Also, Moore’s law as we have all come to know it is unambiguously dead.

The PLOS ONE article does a horrible job of explaining what sort of models you are aiming for!

AHaH nodes are building blocks. You use them to build models, and we have only just begun doing that. We are aiming for all sorts of models across perception, planning and control. We have shown decision trees, linear classifiers, fully-connected layers, combinatorial solvers, robotic arm actuators, reconfigurable logic and more. Our goal with the paper was not to show a model. It was to show a generic adaptive building block circuit that can be used to make many models.

Backprop or gradient descent is already known to be the best!

Backprop is a wonderfully useful algorithm and works well with current methods of computing. However, it is clear that backprop is not operating in the brain (at least in the way it is currently formulated mathematically) and hence there are clearly other methods available. When constraining solutions based on physics and circuits, so as to side-step the adaptive power problem, backprop becomes problematic. It is important to understand how intertwined machine learning is with the computational platforms that enable it. When the constraints of those platforms change, the optimal machine learning models change. Large GPU backprop-trained models today contain perhaps 100 million to 1 billion weights. The human cortex contains about 150,000 billion synapses and consumes about a billion times less energy and space. That said, we’ve seen a few studies showing how to get backprop operating in interesting ways that may be easy to port to kT-RAM and memristors in general, for example here. A number of groups are working on neuromemristive backprop. Here is one of backpropagation’s inventors, Geoffrey Hinton, talking about backprop and STDP. Once we develop and perfect AHaH based methods that achieve primary performance parity with existing methods like backprop, Knowm will dominate in secondary metrics such as power efficiency, speed, and cost. At that point, backprop as it currently exists will be obsolete. Until then, memristors are likely still going to be the most efficient path to implementation of the backpropagation algorithm in hardware.

Am I wrong in thinking that the von Neumann bottleneck is the exact problem that memristors would solve?

You are correct. For spreadsheets and email apps, that architecture is fine. But for processing the vast amounts of data needed for real-time machine learning apps, it’s a serious problem. CPUs are being designed with memory closer to the processor, they are making multi-core processors, and there are new chips like Adapteva’s Epiphany. All of these tricks reduce the distance between processor and memory. Nature, on the other hand, uses a different approach to computing. It’s a system where the processor is the memory. The distance is zero. With memristors, the same can be achieved. One way to do it would be our proposed Thermodynamic-RAM. We write about it in a paper, which has a historical background section to give some perspective.

Just how efficient do you expect AHaH to be, for say, 100 trillion synapses? Our brains do it at about 20 watts, supercomputers would have it at several hundred megawatts, or even a gigawatt, what’s kT-RAM expected to be at?

Great question! There is a lot to this topic, but let me try to give you a short and simple answer. We are using differential memristors as synapses. The energy dissipated in a synapse is the power dissipated over both memristors for the duration of the read/write pulses. $P = IV = V^2 / R$, where V is the voltage and R the resistance. A typical on-state resistance of a memristor is 500 kOhm, and a typical voltage is 0.5 V, so: P = 0.5^2 / 500E3 = 5E-7 watts. The energy is only dissipated during read and write events, which occur in pulses of ~50 ns (nanoseconds) or less. The energy per memristor per synaptic read or write event is then 5E-7 W x 50E-9 s = 2.5E-14 Joules. Since the kT-RAM instruction set requires paired read-write instructions, and since a synapse is two memristors, we multiply that answer by four: 1E-13 Joules. This is 0.1 picojoules per adaptive synaptic event. Note we could lower the voltage and pulse width to achieve even lower power for synaptic integration operations (i.e. no learning). Also note that capacitance plays a big role, so if your AHaH node is very large it will dissipate more energy as electrons get soaked up on the wires. If we say a human brain has 100 billion neurons, each with 1000 synapses, that fire on average once per second, that is 1E14 adaptive synaptic events per second. At 20 watts, the energy consumed in one second is 20 Joules. So if we put all energy into synaptic events we get 2E-13 Joules per event. The actual deployed power consumption of kT-RAM is dependent on what sort of computing architecture it’s embedded in. Very small cores are more efficient than larger cores for processing smaller spike streams. The purpose of the AHaH circuit is to remove the memory-processing bottleneck for adaptive learning operations. If you knew exactly the connection topology of a brain and made a custom ASIC of AHaH nodes, you would be looking at efficiencies comparable to and possibly even exceeding biology (eventually).
The reason for this is that our modern methods of chip communication can be quite a bit more efficient (and faster) than biology in some respects. However, if you have a more generic architecture that enables you to explore more connection topologies, for example a mesh grid of little CPUs with local RAM and kT-RAM, you would expend more energy but get something more flexible: the ability to emulate any brain, not just one specific type.
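The back-of-the-envelope numbers in this answer can be reproduced directly. The sketch below only restates the figures quoted in the text (0.5 V, 500 kOhm, 50 ns pulses, a 20 W brain with 100 billion neurons and 1000 synapses each firing once per second); the variable names are ours, and it ignores the capacitive losses mentioned above.

```python
# Memristor synapse, using the figures quoted in the answer.
V = 0.5            # volts across a memristor during a pulse
R = 500e3          # ohms, typical on-state resistance
PULSE = 50e-9      # seconds per read or write pulse

power = V ** 2 / R                            # P = V^2 / R, about 5e-7 W
energy_per_memristor = power * PULSE          # about 2.5e-14 J per pulse
# Two memristors per synapse, and paired read-write instructions: x4.
energy_per_event = 4 * energy_per_memristor   # about 1e-13 J (0.1 pJ)

# Brain estimate, using the figures quoted in the answer.
neurons = 100e9
synapses_per_neuron = 1000
rate_hz = 1.0                                 # average firing rate
brain_power = 20.0                            # watts

events_per_second = neurons * synapses_per_neuron * rate_hz   # 1e14
brain_energy_per_event = brain_power / events_per_second      # about 2e-13 J

print(f"kT-RAM synaptic event: {energy_per_event:.2e} J")
print(f"brain synaptic event:  {brain_energy_per_event:.2e} J")
```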

From your point of view, do you think there could be a sudden leap in computing technology that will render the current “top-of-the-line” consumer-level computing (high-end gaming graphics cards like the GTX980, new i7 processors, etc) obsolete in a very short period of time?

No. GPUs and CPUs are absolutely great at what they were designed to do. G stands for ‘Graphics’, after all! They will likely be around for a long time. There will be new types of hardware available that will dominate applications in machine learning and give the everyday consumer access to an entire new range of applications and abilities that are not possible right now. We recently proposed one such device that would plug into existing computer platforms and allow for many exciting new possibilities. It’s called Thermodynamic-RAM.

What do you think about Consciousness?

We really don’t know. It’s an important question, and we have theories about what attention is, but our main focus is on solving well-specified real-world problems.

How closely does your approach mimic biology?

We are not specifically focused on biology. Rather, we are interested in natural self-organization and how to harness it. We believe that biology has ‘hijacked’ or ‘co-opted’ a more fundamental process of energy attempting to dissipate itself through adaptive containers. Ultimately we are trying to solve problems, not build brains. Our primary concern is solving real-world problems, and although biology is a big source of inspiration, our technology is not going to look much like biology.

What do you mean by kT-RAM being capable of ‘healing’?

AHaH Nodes are adaptive. The act of learning creates attractor states that repair state information. The act of using kT-RAM heals it, because kT-RAM can learn around faults. Some synapses could stop working, inputs could turn off or become chaotic, etc. Learning will constantly tune the functioning synapses.
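
To make the idea of ‘learning around faults’ concrete, here is a minimal sketch in Python. It is a toy model, not our circuit or any Knowm API: the update rule (a Hebbian push of `ALPHA * sign(y)` balanced by an anti-Hebbian pull of `BETA * y`), the constants, the `alive` mask and the two spike patterns are all assumptions chosen for illustration.

```python
import numpy as np

ALPHA, BETA = 0.1, 0.1  # assumed constants; this toy rule settles at |y| = ALPHA / BETA


def evaluate(w, alive, active):
    """Sum the weights on the active inputs, skipping dead synapses."""
    return float(np.sum(w[active] * alive[active]))


def train(w, alive, patterns, steps):
    """Apply the toy update to every working synapse on each active input."""
    for _ in range(steps):
        for active in patterns:
            y = evaluate(w, alive, active)
            # Dead synapses (alive == 0) are frozen; the rest keep adapting.
            w[active] += alive[active] * (ALPHA * np.sign(y) - BETA * y)
    return w


# Two hypothetical spike patterns that use disjoint sets of synapses.
patterns = [np.array([0, 1, 2, 3]), np.array([4, 5, 6, 7])]
w = np.array([0.01] * 4 + [-0.01] * 4)  # small initial weights fix each pattern's sign
alive = np.ones(8)

train(w, alive, patterns, steps=200)
y_trained = evaluate(w, alive, patterns[0])

alive[[0, 1]] = 0.0  # fault: two of the four synapses behind pattern 0 stop working
y_faulted = evaluate(w, alive, patterns[0])

train(w, alive, patterns, steps=200)  # simply keep using (and so adapting) the node
y_healed = evaluate(w, alive, patterns[0])

print(y_trained, y_faulted, y_healed)
```

In this toy model the output for the first pattern settles near 1.0, drops to about 0.5 when half of its synapses fail, and climbs back to about 1.0 as the two remaining synapses are tuned to compensate. The sign of the output, which is the state being stored, never flips.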

Will the transition from simulated kT-RAM to actual kT-RAM do anything other than increase efficiency?

A star-gate will open up and confetti will fall from the sky. But seriously, physical kT-RAM is all about power and space efficiency.

How is your current machine learning approach on kT-RAM different from a traditional neural net?

kT-RAM is an “AHaH Node resource”. Anti-Hebbian and Hebbian (AHaH) nodes can be used for things like feature learning, classification and combinatorial optimization. You can think of an AHaH Node as a ‘neuron’: you can make networks out of them, and you can solve the same sorts of problems that neural networks solve. But you can also do more basic stuff like logic, memory, random number generation and set iteration. AHaH Nodes can be used in many ways, and we have just started trying various things out. We believe AHaH nodes are universal adaptive resources, and our success to date has given us confidence that a great number of effective and efficient solutions can be attained with them.
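
As a rough illustration of the ‘neuron that can also do logic’ idea, the sketch below trains a single simulated node to behave like a NAND gate. Everything in it is an assumption made for the example rather than the kT-RAM instruction set: the spike encoding (one input line per bit value plus a bias line), the constants, and the supervised form of the update, in which the label takes the place of the node's own decision.

```python
import numpy as np

ALPHA, BETA = 0.02, 0.02  # assumed constants; outputs are pulled toward +/- ALPHA / BETA


def encode(a, b):
    """Spike-encode two bits: one line per bit value, plus an always-on bias line."""
    # Lines: 0 = "a is 0", 1 = "a is 1", 2 = "b is 0", 3 = "b is 1", 4 = bias.
    return np.array([a, 2 + b, 4])


def evaluate(w, active):
    """Node output: the sum of the weights on the active spike lines."""
    return float(np.sum(w[active]))


def train_nand(epochs=500):
    """Train one node on the NAND truth table with a toy supervised update."""
    w = np.zeros(5)
    for _ in range(epochs):
        for a in (0, 1):
            for b in (0, 1):
                active = encode(a, b)
                label = -1.0 if (a and b) else 1.0
                y = evaluate(w, active)
                # Hebbian push toward the label, anti-Hebbian pull that bounds |y|.
                w[active] += ALPHA * label - BETA * y
    return w


w = train_nand()
outputs = {(a, b): int(np.sign(evaluate(w, encode(a, b))))
           for a in (0, 1) for b in (0, 1)}
print(outputs)
```

Reading a positive output as logical 1 and a negative output as logical 0, the trained node reproduces the NAND truth table. A single node of this kind is a linear unit over its spike lines, so it can only represent linearly separable functions; something like XOR would need more than one node.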

11 Comments

Simon Reichlin
2015-07-09 at 7:47 AM

Hi dear Knowm Team,

I’m a tech freak from Switzerland and read an article about your work today. I guess you’re on the right path to develop the human future. Please share your PayPal account, because I would really like to support you and your work.

Best Regards
Simon Reichlin

Alex Nugent
2015-07-12 at 3:41 PM

Simon–Thank you for your support! We have set up a donations page. Please let us know if you have any problems.

Noah Gundotra
2015-09-04 at 5:10 AM

Hi Knowm,
I’m a high school student really intrigued by your work, and I was wondering if you could set me on a path for learning more about AHaH computing, like books or videos.

I’m currently enrolled in a Stanford online Machine Learning course, and I’d like to learn more about the implications of your tech.

Thanks!
Noah

Alex Nugent
2015-09-04 at 3:20 PM

Noah, thanks for your interest! AHaH Computing is so new that there are no public textbooks! We currently have a series of online lectures for members of the KDC, which we are expanding. We offer bounties in place of the fee. Basically, if you do something to help the community, you can get in. If you are interested in Machine Learning and currently taking the Stanford Machine Learning course, I would say that is a good place to start! Our goal with AHaH Computing and kT-RAM is to achieve “primary performance parity” with state-of-the-art machine learning, which we have done already in some areas. The implications are extremely efficient learning processors and a path to biological-scale efficiency.

Rolf Kickuth
2016-03-27 at 4:30 PM

Hi Alex, I am a science journalist in Germany and have been following AI et al. for more than 25 years. What is still not clear to me: Do the chips that you sell really work with memristive materials, or do they simulate memristors with standard CMOS technology? If they consist of memristive materials: How long is their operation time? I think that since one is working more with ions instead of only electrons, there are side effects which will lead to a malfunction after some time. If they do not: When will the memristive material be ready? HP estimates about 2019 for its own materials… By the way, your algorithm reminds me of Simulated Annealing ANNs. Is that right? Best regards, Rolf

Alex Nugent
2016-03-27 at 6:14 PM

Do the chips that you sell really work with memristive materials, or do they simulate memristors with standard CMOS technology?

The chips are actual memristors with a memristive material. You can read the datasheet here.

How long is their operation time?

Only time will tell! What we can say is that they can be electrically cycled billions of times, and that they can hold their resistive state and be cycled at high temperatures (>140 °C). Early devices I received from Dr. Campbell over two years ago still work just fine. The physical structure and operational theory of the devices give me confidence that they will age well. But of course, the only experiment to resolve that question is time itself. Dr. Campbell reports that there is “No reason to assume it wouldn’t work in ten years. Different devices that I’ve tested after thirteen years still work fine.”

When will the memristive material be ready?

Depends on your use case. I am far more interested in incremental synapses than bits, for example, and the required operational characteristics of the devices are different for each use case. If you want to know if the devices are ready for your particular application (memory, logic, learning, filters, oscillators, etc.), we offer packaged devices, research die, and data. We offer a BEOL service that works with CMOS fabs to add memristors to client designs, and we are currently working with clients.

By the way, your algorithm reminds me of Simulated Annealing ANNs. Is that right?

Not sure what algorithm you are talking about, so I can’t say. AHaH Nodes are computing primitives. They can be bits, logic functions, classifiers, feature learners or even random number generators. kT-RAM is an AHaH node substrate. I have explored many ways of using AHaH nodes, and you could call each method “an algorithm”. I suspect that, as with logic, there is an endless number of algorithms that could be built from AHaH nodes. That said, there are elements of what I have published that definitely resemble simulated annealing.

Rolf Kickuth
2016-04-04 at 10:36 AM

Hello Alex, the Knowm memristor chips contain 8 memristors per chip. One advantage of memristors is supposed to be that they can be made VERY small. Will there be memristor chips from Knowm in the near future which contain millions of memristors?

Alex Nugent
2016-04-11 at 5:25 PM

Will there be memristor chips from Knowm in the near future which contain millions of memristors?

Yup.

Andras Kovacs
2016-11-20 at 8:56 AM

Hi Alex,

Another book which has lots of relevance to neuromemristive processing is Valentino Braitenberg’s “Vehicles: Experiments in Synthetic Psychology” (MIT Press, 1984).

https://www.amazon.com/Vehicles-Experiments-Psychology-Valentino-Braitenberg/dp/0262521121/ref=sr_1_1?s=books&ie=UTF8&qid=1479632151&sr=1-1&keywords=valentino+braitenberg

juan-luis del valle
2017-06-18 at 9:29 PM

Hello Alex
I understand from your video lessons on AHaH computation that neurons are abstracted as digital arithmetic logic units and are identified as AHaH nodes.
I have tried to work out a knee-jerk or stretch reflex (Introduction to Neurons and Neuronal Networks, John H. Byrne, Ph.D., Department of Neurobiology and Anatomy, The UT Medical School at Houston) using AHaH nodes.

Alex, could you advise me on what I need to learn to perform this kind of emulation using the KNOWM Discovery Memristor item?
