The Oxide-Memristor Paradox
Since I launched Knowm Inc four years ago, one perplexing paradox has nagged at me, and I believe it has finally resolved itself. Soon after publication of AHaH Computing: From Meta-Stable Switches to Machine Learning I had the great fortune to meet Dr. Kris Campbell, who was exploring a new type of device that we now call a Self Directed Channel (SDC) memristor. It caught my eye because it has all the properties we needed to realize machine learning in intrinsically adaptive hardware. It can hold a continuum of resistance states. It operates in the right resistance range for low-power inference operations. Its resistance can be incremented and decremented with pulses. It is easy and inexpensive to manufacture at high yields, and it can be deposited on CMOS circuits. It can even operate at extreme temperature ranges indefinitely. After more than a decade, we finally had the technology to realize our dream of self-organizing circuits!
Building fundamentally new learning processors from novel nano-tech devices is of course an expensive endeavor, so I began by talking with potential investors and applying for some state grant funding. In most cases I received frustrating questions that I could not, at the time, answer. Why is Knowm’s memristor any better than HP’s memristor? If HP can’t do it, how can you? Everybody knows from HP that memristors are a failed technology. And so it went.
Disregarding all the know-how and foundational circuit-level IP we have acquired over the years, such as differential-pair memristor synapses, this was frustrating because they were good questions and I did not have good answers. If the engineers at HP failed, even with the extraordinary resources and expertise available to them, how could we expect to fare better? My inability to answer these questions in a direct and logical way led us to take baby steps toward building a memristive machine learning processor and to focus more on the SBIR/STTR work we were doing with DoD sponsors. With R&D as our primary focus, we sold our memristors on the side and concentrated on small proof-of-concept circuits and extensive software development.
There were signs back then that oxide memristors had problems. A report generated by Sandia National Labs in 2011 (no longer available to the public) indicated that a shockingly small percentage (~10% if I recall correctly) of the devices on 32x32 crossbars were functional, and an even smaller fraction of the devices, something like 2%, were considered ‘nominal’. My conclusion at the time was that HP was obviously suffering from yield issues and that non-volatile RAM was just not the right application for oxide memristors. If they switched focus to machine learning applications, which are much more tolerant of yield issues because circuits can adapt or learn around faults, their problems would be alleviated. Although it’s obvious to me now that a 10% yield indicated another problem, I simply did not catch on. It turns out that there was a problem with HP’s oxide memristor. I believe it is a problem of oxide memristors in general.
The issue with oxide memristors is more interesting than I could have imagined at the time—and it’s a problem that is occurring even today as I write this. Major funding agencies and a lot of smart people keep falling into the same ‘oxide trap’. It would be funny if it were not so serious. Let us go back and review the facts.
HP announced in 2008 that they would commercialize their revolutionary oxide memristors by 2011, yet they supplied memristor crossbars to Sandia National Labs in 2011 with only 10% yield. Let’s assume that HP process engineers are smart people and know what they are doing (this is a good assumption I believe), and that, despite operating under extreme pressure from upper management to deliver quickly, they know how to fabricate their wafers and measure the yields. Logic would dictate that the yields HP itself measured in both 2008 and 2011 were better than 10%. No process engineer in their right mind would announce to the world the arrival of memristive RAM in only three years when their existing process was only yielding 10%. It’s now 2019, more than a decade later, and while oxide memristors are still being developed and improved upon in the literature, they have still not made it into even small-scale commercial production. So what’s going on?
The answer has only become obvious to me in the last year. As part of SBIR research work with DoD sponsors we have been developing a software back-end for memristor-based machine learning, and we developed the Memristor Discovery USB platform. Memristor Discovery was intended as a risk-reduction, proof-of-concept test board for differential-pair memristor synapses ahead of CMOS integration, but we realized others might find it useful for education and endurance testing, so we offer it for sale in our web store. After demonstrating that machine learning classification was possible with Knowm SDC memristors and the Memristor Discovery platform, the next step was CMOS integration and the first generation of synaptic arrays. Unfortunately we ran into a big problem: our DoD sponsors were well underway funding multiple groups working on oxide memristors, and they did not appear to have the desire or ability to sponsor Knowm SDC memristor chip fabrication. This was unfortunate for us, but it’s understandable. Getting the funding, equipment and expertise to build memristive processors is extremely difficult, and given the checkered history with HP the perceived risk is high. Our role within the larger DoD effort appeared to be confined to software development. I did not dedicate my life to this work to write software, however. I intend to build a chip.
While attending the NICE conference last year I learned that a colleague was able to get hafnium oxide memristor test circuits on a wafer run for free (it pays to be well connected!). It appeared that the bus was leaving the station, so we had better get on. I figured the best way forward was to work with the hafnium oxide process currently being funded and try to get some of our test circuits on a wafer run. Having just developed the Memristor Discovery platform, the first obvious step was to compare discrete hafnium oxide memristors with Knowm memristors, so I asked if we could get some. Packaged or raw die, I was flexible. I just needed to test the memristors for myself to ensure our existing models would work. But I could not get any samples to test, even after asking repeatedly. I found this strange, so I reached out to Dr. Campbell, inventor of the SDC memristor, and she reported the same problem:
“While I’ve been trying for over the past decade to get my hands on [third party] oxide memristors to test, I have been unable to. The literature is full of different types of oxide memristors, all claiming to work very well, but I have been unsuccessful in obtaining them.”
I spoke with another hafnium oxide circuit designer who was waiting for his team's million-dollar wafer to come back from the fab. I asked if they had done any testing with discrete devices before jumping into full wafer fabrication. He said, to my surprise, that they had not. They were just working with the SPICE models given to them by the foundry partners. I guess I am old-school, but when it comes to circuit design with fundamentally new circuit components, jumping straight to a million-dollar wafer fabrication run before testing basic assumptions and device models with discrete devices seems extremely risky. Avoiding that risk was the whole point of Memristor Discovery. The only explanation I can imagine for skipping that step is that functional discrete devices were not available. One has to wonder why.
While wrapping up our SBIR program for DoD sponsors to develop a memristor-compliant general anomaly detection system with a focus on fixed security cameras, I was asked, in passing, a question about Knowm memristors that stuck with me: “Do they have any shelf-life issues?” A few months later, while attending the CCC Thermodynamic Computing workshop in Hawaii (a very odd conference worthy of its own blog post) and speaking with a researcher, I got the same question: “Do your memristors have any shelf-life issues?”
Knowm Inc has now sold SDC memristors to researchers in over 40 countries. I personally test chips after packaging, and periodically after that. I still have the very first chips we ever packaged (they have considerable sentimental value to me), and there has been no observable degradation in their operation. After the Thermodynamic Computing workshop and these multiple off-hand questions, it all started to make sense. I reached out to Dr. Campbell to get her opinion:
“Personal experience has taught me that oxide devices can be very tough to fabricate consistently due to the lack of complete control over fabrication conditions and exposure of the active material to an atmosphere that is not a vacuum, which is not an easy condition to avoid. The oxide-based memristor devices my team has fabricated have occasionally displayed respectable initial switching from a few devices on a wafer, and we could have cherry-picked those devices to present admirable results to the world via a journal publication. However, the folks that have made oxide memristors know of the device-to-device and wafer-to-wafer variability that are preventing the oxide memristor from being incorporated into a commercial application over the past decade. Even more, the devices that worked well on the first day out of fab may have worked poorly the following week. That’s not usually in the journal papers. There are many reasons that shelf-life could be a problem with oxide-based memristors, however, the bottom line is that the device materials are moving towards a more stable energy state. This can occur through oxygen migration within the device, oxygen incorporation/diffusion from ambient environment into the device (sidewalls, penetration of electrode materials, etc.), reduction of defects, a change in the localization of states, dipole relaxation, to name a few.” –Dr. Kris Campbell
After four years I finally have a good answer to the oxide memristor paradox. The problem is that oxide memristors have shelf-lives long enough to fabricate, test and publish, but not long enough to commercialize. I call this the “oxide trap”. In our crazy competitive world, the idea that one would wait around for a year or more with devices just sitting on the shelf before publishing is totally absurd! Publish or perish, as they say. Or perhaps better stated: Publish before they perish.
So we now have a logical explanation for how HP got itself into trouble: they attempted to rush to market with what appeared to be a great device before waiting for shelf-life tests to come back. The logical reason it’s hard or impossible to obtain discrete oxide memristor chips to test, and why researchers still cannot buy discrete devices, is that they spoil too quickly. I do not believe HP knowingly gave Sandia National Labs memristor crossbars with 10% yield. My bet is that those devices looked great under the probe station after fabrication. But then they sat on the shelf before being shipped to Sandia, where they probably sat on the shelf again. By the time Sandia engineers measured them they had spoiled.
If any one property matters most for commercialization, it is shelf-life. We tend to assume good shelf-life in the electronics world, and in the cases where we must deal with shelf-life problems the culprit is almost always oxidation (for example, oxidation of Cu wires in chip wire bonding). There is also no other property that is more at odds with the realities of the fast-paced competitive research world we live in than shelf-life.
The very act of verifying shelf-life goes against the incentives in the research and commercial world to act quickly. Nobody has the time to wait on shelf-life tests when everything else looks good. After all, another researcher may beat you to publication. When you understand the problem you can see how devious it is! While shelf life may be the single most important property for a memristor to have, there are strong incentives to not measure or report it.
I have another word for “metal-oxide”. I call it rust. Rust is easy to make, but it can also be hard to stop. This explains the abundance of oxide memristors in the literature and the paradoxical unavailability of oxide memristors for commercial purchase. It’s difficult to remove oxygen from the environment because people need to breathe. Over time oxygen seeps into the devices and, like oxide formation in many other industries, causes the parts to spoil. There is literally no other material more indicative of a ‘dead machine’ than rust.
Oxidation and its relation to shelf-life extends to other industries, most notably the food industry. When food goes rancid and smells bad it’s due to lipid oxidation. Indeed, resistance to oxidation is used by the food and oil industries to predict shelf-life! There is a name for it. It’s called Oxidative Stability. Go figure.
A decade after the now infamous oxide memristor announcements by HP, we can answer the question as to why oxide memristors smell so fishy. It’s because they oxidize. Indeed, at this point oxide memristors should be assumed to have poor shelf-life unless proven otherwise.
If you are an oxide memristor stakeholder, make sure you know the oxidative stability of your memristors to avoid being caught in the oxide trap. Just because a wafer has high yield now does not mean it will have the same high yield next month. If you are a memristor circuit developer, insist on discrete memristor test chips instead of mathematical models so you can test the devices for yourself before jumping to integrated circuits. Finally, you should consider going with non-oxide memristors like Knowm’s SDC memristors. They now have a proven commercial shelf-life. Indeed, I invite you to take a Knowm memristor chip and do what I did: place it in a 100% humid steam bath at 175°F for a week and see if you can break it. The chip I tested came out unscathed.
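For readers who want to track this themselves, here is a minimal sketch of what re-measuring yield over time might look like in software. Everything in it is an assumption made for illustration: the `Measurement` record, the use of an on/off resistance ratio as the pass criterion, and the thresholds are hypothetical, not a Knowm tool or an industry standard. The idea is simply to re-test the same devices at intervals after fabrication and report the first test day on which yield falls below an acceptable level.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """One test of one device (hypothetical record layout)."""
    device_id: str
    days_since_fab: int
    on_off_ratio: float  # assumed figure of merit: R_off / R_on


def yield_by_day(measurements, min_ratio=10.0):
    """Fraction of tested devices that still switch, per test day.

    A device "passes" if its on/off ratio is at least min_ratio
    (an illustrative threshold, not a standard).
    """
    totals, passing = {}, {}
    for m in measurements:
        totals[m.days_since_fab] = totals.get(m.days_since_fab, 0) + 1
        if m.on_off_ratio >= min_ratio:
            passing[m.days_since_fab] = passing.get(m.days_since_fab, 0) + 1
    return {day: passing.get(day, 0) / totals[day] for day in sorted(totals)}


def shelf_life_days(yields, min_yield=0.9):
    """First test day on which yield is below min_yield, or None."""
    for day in sorted(yields):
        if yields[day] < min_yield:
            return day
    return None
```

Feeding in measurements taken on day 0, day 30, day 90 and so on gives a yield curve rather than a single snapshot, which is the number that matters before committing to an integrated design.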
I call this the oxide trap because we are all its victims. It’s just one of a number of important lessons we have to learn on our path to the memristor revolution. I sincerely hope my words will be taken as constructive criticism and will raise an important issue that has the potential to sink memristor science if it is not openly discussed and remedied. I also encourage all memristor developers to make their devices available to independent third-party testers for proper evaluation that includes oxidative stability and shelf-life.