
At Knowm, we are building a new and exciting type of computer processor to accelerate machine learning (ML) and artificial intelligence applications. The goal of Thermodynamic-RAM (kT-RAM) is to move general ML operations, traditionally deployed to CPUs and GPUs, onto a physically-adaptive analog processor based on memristors, one that unites memory and processing. If you haven’t heard yet, we call this new way of computing “AHaH Computing”, which stands for Anti-Hebbian and Hebbian Computing, and it provides a universal computing framework for in-memory reconfigurable logic, memory, and ML. While we showed a long time ago that AHaH Computing is capable of solving problems across many domains of ML, we only recently figured out how to use the kT-RAM instruction set and low-precision, noisy memristors to build supervised and unsupervised compositional (deep) ML systems. Our method does not require the propagation of error algorithm (Backprop) and is easy to attain with realistic analog hardware, including but not limited to memristors. This blog post and the research behind it are motivated by the fact that we need to compare our new approach apples-to-apples with existing deep learning approaches, looking at both primary metrics (accuracy, error, etc.) and secondary metrics (power, time, size).

Problems with Deep Neural Networks

Today’s deep learning models are neural networks: multiple layers of parameterized, differentiable nonlinear modules that can be trained by back propagation of error. This approach has several drawbacks:

  1. It requires massive amounts of labeled training data.
  2. It requires extreme compute environments, limiting it primarily to behemoth companies and governments.
  3. The models are complicated and have a large number of hyperparameters, making the task more of an art than engineering.
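
To make "trained by back propagation of error" concrete, here is a minimal sketch in plain Python. It stacks two sigmoid units and applies the chain rule from the output back to the first weight. The starting weights, learning rate, input, target and function names are all illustrative assumptions, not taken from any framework discussed in this post.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(w1, w2, x, target, lr):
    # Forward pass through two stacked sigmoid "modules".
    h = sigmoid(w1 * x)
    y = sigmoid(w2 * h)
    loss = 0.5 * (y - target) ** 2

    # Backward pass: apply the chain rule from the output back to the input.
    dy = (y - target) * y * (1.0 - y)   # dLoss/d(pre-activation of the output)
    grad_w2 = dy * h
    dh = dy * w2 * h * (1.0 - h)        # error propagated back to the hidden unit
    grad_w1 = dh * x

    # Gradient descent update of both weights.
    return w1 - lr * grad_w1, w2 - lr * grad_w2, loss

def train(steps=200, lr=1.0):
    w1, w2 = 0.5, 0.5                   # arbitrary starting weights (assumption)
    losses = []
    for _ in range(steps):
        w1, w2, loss = train_step(w1, w2, x=1.0, target=0.9, lr=lr)
        losses.append(loss)
    return losses

losses = train()
print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.6f}")
```

A real deep network repeats the same backward step across many layers and millions of weights, which is where the data and compute demands listed above come from.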

ML Pioneer Geoffrey Hinton Says We Need Another Approach

It’s always reassuring to hear other people in the ML community make statements that echo what we’ve been saying from the beginning!

My view is throw it all away and start again


In 1986, Geoffrey Hinton co-authored a paper that, three decades later, is central to the explosion of artificial intelligence. But Hinton says his breakthrough method should be dispensed with, and a new path to AI found.

Speaking with Axios on the sidelines of an AI conference in Toronto on Wednesday, Hinton, a professor emeritus at the University of Toronto and a Google researcher, said he is now “deeply suspicious” of back-propagation, the workhorse method that underlies most of the advances we are seeing in the AI field today, including the capacity to sort through photos and talk to Siri. “My view is throw it all away and start again,” he said.

The bottom line: Other scientists at the conference said back-propagation still has a core role in AI’s future. But Hinton said that, to push materially ahead, entirely new methods will probably have to be invented. “Max Planck said, ‘Science progresses one funeral at a time.’ The future depends on some graduate student who is deeply suspicious of everything I have said.”

How it works: In back propagation, labels or “weights” are used to represent a photo or voice within a brain-like neural layer. The weights are then adjusted and readjusted, layer by layer, until the network can perform an intelligent function with the fewest possible errors.

But Hinton suggested that, to get to where neural networks are able to become intelligent on their own, what is known as “unsupervised learning,” “I suspect that means getting rid of back-propagation.”

“I don’t think it’s how the brain works,” he said. “We clearly don’t need all the labeled data.”

Deep Learning Frameworks Review

Here, we review most, if not all, of the currently available deep learning frameworks as potential candidates for extension as well as for comparison with our approach.

| Framework | Language | Founder/Backer | Description | Github | License | CuDNN |
| --- | --- | --- | --- | --- | --- | --- |
| TensorFlow | Python | Google | Computation using data flow graphs for scalable machine learning | github | Apache | Y |
| chainer | Python | Preferred Networks | A flexible framework of neural networks for deep learning | github | MIT | Y |
| Paddle | C++/Python | Baidu | PArallel Distributed Deep LEarning | github | Apache | Y |
| dsstne | C++ | Amazon | Deep Scalable Sparse Tensor Network Engine (DSSTNE) is a library for building Deep Learning (DL) machine learning (ML) models | github | Apache | N |
| CNTK | C++/Python | Microsoft | Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit | github | MIT | Y |
| Theano | Python | University of Montreal | Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. | github | BSD | Y |
| keras | Python | | Deep Learning library for Python. Runs on TensorFlow, Theano, or CNTK. | github | MIT | Y |
| Lasagne | Python | | Lightweight library to build and train neural networks in Theano | github | MIT | Y |
| blocks | Python | | A Theano framework for building and training neural networks | github | MIT | Y |
| h2o-3 | Java/Python | | Open Source Fast Scalable Machine Learning API For Smarter Applications (Deep Learning, Gradient Boosting, Random Forest, Generalized Linear Modeling (Logistic Regression, Elastic Net), K-Means, PCA, Stacked Ensembles…) | github | Apache | Y |
| marvin | C++ | Princeton University | A Minimalist GPU-only N-Dimensional ConvNets Framework | github | MIT | Y |
| caffe | C++ | UC Berkeley | a fast open framework for deep learning. | github | ? | Y |
| mxnet | Python/C++ | Apache | Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more | github | Apache | Y |
| neon | Python | Intel | Intel Nervana reference deep learning framework committed to best performance on all hardware | github | Apache | Y |
| torch7 | Lua | Facebook | Torch is a scientific computing framework with wide support for machine learning algorithms that puts GPUs first. | github | BSD | Y |
| pytorch | Python | | Tensors and Dynamic neural networks in Python with strong GPU acceleration | github | BSD | Y |
| caffe2 | C++/Python | | Caffe2 is a lightweight, modular, and scalable deep learning framework. | github | BSD | N |
| dynet | C++ | | The Dynamic Neural Network Toolkit | github | Apache | N |
| BigDL | Scala | Intel | BigDL: Distributed Deep Learning Library for Apache Spark | github | Apache | ? |
| systemml | Java | IBM | SystemML is a flexible, scalable machine learning system. | github | Apache | ? |
| mahout | Scala | | The Apache Mahout™ project’s goal is to build an environment for quickly creating scalable performant machine learning applications. | github | Apache | ? |
| scikit-learn | Python | | machine learning in Python | github | BSD | N |
| leaf | Rust | Autumn | Open Machine Intelligence Framework for Hackers. (GPU/CPU) | github | MIT and Apache | Y |
| deeplearning4j | Java | Skymind | Deep Learning for Java, Scala & Clojure on Hadoop & Spark With GPUs | github | Apache | Y |
| jubatus | C++/Python | | Framework and Library for Distributed Online Machine Learning | github | LGPL | N |

MNIST on Many Frameworks Comparison

After broadly reviewing all frameworks we narrowed our focus down to a short list including:

  1. Neural Networks and Deep Learning
  2. TensorFlow
  3. DL4J
  4. PyTorch
  5. CNTK
  6. Caffe2
  7. Torch7

In order to get a rough feeling for the various frameworks that we are interested in leveraging for our own deep learning framework, we decided to get to know each framework from our short list by running the MNIST benchmark. We chose this benchmark because it’s one of the very first benchmarks that most people run as an intro to machine learning, the “hello world” of machine learning, and there are many tutorials and other help available. We will take a look at the primary and secondary performance metrics, take additional notes along the way and rate each framework. We will run them on a Macbook Pro, and also on a Linux system with a GPU.

Neural Networks and Deep Learning by Michael Nielsen

This book does a wonderful job at teaching the concepts of neural networks and the back propagation of error algorithm. In later chapters it goes into deep learning. For most of the chapters there is code that you can look at and run. We used an updated version of the source code, adapted for Python 3.


Python et al.


By default, the 3rd network,, is run. This network is the convolutional deep neural network described in the book. If you need to run other networks, you’ll have to uncomment/comment the correct sections in


Deep Convolutional network: Input(28×28) ==> ConvPool(5×5,2×2) ==> ConvPool(5×5,2×2) ==> FullyConnected(100) ==> Softmax(10)
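
The architecture string above can be checked with a few lines of arithmetic. The sketch below traces the feature-map width through the two ConvPool stages, assuming unpadded convolutions with stride 1 and non-overlapping pooling. The post does not state these settings, so treat them as assumptions. The helper names are made up for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width of a square convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, window):
    """Output width of non-overlapping pooling."""
    return size // window

def trace_shapes(input_size, layers):
    """Return the feature-map width after each (kind, size) layer."""
    sizes = [input_size]
    for kind, k in layers:
        if kind == "conv":
            sizes.append(conv_out(sizes[-1], k))
        elif kind == "pool":
            sizes.append(pool_out(sizes[-1], k))
        else:
            raise ValueError(f"unknown layer kind: {kind}")
    return sizes

# Input(28x28) ==> ConvPool(5x5,2x2) ==> ConvPool(5x5,2x2)
layers = [("conv", 5), ("pool", 2), ("conv", 5), ("pool", 2)]
print(trace_shapes(28, layers))  # [28, 24, 12, 8, 4]
```

Under these assumptions each feature map entering the fully connected layer is 4×4. The number of inputs to that layer then depends on how many feature maps the second convolution produces, which the architecture string does not say.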

| Time | Accuracy | Epochs | Suffer Score |
| --- | --- | --- | --- |
| 2 Hours | 99.13 % | 59 | 2/5 |

Suffer Score comment: Some Python-related errors needed to be dealt with.

Deeplearning4j (DL4J)

Deeplearning4j is a domain-specific language to configure deep neural networks, which are made of multiple layers. Everything starts with a MultiLayerConfiguration, which organizes those layers and their hyperparameters. Hyperparameters are variables that determine how a neural network learns. They include how many times to update the weights of the model, how to initialize those weights, which activation function to attach to the nodes, which optimization algorithm to use, and how fast the model should learn.


Java, Maven



Deep Convolutional network: Input(28×28) ==> ConvPool(5×5,2×2) ==> ConvPool(5×5,2×2) ==> FullyConnected(500) ==> Softmax(10)

| Time | Accuracy | Epochs | Suffer Score |
| --- | --- | --- | --- |
| 30 Minutes | 98.42 % | 58 | 2/5 |

Suffer Score comment: Maven took a long time to download dependencies, and it was inconvenient to have to change the number of epochs.

TensorFlow (99.31%)

TensorFlow is an open source software library for numerical computation using data flow graphs. The graph nodes represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code. TensorFlow also includes TensorBoard, a data visualization toolkit.
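
The "build a graph, then run it" idea can be illustrated with a toy dataflow graph in plain Python. This is only a sketch of the concept and does not use TensorFlow's API. The `Node`, `constant`, `add`, `mul` and `run` names are hypothetical, and the values flowing along the edges are plain numbers rather than tensors.

```python
import operator

class Node:
    """A graph node: either a constant value or an operation on other nodes."""
    def __init__(self, op=None, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

def constant(value):
    return Node(value=value)

def add(a, b):
    return Node(op=operator.add, inputs=(a, b))

def mul(a, b):
    return Node(op=operator.mul, inputs=(a, b))

def run(node):
    """Evaluate a node by first evaluating the nodes feeding into it."""
    if node.op is None:
        return node.value
    return node.op(*(run(n) for n in node.inputs))

# Build the graph for (2 * 3) + 4 first, then execute it.
graph = add(mul(constant(2), constant(3)), constant(4))
print(run(graph))  # 10
```

Separating graph construction from execution is what allows a framework to decide later where each operation runs, for example on a CPU or a GPU.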


Python3, tensorFlow, etc.


Here, we run This network is a convolutional deep neural network.


Deep Convolutional network: Input(28×28) ==> Conv(5×5) ==> Conv(5×5) ==> Conv(4×4) ==> FullyConnected(200) ==> Softmax(10)

| Time | Accuracy | Epochs | Suffer Score |
| --- | --- | --- | --- |
| 20 Minutes | 99.31 % | 20 | 2/5 |

Suffer Score comment: Had to fix one error related to Python.

Microsoft CNTK (?)

CNTK, the Microsoft Cognitive Toolkit, is a framework for deep learning. A Computational Network defines the function to be learned as a directed graph where each leaf node consists of an input value or parameter, and each non-leaf node represents a matrix or tensor operation upon its children. The beauty of CNTK is that once a computational network has been described, all the computation required to learn the network parameters is taken care of automatically. There is no need to derive gradients analytically or to code the interactions between variables for backpropagation.
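
The claim that gradients need not be derived by hand can be illustrated with a toy reverse-mode automatic differentiation sketch. This is not CNTK's API: the `Var` class and `backward` function are hypothetical names, and only scalar addition and multiplication are supported. Leaf nodes hold inputs or parameters, and non-leaf nodes record the operation applied to their children.

```python
class Var:
    """A scalar node that records how it was computed so gradients can flow back."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # pairs of (parent node, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

def backward(output):
    """Accumulate d(output)/d(node) into .grad for every node in the graph."""
    # Order the nodes so that each one is processed after all of its consumers.
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local

# f(w, x, b) = w * x + b
w, x, b = Var(3.0), Var(2.0), Var(1.0)
f = w * x + b
backward(f)
print(f.value, w.grad, x.grad, b.grad)  # 7.0 2.0 3.0 1.0
```

The user only writes the forward expression `w * x + b`; the derivatives with respect to each leaf fall out of the recorded graph.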


These instructions are for Linux because it apparently doesn’t work on MacOS.

Python3, etc.



MacOS: “ModuleNotFoundError: No module named ‘cntk'”.

Kubuntu LTS 16.04: “Import Error: cannot open shared object file: No such file or directory”.

OK, well, this is turning into a PITA. It didn’t work on Mac, and now there are issues with Linux.

Deep Convolutional network: Input(28×28) ==> ConvPool(5×5,3×3) ==> ConvPool(3×3,3×3) ==> Conv(3×3) ==> FullyConnected(96) ==> Softmax(10)

| Time | Accuracy | Epochs | Suffer Score |
| --- | --- | --- | --- |
| ?? | ?? % | 40 | 5/5 |

Suffer Score comment: Git clone took forever. Turns out you cannot run CNTK on a Mac. There is a workaround involving running a Linux container using Docker. The installation instructions for Linux are not straightforward. The pip command is supposed to be pip3 for Python 3.
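
A pip/pip3 mix-up usually means a package was installed for a different interpreter than the one running the script. The snippet below is a small diagnostic sketch, not part of any CNTK tooling. It reports which Python is running and whether a module can be found from it. The function name is made up here, and the suggested install command assumes the package name on the package index matches the module name.

```python
import importlib.util
import sys

def describe_environment(module_name):
    """Report which interpreter is running and whether a module is importable from it."""
    spec = importlib.util.find_spec(module_name)
    return {
        "python": sys.executable,
        "version": "{}.{}".format(*sys.version_info[:2]),
        "module": module_name,
        "installed": spec is not None,
    }

info = describe_environment("cntk")
print(info)
if not info["installed"]:
    # Installing with "<this interpreter> -m pip" targets the same Python
    # that will later run the import, which avoids pip/pip3 confusion.
    print(f"try: {info['python']} -m pip install {info['module']}")
```

This only addresses the "No module named" case. A missing shared object file, as in the Linux error above, means the module was found but one of its native dependencies was not, and this check would not detect that.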

Why does it need to be so complicated? In the end it didn’t work, so I’ve given up.

Torch (??)

At the heart of Torch are the popular neural network and optimization libraries which are simple to use, while having maximum flexibility in implementing complex neural network topologies. You can build arbitrary graphs of neural networks, and parallelize them over CPUs and GPUs in an efficient manner.

After at least one hour of googling, I was unable to find a tutorial or coherent instructions on how to install Torch7 and run a CNN MNIST demo. I opened an issue on Torch’s Google Group:!topic/torch7/5K_yS8Q2LIA.


| Time | Accuracy | Epochs | Suffer Score |
| --- | --- | --- | --- |
| ?? | ?? % | 40 | 5/5 |

Caffe2 (??)

Caffe2 is a lightweight, modular, and scalable deep learning framework. Building on the original Caffe, Caffe2 is designed with expression, speed, and modularity in mind.

After at least one hour of googling, I was unable to find a tutorial or coherent instructions on how to install Caffe2 and run a CNN MNIST demo. The best I could find was installation instructions and a separate tutorial with lots of code but no instructions on how to download or run it.


| Time | Accuracy | Epochs | Suffer Score |
| --- | --- | --- | --- |
| ?? | ?? % | 40 | 5/5 |

PyTorch (99.09%)

PyTorch is a Python-based scientific computing package targeted at two sets of audiences: 1) a replacement for NumPy to use the power of GPUs and 2) a deep learning research platform that provides maximum flexibility and speed.


Python3, tensorFlow, etc.


Here, we run mnist/ This network is a convolutional deep neural network.


Deep Convolutional network: Input(28×28) ==> ConvPool(5×5,2×2) ==> ConvPool(5×5,2×2) ==> FullyConnected(320) ==> FullyConnected(50) ==> Softmax(10)

| Time | Accuracy | Epochs | Suffer Score |
| --- | --- | --- | --- |
| 30 Minutes | 99.09 % | 58 | 1/5 |

Suffer Score comment: The absolute least effort of all the frameworks.

MNIST Experiment Summary

Below is a summary of the short list MNIST experiments including time to run, accuracy and suffer score.

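| Framework | Time | Accuracy | Epochs | Suffer Score |
| --- | --- | --- | --- | --- |
| neuralnetworksanddeeplearning | 2 Hours | 99.13 % | 59 | 2/5 |
| TensorFlow | 20 Minutes | 99.31 % | 20 | 2/5 |
| DL4J | 30 Minutes | 98.42 % | 58 | 2/5 |
| PyTorch | 30 Minutes | 99.09 % | 58 | 1/5 |
| CNTK | | | | 5/5 |
| Caffe2 | | | | 5/5 |
| Torch7 | | | | 5/5 |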

After at least an hour of trying, I completely gave up on CNTK, Caffe2 and Torch7. I’m sure other people with more experience in the technologies related to those frameworks could have got them running more easily than I did, but this experiment is from my perspective as a relative beginner with deep learning frameworks and limited background in Python, Lua, etc. My success or lack thereof for each framework reflects not only the code, but also the documentation, cross-platform compatibility and the availability of beginner tutorials to follow for MNIST. PyTorch turned out to be the absolute simplest to run, working right out of the box.

The model accuracies were more or less the same, as expected. Future SWaP (size, weight and power) comparisons will probably be done against TensorFlow, DL4J and PyTorch.
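
As a rough cross-check of the summary numbers, the short script below computes minutes per epoch and the spread in accuracy for the four frameworks that ran. The inputs are the approximate times quoted in the table, each for a different network architecture, so the per-epoch values are indicative only and not a controlled benchmark.

```python
# (framework, minutes, accuracy %, epochs) as reported in the summary table.
results = [
    ("neuralnetworksanddeeplearning", 120, 99.13, 59),
    ("TensorFlow", 20, 99.31, 20),
    ("DL4J", 30, 98.42, 58),
    ("PyTorch", 30, 99.09, 58),
]

def minutes_per_epoch(minutes, epochs):
    return minutes / epochs

accuracies = [acc for _, _, acc, _ in results]
spread = round(max(accuracies) - min(accuracies), 2)

for name, minutes, acc, epochs in results:
    rate = minutes_per_epoch(minutes, epochs)
    print(f"{name:30s} {rate:5.2f} min/epoch  {acc:.2f} %")
print(f"accuracy spread: {spread} percentage points")
```

All four accuracies fall within 0.89 percentage points of each other, which is consistent with calling them more or less the same.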
