There is a lot of activity in the neural network hardware space. Intel just bought Nervana for $400m and Movidius for an undisclosed amount. Both make dedicated silicon to run and train neural networks. Most other chipset vendors I have talked to are similarly interested in adding direct support for neural networks to future chips. I think there is some risk in this approach. Here is why.
Most of the time spent executing a neural network goes into massive matrix operations such as convolution and matrix multiplication. The state of the art is to run these operations on GPUs, because GPUs have a lot of ALUs and are well optimized for massively data-parallel tasks. If you spend time optimizing neural networks for GPUs (we do!), you probably know that a well-optimized engine achieves about 40-90% efficiency (ALU utilization). Dedicated neural network silicon aims to raise that to 100%, which is great, but in the end that amounts to only about a 2x speedup.
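To make the headroom argument concrete, here is a back-of-the-envelope sketch (illustrative arithmetic only, not a benchmark): if a GPU kernel already keeps some fraction of the ALUs busy, perfect utilization can buy at most the reciprocal of that fraction.

```python
# Back-of-the-envelope headroom calculation: if a GPU kernel already keeps a
# fraction `utilization` of the ALUs busy, moving to 100% utilization on
# dedicated silicon can speed it up by at most 1 / utilization.

def max_speedup_at_full_utilization(utilization: float) -> float:
    """Upper bound on the speedup from raising ALU utilization to 100%."""
    return 1.0 / utilization

for utilization in (0.40, 0.65, 0.90):
    bound = max_speedup_at_full_utilization(utilization)
    print(f"{utilization:.0%} utilized -> at most {bound:.1f}x from dedicated silicon")
```

At the 40-90% utilization range quoted above, the bound works out to roughly 1.1x-2.5x, which is where the "about 2x" figure comes from.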
The problematic part is that chipset changes have a long lead time (2-3 years), and you have to commit today to the architectures of tomorrow. And that's where things get tricky. A paper published earlier this year showed that neural networks can be binarized, which reduces the precision of the weights to 1 bit (-1, +1). Slow, energy-inefficient floating-point math turns into very efficient binary math (XNOR), which speeds up the network on existing commodity silicon by 400% with a very small loss in accuracy. Commodity GPUs support this because they are relatively general-purpose computers. Dedicated neural network silicon is much more likely to be designed for one specific mode of computation.
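As a rough illustration of why general-purpose hardware handles this well, here is a minimal Python sketch of a binarized dot product. The bit encoding (+1 as bit 1, -1 as bit 0) and the function name are my own assumptions for illustration, not taken from the paper.

```python
# Illustrative sketch of the binary trick: with weights and activations
# restricted to {-1, +1} and packed into machine words (+1 -> bit 1,
# -1 -> bit 0), a dot product becomes an XNOR followed by a popcount --
# plain integer operations that general-purpose silicon already runs cheaply.

def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two n-element {-1, +1} vectors packed as n-bit integers."""
    xnor = ~(a_bits ^ b_bits) & ((1 << n) - 1)   # bit is 1 where the signs agree
    matches = bin(xnor).count("1")               # popcount
    return 2 * matches - n                       # agreements minus disagreements

a_bits = 0b1101  # packs [+1, -1, +1, +1] (least significant bit = first element)
b_bits = 0b1011  # packs [+1, +1, -1, +1]
print(binary_dot(a_bits, b_bits, 4))  # 0, matching the floating-point dot product
```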
In the last 12 months or so alone, we have seen dramatic advances in our understanding of how to train and evaluate neural networks. Binarization is just one of them, and it's fair to expect that over the next 2-3 years similarly impactful advances will be published. Committing to hardware designs based on today's understanding of neural networks is, in my opinion, ill advised. My recommendation to chipset vendors is to beef up their GPUs and DSPs, and to support a wide range of fixed-point and floating-point resolutions in a fairly generic manner. It's more likely that this will cover what we'll want from silicon over the next 2-3 years.
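For concreteness, here is one hedged sketch of what "a wide range of fixed-point resolutions" can look like on the software side: symmetric 8-bit quantization with a per-tensor scale. The scheme and names are illustrative assumptions on my part, not a reference to any particular vendor's API.

```python
# Sketch of symmetric fixed-point quantization (illustrative, not a vendor API):
# float weights are mapped to signed integers plus one per-tensor scale factor.

import numpy as np

def quantize_symmetric(weights: np.ndarray, bits: int = 8):
    """Map float weights to signed fixed-point integers (bits <= 8) plus a scale."""
    qmax = 2 ** (bits - 1) - 1                    # e.g. 127 for 8-bit
    scale = np.abs(weights).max() / qmax or 1.0   # guard against all-zero tensors
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_symmetric(w, bits=8)
print("max abs quantization error:", np.abs(w - dequantize(q, scale)).max())
```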