Microsoft Wins ImageNet Using Extremely Deep Neural Networks
Written by Mike James
Tuesday, 15 December 2015

While just about everyone else is forming foundations and institutes to further AI, some researchers are actually getting on with doing it. This year's ImageNet competition has been won by Microsoft, which comes as something of a surprise. 


It is a surprise because, overall, it is Google that makes the most noise about AI and, in the popular mind at least, Google is miles ahead of the competition. In truth, all of the big companies engaged in the race to bring AI to the masses are really just fine-tuning the same basic approach to the problem - the Deep Neural Network.

So how did Microsoft do it?

The main ImageNet competition is about who can turn in the best, i.e. lowest, error rate on a 100,000-photo database classified into 1,000 object categories. A side task is to locate the object in the picture. Microsoft managed an error rate of 3.5% and a localization error of 9%. Google's previously winning network turned in a similar figure for the error rate, but for localization the difference was larger, with a 19% error.


In previous years, neural networks with 30 or so layers came in first. This year the same basic approach yielded improvements by going deeper: Microsoft's network was really deep, at 150 layers. To get there the team had to overcome a fundamental problem inherent in training deep neural networks. As a network gets deeper, training becomes more difficult, so you encounter the seemingly paradoxical situation that adding layers makes performance worse.

The solution proposed is called deep residual learning. While the general idea is motivated by reasonable assumptions, the reason it actually works in practice is still not fully understood.

The idea is that if an n-layer network learns a task reasonably well, adding more layers should produce performance that is at least as good - because that is what you get if the extra layers are set to the identity transformation.

The proposed method reformulates the learning task to make it easier for the standard learning algorithm to learn an identity transformation. Of course, in practice it is unlikely that an identity transformation is optimal, but the method seems to work more generally and finds better solutions.

To quote from the paper explaining the work:

"In real cases, it is unlikely that identity mappings are optimal, but our reformulation may help to precondition the problem."

The new architecture can be implemented using existing systems, and the team explored even deeper networks - up to 1,000 layers - but the results weren't as good, presumably due to overfitting: for a model of that size the dataset was comparatively small.

So it seems we are entering the era of not just Deep Neural Networks but of Extremely Deep Neural Networks. 

One of the recurring themes in the development of neural networks, a point often made by Geoffrey Hinton, is that we have had the answer all along - the neural network invented decades ago was just not deep enough. Since then each breakthrough has involved finding ways of effectively training ever deeper networks - and so the trend continues.


More Information

Deep Residual Learning for Image Recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

Related Articles

The Flaw In Every Neural Network Just Got A Little Worse

The Deep Flaw In All Neural Networks 

The Flaw Lurking In Every Deep Neural Net  

Neural Networks Describe What They See       

Neural Turing Machines Learn Their Algorithms       

Learning To Be A Computer       

Google's Deep Learning AI Knows Where You Live And Can Crack CAPTCHA

Google Uses AI to Find Where You Live      

Deep Learning Researchers To Work For Google

Google Explains How AI Photo Search Works

Google Has Another Machine Vision Breakthrough?

The Triumph Of Deep Learning












