Must Read Research Papers on Machine Learning

Machine Learning is one of the hottest research domains in computing. These are the must-read research papers on Machine Learning that have revolutionized the domain of ML:

  1. Optimal Brain Damage
  2. All You Need Is A Good Init
  3. Reducing the dimensionality of data with neural networks
  4. ImageNet Classification with Deep Convolutional Neural Networks
  5. Dropout: a simple way to prevent neural networks from overfitting
  6. Learning internal representations by error-propagation
  7. Efficient backprop
  8. Generative adversarial nets
  9. How transferable are features in deep neural networks?
  10. Very deep convolutional networks for large-scale image recognition
  11. Intriguing properties of neural networks
  12. Generating text with recurrent neural networks

Optimal Brain Damage

The research paper "Optimal Brain Damage" is one of the most influential papers in Machine Learning, as it gave rise to the concept of Pruning.

This is based on the idea that if some of the weights of a Neural Network are removed, the network can still perform well (sometimes even better) while being smaller in size. The analogy is that even if a part of the human brain is damaged, its overall functionality is not necessarily reduced.
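
A minimal sketch of the pruning idea, using simple magnitude-based ranking as a stand-in for the paper's second-derivative saliency measure (the weight matrix and prune fraction below are illustrative):

```python
import numpy as np

# Toy layer weights; in practice these come from a trained network.
rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 8))

# Rank weights by a saliency proxy (here: absolute magnitude) and remove
# the least salient half, as a simplified stand-in for the paper's
# second-derivative-based saliency.
prune_fraction = 0.5
threshold = np.quantile(np.abs(weights), prune_fraction)
mask = np.abs(weights) >= threshold
pruned = weights * mask

print("non-zero weights before:", np.count_nonzero(weights))
print("non-zero weights after: ", np.count_nonzero(pruned))
```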

Paper: Optimal Brain Damage
Authors: Yann Le Cun, John S Denker and Sara A Solla
Affiliation: AT&T Bell Laboratories
Published: 1989
Citations: 4021
OpenGenus rating on Influence: 9.4

All You Need Is A Good Init

This paper is influential because it shows just how important weight initialization is. In short, the starting point should be crafted carefully, as it directly impacts how well a deep Neural Network trains.
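
A rough sketch in the spirit of the paper's layer-sequential unit-variance (LSUV) initialization, for a single toy layer: start from orthonormal weights, then rescale them until the layer's outputs have unit variance on a batch of data:

```python
import numpy as np

rng = np.random.default_rng(0)
fan_in, fan_out = 256, 128
x = rng.normal(size=(64, fan_in))        # a batch of inputs to this layer

# Step 1: orthonormal initialization via SVD of a Gaussian matrix.
a = rng.normal(size=(fan_in, fan_out))
u, _, vt = np.linalg.svd(a, full_matrices=False)
w = u @ vt

# Step 2: rescale the weights until the layer's outputs have unit variance.
for _ in range(10):
    var = np.var(x @ w)
    if abs(var - 1.0) < 1e-3:
        break
    w /= np.sqrt(var)

print("output variance after init:", np.var(x @ w))
```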

Paper: All You Need Is A Good Init
Authors: Dmytro Mishkin and Jiri Matas
Affiliation: Czech Technical University in Prague
Published: 2015
Citations: 508
OpenGenus rating on Influence: 4.6

Reducing the dimensionality of data with neural networks

This is a fundamental paper: it shows that a Neural Network (an autoencoder) can reduce the dimensionality of data, provided the initial weights are well tuned, and that this works better than traditional dimensionality-reduction techniques like Principal Component Analysis.
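
A minimal sketch of the idea with a single linear encoder/decoder pair trained on toy data (the paper uses deep autoencoders whose weights are carefully pre-trained):

```python
import numpy as np

# Toy data that really lives on 2 dimensions embedded in 10.
rng = np.random.default_rng(0)
x = rng.normal(size=(500, 10))
x[:, 2:] = x[:, :2] @ rng.normal(scale=0.5, size=(2, 8))

w_enc = rng.normal(scale=0.1, size=(10, 2))   # encoder: 10 -> 2
w_dec = rng.normal(scale=0.1, size=(2, 10))   # decoder: 2 -> 10

lr = 0.01
for _ in range(5000):
    code = x @ w_enc                          # low-dimensional code
    recon = code @ w_dec                      # reconstruction
    err = recon - x
    g_dec = code.T @ err / len(x)             # gradients of squared error
    g_enc = x.T @ (err @ w_dec.T) / len(x)
    w_dec -= lr * g_dec
    w_enc -= lr * g_enc

print("reconstruction MSE:", np.mean((x @ w_enc @ w_dec - x) ** 2))
```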

Paper: Reducing the dimensionality of data with neural networks
Authors: Geoffrey E. Hinton and Ruslan R. Salakhutdinov
Affiliation: University of Toronto
Published: 2006
Citations: 15,500+
OpenGenus rating on Influence: 7.0

ImageNet Classification with Deep Convolutional Neural Networks

This is a fundamental paper as it demonstrates how to train a deep Convolutional Neural Network to perform well on a large-scale task like Image Classification. It brought together several key ideas now used across ML, such as ReLU activations, dropout and GPU training, and hence is a must read.
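
A toy sketch of two of the ingredients the paper popularized, a 2-D convolution followed by the ReLU non-linearity (the real network stacks many such layers, trains them on GPUs and adds dropout and data augmentation):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation) of a single channel."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(0.0, x)

rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))      # toy single-channel "image"
kernel = rng.normal(size=(3, 3))     # one learnable filter
feature_map = relu(conv2d(image, kernel))
print(feature_map.shape)             # (6, 6)
```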

Paper: ImageNet Classification with Deep Convolutional Neural Networks
Authors: Alex Krizhevsky, Ilya Sutskever and Geoffrey E. Hinton
Affiliation: University of Toronto
Published: 2012
Citations: 83,000+
OpenGenus rating on Influence: 7.9

Dropout: a simple way to prevent neural networks from overfitting

This paper brings in the idea that during training of a Neural Network, randomly selected units (along with their connections) are temporarily dropped out, which prevents the network from overfitting. This was a major step in making Neural Networks practical.
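
A minimal sketch of (inverted) dropout applied to one layer's activations; the keep probability and toy shapes are illustrative:

```python
import numpy as np

# A batch of toy activations from one hidden layer.
rng = np.random.default_rng(0)
activations = rng.normal(size=(4, 10))

# Training: keep each unit with probability p_keep, zero out the rest,
# and rescale so the expected activation matches test time.
p_keep = 0.5
mask = rng.random(activations.shape) < p_keep
train_out = activations * mask / p_keep

# Test: the full network is used, no units are dropped.
test_out = activations

print("units dropped in this batch:", int((~mask).sum()))
```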

Paper: Dropout: a simple way to prevent neural networks from overfitting
Authors: Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, Ruslan Salakhutdinov
Affiliation: University of Toronto and Google
Published: 2014
Citations: 29,000+
OpenGenus rating on Influence: 8.5

Learning internal representations by error-propagation

This is a fundamental paper and one of the earliest on Neural Networks. It introduces learning by error propagation (backpropagation) and uses it to solve classic problems like the XOR problem and the encoding problem. It is a must read if you want to understand ML from its originators and how it progressed over the years.
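
A rough sketch of learning XOR by error propagation with a small 2-4-1 sigmoid network; the hidden size, learning rate and number of iterations are illustrative choices, not the paper's:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The four XOR input/output pairs.
x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
w2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

lr = 0.5
for _ in range(10000):
    h = sigmoid(x @ w1 + b1)                 # forward pass
    out = sigmoid(h @ w2 + b2)
    d_out = (out - y) * out * (1 - out)      # error at the output layer
    d_h = (d_out @ w2.T) * h * (1 - h)       # error propagated backwards
    w2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    w1 -= lr * x.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())   # predictions for inputs 00, 01, 10, 11
```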

Paper: Learning internal representations by error-propagation
Authors: David E Rumelhart, Geoffrey E Hinton, Ronald J Williams
Affiliation: University of California, San Diego
Published: 1986
Citations: 28,000+
OpenGenus rating on Influence: 5.0

Efficient backprop

Backpropagation is the most critical operation in training a Neural Network, but naive implementations run into practical problems. This paper presents solutions to those problems and explains why they work. It is an insightful paper and a must read if you plan to understand ML fundamentally.
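
One of the paper's practical recommendations is to normalize each input feature to zero mean and unit variance so gradient descent is not skewed by differently scaled inputs; a minimal sketch on toy features:

```python
import numpy as np

# Toy features with very different means and scales.
rng = np.random.default_rng(0)
x = rng.normal(loc=[5.0, -3.0, 100.0], scale=[1.0, 10.0, 50.0], size=(200, 3))

# Normalize each feature to zero mean and unit variance.
x_norm = (x - x.mean(axis=0)) / x.std(axis=0)

print("means:", x_norm.mean(axis=0).round(3))   # ~0 for every feature
print("stds: ", x_norm.std(axis=0).round(3))    # ~1 for every feature
```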

Paper: Efficient backprop
Authors: Yann A. LeCun, Léon Bottou, Genevieve B. Orr, Klaus-Robert Müller
Affiliation: AT&T Labs, Willamette University, GMD FIRST
Published: 1998
Citations: 400+
OpenGenus rating on Influence: 8.4

Generative adversarial nets

This is one of the most influential recent papers in ML: it introduces the framework of Generative Adversarial Nets (GANs). GANs have proved successful in several applications and have resolved some of the challenges faced by earlier generative modelling techniques.

This is a must read.
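
A toy sketch of the minimax value function V(D, G) that the discriminator D maximizes and the generator G minimizes; the hand-written one-dimensional D and G below stand in for trained networks:

```python
import numpy as np

def D(x):
    """Toy discriminator: probability that a sample is real."""
    return 1.0 / (1.0 + np.exp(-x))

def G(z):
    """Toy generator: maps noise to fake samples."""
    return 2.0 * z + 1.0

rng = np.random.default_rng(0)
real = rng.normal(loc=1.0, scale=0.5, size=1000)
fake = G(rng.normal(size=1000))

# V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]
# Training alternates: update D to maximize V, update G to minimize it.
v = np.mean(np.log(D(real))) + np.mean(np.log(1.0 - D(fake)))
print("V(D, G) =", v)
```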

Paper: Generative adversarial nets
Authors: Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio
Affiliation: Universite de Montreal
Published: 2014
Citations: 33,000+
OpenGenus rating on Influence: 9.7

How transferable are features in deep neural networks?

The idea is to train a model on one task and reuse what it learnt in another model for a different task. This comes from the observation that the features learnt in the first layers are nearly the same across different models and tasks, and resemble Gabor filters.

This is a significant paper as it gives us critical insight into the working of ML models.
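
A rough sketch of the transfer recipe: freeze first-layer weights learnt on one task and train only a new output layer on another task (all names, shapes and data here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend these first-layer weights were learnt on task A.
w1_taskA = rng.normal(size=(20, 16))

# Toy data and labels for a new task B.
x_taskB = rng.normal(size=(100, 20))
y_taskB = (x_taskB.sum(axis=1, keepdims=True) > 0).astype(float)

# Transfer: reuse the task-A layer as a frozen feature extractor.
features = np.maximum(0.0, x_taskB @ w1_taskA)

# Only the new output layer is trained on task B.
w2 = rng.normal(scale=0.1, size=(16, 1))
lr = 0.1
for _ in range(500):
    pred = 1.0 / (1.0 + np.exp(-(features @ w2)))
    w2 -= lr * features.T @ (pred - y_taskB) / len(x_taskB)

print("task-B training accuracy:", np.mean((pred > 0.5) == y_taskB))
```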

Paper: How transferable are features in deep neural networks?
Authors: Jason Yosinski, Jeff Clune, Yoshua Bengio, Hod Lipson
Affiliation: Cornell University, University of Wyoming and University of Montreal
Published: 2014
Citations: 6000+
OpenGenus rating on Influence: 8.5

Very deep convolutional networks for large-scale image recognition

This paper investigated the impact of a network's depth on its accuracy in large-scale image recognition, pushing depth to 16-19 layers by using very small (3 x 3) convolution filters. This has significant practical implications and is a must read.
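
A small sketch of the design choice behind that depth, stacking 3 x 3 convolutions so that the receptive field grows with depth while keeping the parameter count low:

```python
# Each stride-1 3x3 convolution grows the receptive field by 2 pixels,
# so two stacked layers see 5x5, three see 7x7, and so on.
def receptive_field(num_3x3_layers):
    field = 1
    for _ in range(num_3x3_layers):
        field += 2
    return field

for n in (1, 2, 3):
    rf = receptive_field(n)
    print(f"{n} stacked 3x3 conv layer(s) -> {rf}x{rf} receptive field")
```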

Paper: [Very deep convolutional networks for large-scale image recognition](https://arxiv.org/pdf/1409.1556.pdf)
Authors: Karen Simonyan, Andrew Zisserman
Affiliation: University of Oxford
Published: 2014
Citations: 60,000+
OpenGenus rating on Influence: 9.0

Intriguing properties of neural networks

This is one of the most significant papers for understanding the inner workings of Neural Networks. It highlights two counter-intuitive properties: semantic information is carried by the whole space of activations rather than by individual units, and imperceptibly small perturbations of the input (adversarial examples) can change a network's prediction.

If you want to make core contributions in ML, this is a must read paper.
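
A toy illustration of the adversarial-example property on a linear classifier; the paper finds such perturbations for deep networks with a box-constrained optimization, so this linear case only shows the flavour:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=20)        # toy linear classifier: sign(w . x)
x = rng.normal(size=20)        # an input it currently classifies

# The smallest-norm direction that flips a linear decision is along w;
# nudge the input just past the decision boundary.
score = w @ x
epsilon = 1.1 * abs(score) / np.linalg.norm(w) ** 2
x_adv = x - np.sign(score) * epsilon * w

print("original score: ", w @ x)
print("perturbed score:", w @ x_adv)          # sign has flipped
print("perturbation / input norm:",
      np.linalg.norm(x_adv - x) / np.linalg.norm(x))
```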

Paper: Intriguing properties of neural networks
Authors: Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, Rob Fergus
Affiliation: Google, New York University, University of Montreal and Facebook
Published: 2013
Citations: 7600+
OpenGenus rating on Influence: 9.5

Generating text with recurrent neural networks

Recurrent Neural Networks are powerful sequence models, but they were not widely used because they are hard to train properly. This paper takes a practical approach, overcomes most of those challenges, and demonstrates how to generate text character by character with an RNN.
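
A minimal sketch of character-level sampling from a vanilla RNN with untrained toy weights; the paper's model is a multiplicative RNN trained with second-order (Hessian-free) optimization, which is omitted here:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = sorted(set("hello world"))         # toy character vocabulary
v, h_size = len(vocab), 16

# Untrained toy weights: input-to-hidden, hidden-to-hidden, hidden-to-output.
w_xh = rng.normal(scale=0.1, size=(v, h_size))
w_hh = rng.normal(scale=0.1, size=(h_size, h_size))
w_hy = rng.normal(scale=0.1, size=(h_size, v))

def sample(first_char, length):
    h = np.zeros(h_size)
    idx = vocab.index(first_char)
    out = [first_char]
    for _ in range(length):
        x = np.zeros(v)
        x[idx] = 1.0                                  # one-hot current character
        h = np.tanh(x @ w_xh + h @ w_hh)              # recurrent hidden state
        logits = h @ w_hy
        probs = np.exp(logits) / np.exp(logits).sum() # next-character distribution
        idx = rng.choice(v, p=probs)
        out.append(vocab[idx])
    return "".join(out)

print(sample("h", 20))   # gibberish until the weights are trained
```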

Paper: Generating text with recurrent neural networks
Authors: Ilya Sutskever, James Martens, Geoffrey E Hinton
Affiliation: University of Toronto
Published: 2011
Citations: 1447
OpenGenus rating on Influence: 9.2

With this article at OpenGenus, you have a clear idea of the must-read research papers on Machine Learning.