This page summarises the deep learning resources I’ve consulted in my album cover classification project.
Tutorials and blog posts
- Convolutional Neural Networks for Visual Recognition Stanford course notes: an excellent resource, very up-to-date and useful, despite still being a work in progress
- DeepLearning.net’s Theano-based tutorials: not as up-to-date as the Stanford course notes, but still a good introduction to some of the theory and general Theano usage
- Lasagne’s documentation and tutorials: still a bit lacking, but good when you know what you’re looking for
- lasagne4newbs: Lasagne’s convnet example with richer comments
- Using convolutional neural nets to detect facial keypoints tutorial: the resource that made me want to use Lasagne
- Classifying plankton with deep neural networks: an epic post, which I found while looking for Lasagne examples
- Various Wikipedia pages: a bit disappointing – the above resources are much better
Papers
- Adam: A Method for Stochastic Optimization (Kingma and Ba, 2015): an optimiser that improves on SGD with Nesterov momentum, AdaGrad and RMSProp; I found it useful in practice
- Algorithms for Hyper-Parameter Optimization (Bergstra et al., 2011): the work behind Hyperopt – pretty useful stuff, not only for deep learning
- Convolutional Neural Networks at Constrained Time Cost (He and Sun, 2014): interesting experimental work on the tradeoffs between number of filters, filter sizes, and depth – deeper is better (but with diminishing returns); smaller filter sizes are better; delayed subsampling and spatial pyramid pooling are helpful
- Deep Learning in Neural Networks: An Overview (Schmidhuber, 2014): 88 pages with 888 references (only 35 pages of content) – good for finding references, but a bit hard to follow; not so good for understanding how the various methods work and how to use or implement them
- Going deeper with convolutions (Szegedy et al., 2014): the GoogLeNet paper – interesting and compelling results, especially given the improvement in performance while reducing computational complexity
- ImageNet Classification with Deep Convolutional Neural Networks (Krizhevsky et al., 2012): the classic paper that arguably started (or significantly boosted) the recent buzz around deep learning – many interesting ideas; fairly accessible
- On the importance of initialization and momentum in deep learning (Sutskever et al., 2013): applying Nesterov momentum to deep learning – good read, simple concept, interesting results
- Random Search for Hyper-Parameter Optimization (Bergstra and Bengio, 2012): very compelling reasoning and experiments showing that random search outperforms grid search in many cases
- Recognizing Image Style (Karayev et al., 2014): identifying image style, which is similar to album genre – the authors found that using models pretrained on ImageNet yielded the best results in some cases
- Very Deep Convolutional Networks for Large-Scale Image Recognition (Simonyan and Zisserman, 2014): the VGGNet paper – interesting experiments and architectures that are deep and homogeneous
- Visualizing and Understanding Convolutional Networks (Zeiler and Fergus, 2013): interesting work on visualisation, but I’ll need to apply it to understand it better
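
The reasoning in the random search paper (Bergstra and Bengio, 2012) is easy to try out on a toy problem. The following is only a minimal sketch, not code from the paper or from my project: `toy_loss`, the two hyperparameter names, and the optimum at 0.37 are all made up for illustration. It assumes a loss that depends on just one of two hyperparameters, which is the kind of situation where a grid spends most of its trials repeating the same values of the parameter that matters.

```python
import itertools
import random


def toy_loss(learning_rate, dropout):
    # Hypothetical objective: only learning_rate affects the result,
    # dropout is deliberately ignored to mimic an unimportant hyperparameter.
    return (learning_rate - 0.37) ** 2


def grid_search(points_per_axis=3):
    # Evenly spaced values in [0, 1] on each axis, combined exhaustively.
    values = [i / (points_per_axis - 1) for i in range(points_per_axis)]
    return list(itertools.product(values, values))


def random_search(num_trials=9, seed=0):
    # Same trial budget as the grid, but each trial draws both values uniformly.
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(num_trials)]


def summarise(trials):
    # Number of distinct learning rates tried, and the best loss among the trials.
    distinct = len({lr for lr, _ in trials})
    best = min(toy_loss(lr, dropout) for lr, dropout in trials)
    return distinct, best


grid_distinct, grid_best = summarise(grid_search())
rand_distinct, rand_best = summarise(random_search())

print("grid:   %d distinct learning rates, best loss %.4f" % (grid_distinct, grid_best))
print("random: %d distinct learning rates, best loss %.4f" % (rand_distinct, rand_best))
```

With nine trials each, the grid only ever tries three distinct learning rates, while random search tries nine. On a real problem the important hyperparameters are not known in advance, which is why spreading the trials randomly tends to be the safer default.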