Yesterday I posted a link about Google’s new photo search and how “deep learning” makes it possible. Here are some weekend reads on the topic. The two videos focus on different aspects of the concept: the first explains the idea and its implications, and the second covers how to use a large distributed computing setup to train the multiple layers.
What is deep learning: Deep learning is a set of algorithms in machine learning that attempt to learn layered models of inputs, commonly neural networks. The layers in such models correspond to distinct levels of concepts, where higher-level concepts are defined from lower-level ones, and the same lower-level concepts can help to define many higher-level concepts.
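To make the "layered model" idea concrete, here is a minimal sketch in plain Python. It is only an illustration: the names `layer`, `forward`, and `LAYERS` are made up for this post, and the weights are hand-picked rather than learned, which is the part real deep learning algorithms actually do. What it does show is the structure: layer 1 turns raw inputs into two low-level features, and layer 2 builds three higher-level units that all reuse those same two features.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases):
    """One layer: each output unit is a weighted sum of ALL inputs plus a bias,
    passed through a sigmoid. One row of `weights` per output unit."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Run the inputs through every layer in order.
    Returns the activations of each level, starting with the raw inputs."""
    activations = [inputs]
    for weights, biases in layers:
        activations.append(layer(activations[-1], weights, biases))
    return activations


# Hypothetical, hand-picked weights (a trained network would learn these).
LAYERS = [
    # Layer 1: 3 raw inputs -> 2 low-level features.
    ([[1.0, -1.0, 0.5], [0.5, 1.0, -1.0]], [0.0, 0.0]),
    # Layer 2: 2 low-level features -> 3 higher-level units.
    # Every row reads both low-level features, so the same lower-level
    # features help define several higher-level ones.
    ([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0, 0.0]),
]

acts = forward([1.0, 0.0, 1.0], LAYERS)
for depth, values in enumerate(acts):
    print("level", depth, [round(v, 3) for v in values])
```

Stacking more entries in `LAYERS` gives a deeper model. The hard part, and the subject of both talks below, is learning good weights for all of those layers from data.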
Deep learning – The Biggest Data Science Breakthrough of the Decade (60m video): This one talks more about the historical context and the Kaggle machine learning platform. It is not particularly technically challenging, but it is interesting because it covers many applications of the technology. Machine learning and AI have appeared on the front page of the New York Times three times in recent memory: 1) When a computer beat the world’s #1 chess player 2) When Watson beat the world’s best Jeopardy players 3) When deep learning algorithms won a chemo-informatics Kaggle competition. We all know about the first two… but what’s that deep learning thing about?
Tera-scale deep learning – Quoc V. Le (60m video): Deep learning and unsupervised feature learning offer the potential to transform many domains such as vision, speech, and NLP. However, these methods have been fundamentally limited by our computational abilities, and typically applied to small-sized problems. In this talk, I describe the key ideas that enabled scaling deep learning algorithms to train a very large model on a cluster of 16,000 CPU cores (2000 machines). This network has 1.15 billion parameters, which is more than 100x larger than the next largest network reported in the literature. (via: data science 101)