Introduction:
--Deep learning (also known as deep structured learning or hierarchical learning) is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms. Learning can be supervised, semi-supervised or unsupervised.
--Some representations are loosely based on an interpretation of information processing and communication patterns in a biological nervous system, such as neural coding, which attempts to define a relationship between various stimuli and the associated neuronal responses in the brain. Research attempts to create efficient systems that learn these representations from large-scale, unlabeled data sets.
--Deep learning architectures such as deep neural networks, deep belief networks and recurrent neural networks have been applied to fields including computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation and bioinformatics where they produced results comparable to and in some cases superior to human experts.
Deep learning is a class of machine learning algorithms that:
(a) use a cascade of multiple layers of nonlinear processing units for feature extraction and transformation, where each successive layer uses the output from the previous layer as input;
(b) learn in supervised (e.g., classification) and/or unsupervised (e.g., pattern analysis) manners;
(c) learn multiple levels of representations that correspond to different levels of abstraction, where the levels form a hierarchy of concepts;
(d) use some form of gradient descent for training via backpropagation (see the sketch after this list).
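--To make points (a)-(d) concrete, here is a minimal sketch in plain NumPy of a two-layer network trained by gradient descent via backpropagation; the synthetic data, layer sizes, and learning rate are illustrative assumptions, not taken from the text above.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))             # 64 samples, 3 input features (assumed)
y = (X.sum(axis=1, keepdims=True) > 0)   # a simple synthetic binary target

W1 = rng.normal(scale=0.1, size=(3, 8))  # layer 1: nonlinear feature extraction
W2 = rng.normal(scale=0.1, size=(8, 1))  # layer 2: consumes layer 1's output

for step in range(500):
    h = np.tanh(X @ W1)                  # each layer feeds the next -- point (a)
    p = 1 / (1 + np.exp(-(h @ W2)))      # predicted probability
    # Backpropagation: chain rule from the loss back through each layer -- point (d).
    grad_logits = (p - y) / len(X)       # gradient of cross-entropy w.r.t. the logits
    grad_W2 = h.T @ grad_logits
    grad_h = (grad_logits @ W2.T) * (1 - h ** 2)  # back through the tanh nonlinearity
    grad_W1 = X.T @ grad_h
    W1 -= 0.5 * grad_W1                  # gradient descent updates
    W2 -= 0.5 * grad_W2
```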
--Layers that have been used in deep learning include hidden layers of an artificial neural network and sets of propositional formulas. They may also include latent variables organized layer-wise in deep generative models, such as the nodes in deep belief networks and deep Boltzmann machines.
Concept:
The assumption underlying distributed representations is that observed data are generated by the interactions of layered factors.
--Deep learning adds the assumption that these layers of factors correspond to levels of abstraction or composition. Varying numbers of layers and layer sizes can provide different degrees of abstraction.
--Deep learning exploits this idea of hierarchical explanatory factors, where higher-level, more abstract concepts are learned from lower-level ones.
--Deep learning architectures are often constructed with a greedy layer-by-layer method. Deep learning helps to disentangle these abstractions and pick out which features are useful for improving performance.
--For supervised learning tasks, deep learning methods obviate feature engineering by translating the data into compact intermediate representations akin to principal components, and they derive layered structures that remove redundancy in the representation.
--Deep learning algorithms can be applied to unsupervised learning tasks. This is an important benefit because unlabeled data are more abundant than labeled data. Examples of deep structures that can be trained in an unsupervised manner are neural history compressors and deep belief networks.
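--As a minimal sketch of unsupervised training, the autoencoder below (assuming Keras/TensorFlow is installed) learns a compact representation from unlabeled data by reconstructing its own input; the layer sizes and random data are illustrative assumptions.

```python
import numpy as np
from tensorflow import keras

X = np.random.rand(1000, 32)  # unlabeled data: no targets are needed

autoencoder = keras.Sequential([
    keras.layers.Input(shape=(32,)),
    keras.layers.Dense(8, activation="relu"),      # encoder: compact intermediate code
    keras.layers.Dense(32, activation="sigmoid"),  # decoder: reconstructs the input
])
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=5, verbose=0)  # the input serves as its own target
```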
Deep neural networks:
--A deep neural network (DNN) is an artificial neural network (ANN) with multiple hidden layers between the input and output layers. Like shallow ANNs, DNNs can model complex non-linear relationships. DNN architectures generate compositional models in which the object is expressed as a layered composition of primitives. The extra layers enable composition of features from lower layers, potentially modeling complex data with fewer units than a similarly performing shallow network.
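--For illustration, a DNN of the kind just described can be written in a few lines; this is a hedged sketch assuming Keras, with arbitrary layer sizes and a 20-feature input:

```python
from tensorflow import keras

dnn = keras.Sequential([
    keras.layers.Input(shape=(20,)),              # input layer: 20 features (assumed)
    keras.layers.Dense(64, activation="relu"),    # hidden layer 1
    keras.layers.Dense(32, activation="relu"),    # hidden layer 2: composes layer 1's features
    keras.layers.Dense(16, activation="relu"),    # hidden layer 3
    keras.layers.Dense(1, activation="sigmoid"),  # output layer
])
dnn.compile(optimizer="adam", loss="binary_crossentropy")
```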
--Deep architectures include many variants of a few basic approaches. Each architecture has found success in specific domains. It is not always possible to compare the performance of multiple architectures, unless they have been evaluated on the same data sets.
--DNNs are typically feedforward networks in which data flows from the input layer to the output layer without looping back.
--Recurrent neural networks (RNNs), in which data can flow in any direction, are used for applications such as language modeling. Long short-term memory is particularly effective for this use.
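--A minimal sketch of an LSTM-based language model (predicting the next token), assuming Keras; the vocabulary size and dimensions are illustrative assumptions:

```python
from tensorflow import keras

vocab_size = 5000  # assumed vocabulary size
lm = keras.Sequential([
    keras.layers.Embedding(vocab_size, 64),         # token ids -> dense vectors
    keras.layers.LSTM(128, return_sequences=True),  # recurrent state carries context forward
    keras.layers.Dense(vocab_size, activation="softmax"),  # distribution over the next token
])
lm.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```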
--Convolutional deep neural networks (CNNs) are used in computer vision. CNNs also have been applied to acoustic modeling for automatic speech recognition (ASR).
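--A minimal CNN sketch for computer vision, again assuming Keras; the input shape (28x28 grayscale) and filter counts are illustrative assumptions:

```python
from tensorflow import keras

cnn = keras.Sequential([
    keras.layers.Input(shape=(28, 28, 1)),          # assumed: 28x28 grayscale images
    keras.layers.Conv2D(16, 3, activation="relu"),  # local feature detectors
    keras.layers.MaxPooling2D(),                    # spatial downsampling
    keras.layers.Conv2D(32, 3, activation="relu"),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation="softmax"),   # assumed: 10 classes
])
cnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```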
Software libraries:
Deeplearning4j—An open-source deep-learning library written for Java/C++ with LSTMs and convolutional networks. It provides parallelization with Spark on CPUs and GPUs, and imports models from Keras, TensorFlow and Theano.
Gensim—A toolkit for natural language processing implemented in the Python programming language.
Keras—An open-source deep learning framework for the Python programming language.
Microsoft CNTK (Computational Network Toolkit)—Microsoft's open-source deep-learning toolkit for Windows and Linux. It provides parallelization with CPUs and GPUs across multiple servers.
MXNet—An open source deep learning framework that allows you to define, train, and deploy deep neural networks. Backed by AWS.
OpenNN—An open source C++ library which implements deep neural networks and provides parallelization with CPUs.
Paddle—An open-source C++/CUDA library with a Python API, providing a scalable deep learning platform on CPUs and GPUs, originally developed by Baidu.
PyTorch (http://pytorch.org/)—Tensors and dynamic neural networks in Python with GPU acceleration. The Python version of Torch, associated with Facebook (see the sketch after this list).
TensorFlow—Google's open-source machine learning library, written in C++ and Python with APIs for both languages. It provides parallelization with CPUs and GPUs.
Torch—An open source software library for machine learning based on the Lua programming language and used by Facebook.
Caffe—A deep learning framework made with expression, speed, and modularity in mind, developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. Focused on image processing.
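--As a minimal illustration of PyTorch's "tensors and dynamic neural networks", the sketch below (assuming PyTorch is installed) builds the computation graph on the fly as ordinary Python executes and obtains gradients via autograd:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()   # the graph is recorded dynamically as this line runs
y.backward()         # backpropagate through the recorded graph
print(x.grad)        # tensor([2., 4., 6.]), since d(x^2)/dx = 2x
```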