first draft of intro addition #246

### What is deep learning?

Deep learning is built on a biologically inspired approach from machine learning
termed neural networks. Each neuron in a computational neural network, termed a
node, has inputs, an activation function, and outputs. Each input value is
multiplied by a weight, and the weighted inputs are combined and summarized by
the activation function. The value of the activation function is then multiplied
by another set of weights to produce the output. `TODO: we probably need a figure
here - I see no way that we don't include this type of description in our paper,
despite the fact that it's been done tons of times before. - I'm really partial
to this nature review's explanation about making non-linear problems linear -
figure 1 [@doi:10.1038/nature14539]` These neural networks are trained by
identifying weights that produce a desired output from some specific input.

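To make this concrete, the following is a minimal sketch of a single node in
Python; NumPy, the sigmoid activation, and all numeric values are illustrative
choices of ours rather than details from the text above.

```python
import numpy as np

def node_output(inputs, weights, bias):
    """A single node: weight each input, sum them, and apply an
    activation function (here, the logistic sigmoid)."""
    weighted_sum = np.dot(inputs, weights) + bias
    return 1.0 / (1.0 + np.exp(-weighted_sum))

# Hypothetical inputs and weights, purely for illustration.
inputs = np.array([0.5, -1.2, 3.0])
weights = np.array([0.4, 0.1, -0.6])
print(node_output(inputs, weights, bias=0.1))  # a single value in (0, 1)
```
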
Neural networks can also be stacked. The outputs from one network can be used as
inputs to another. This process produces a stacked, also known as a multi-layer,
neural network. The multi-layer neural network techniques that underlie deep
learning have a long history. Multi-layer methods have been discussed in the
literature for more than five decades [@doi:10.1103/RevModPhys.34.135]. Given
this context, it's challenging to consider "deep learning" as a new advance,
though the term has only become widespread to describe analysis methods in the
last decade. Much of the early history of neural networks has been extensively
covered in a recent review [@doi:10.1016/j.neunet.2014.09.003]. For the purposes
of this review, we identify deep learning approaches as those that use
multi-layer neural networks to construct complex features.

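Continuing the sketch above, stacking amounts to function composition: the
outputs of one layer of nodes become the inputs of the next. The layer sizes and
weights below are arbitrary.

```python
import numpy as np

def layer_output(inputs, weights, biases):
    """One layer of nodes: a weighted sum per node, then a sigmoid."""
    return 1.0 / (1.0 + np.exp(-(weights @ inputs + biases)))

rng = np.random.default_rng(0)
x = rng.normal(size=4)                         # input vector
w1, b1 = rng.normal(size=(3, 4)), np.zeros(3)  # first (hidden) layer
w2, b2 = rng.normal(size=(2, 3)), np.zeros(2)  # second layer

hidden = layer_output(x, w1, b1)       # outputs of the first layer...
output = layer_output(hidden, w2, b2)  # ...serve as inputs to the second
print(output)
```
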
We also identify a class of algorithms that we term "shallow learning"
approaches. We do not use this as a pejorative term, but instead to denote
algorithms that have all of the hallmarks of deep approaches except that they
employ networks of limited depth. We found it valuable to include these as we
sought to identify the current contributions of deep learning and to predict its
future impact. Researchers may employ these shallow learning methods for a
number of reasons, including: 1) shallow networks provide a degree of
interpretability that better matches their use case; 2) the available data are
insufficient to support deeper architectures, though new datasets that will
support deep methods are expected; or 3) as building blocks to be combined with
other non-neural-network-based approaches at subsequent stages.
[This section needs citations]

Deep learning is a collection of new techniques that together have recently
demonstrated breakthrough gains over existing approaches in several fields.

Deep learning is built on a very old idea, neural networks, that was first
proposed in 1943 [@doi:10.1007/BF02478259] as a model for how biological brains
process information. Since then, interest in neural networks as computational
models has waxed and waned several times. This history is interesting in its own
right [@doi:10.1103/RevModPhys.34.135], but in recent years, with the advances
of deep learning, attention has shifted back.

Several important advances have made the current surge of work in this area
possible.

First, several easy-to-use deep learning frameworks (TensorFlow, Caffe, Theano)
now enable a much broader range of scientists to build and train complicated
networks. In the past, neural networks required very specialized knowledge to
build and modify, including the ability to robustly code derivatives of matrix
expressions. Errors here are often subtle and difficult to detect, so it could
be very difficult to tailor networks to specific problems without substantial
experience and training. Now, however, with these frameworks, even very complex
neural networks are automatically differentiated, and high-level scripting
instructions can transparently run very efficiently on GPUs. The technology has
progressed to the point that even algorithms can be differentiated [cite neural
stack and neural memory papers].

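As a sketch of what these frameworks automate, the snippet below uses
TensorFlow's automatic differentiation (the modern `tf.GradientTape` interface;
the toy loss is our own) to obtain the gradient of a matrix expression without
any hand-coded derivatives.

```python
import tensorflow as tf

W = tf.Variable(tf.random.normal((3, 3)))  # weights to differentiate
x = tf.constant([[1.0], [2.0], [3.0]])

with tf.GradientTape() as tape:
    y = tf.matmul(W, x)            # a matrix expression
    loss = tf.reduce_sum(y ** 2)   # a toy scalar loss

grad = tape.gradient(loss, W)      # d(loss)/dW, derived automatically
print(grad)
```
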
Second, key technical insights have been uncovered that guide the construction
of much more complicated networks than previously possible. In the past, most
neural networks included just a single hidden layer: a network with an arbitrary
number of hidden nodes, but just a single layer, can in principle learn
arbitrarily complex functions, and networks with more than one hidden layer
(deep networks) were hard to train. However, it turns out that deep networks can
more efficiently represent many tasks when they are built to mirror the
underlying structure of the data. Moreover, deep networks are more robust and
trainable when employing several architectural innovations: weight replication,
better-behaved non-linearities like rectified-linear units, residual networks,
better weight initialization, and persistent memory. Likewise, the central role
of dimensionality reduction as a strength of neural networks has been
elucidated, and this has motivated designs built to capitalize on this strength
[cite autoencoders and word2vec].

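To give a flavor of one of these innovations, here is a minimal Keras sketch of
a residual connection built from rectified-linear units; the layer sizes are
arbitrary and the block is illustrative, not a prescribed design.

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, units):
    """A residual connection: the block learns a correction that is
    added back onto its input, which makes deeper networks trainable."""
    h = layers.Dense(units, activation="relu")(x)  # rectified-linear unit
    h = layers.Dense(units)(h)
    return layers.Activation("relu")(layers.Add()([x, h]))  # skip connection

inputs = tf.keras.Input(shape=(64,))
outputs = residual_block(inputs, units=64)
model = tf.keras.Model(inputs, outputs)
model.summary()
```
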
Third, several advances in training algorithms have enabled applications of
neural networks in ways not obviously possible. The number of training
strategies for deep learning is growing rapidly, and a complete review is beyond
our scope, but these algorithms can train networks in domains where earlier
algorithms struggled. For example, newer optimizers can very efficiently learn
using batched training, where only a portion of the data needs to be processed
at a time. These approaches can effectively optimize very large weight vectors
in which many weights are only rarely updated. Noise-contrastive estimation has
proven particularly useful in modeling language. Reinforcement learning has
enabled neural networks to learn how to play games like chess, Go, and poker.
Curriculum learning enables networks to gradually build up expertise to solve
particularly challenging algorithmic problems. Dropout nodes and layers make
networks much more robust, even when the number of weights is dramatically
increased.

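As an illustrative sketch of two of these ideas together, batched training and
dropout, the Keras snippet below fits a small network on synthetic data; the
dataset, layer sizes, and hyperparameters are all hypothetical.

```python
import numpy as np
import tensorflow as tf

# Synthetic data, purely for illustration.
X = np.random.rand(1000, 20).astype("float32")
y = (X.sum(axis=1) > 10).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dropout(0.5),  # randomly silences half the nodes per step
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Adam is one example of a newer optimizer suited to batched updates.
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X, y, batch_size=32, epochs=5)  # 32 examples per weight update
```
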
Fourth, the convergence of these factors currently makes deep learning extremely
adaptable and capable of addressing the nuanced differences of each domain to
which it is applied. Many of the advances in deep learning were first developed
for image analysis and text analysis, but the lessons and techniques learned
there enable the construction of very powerful models specifically suited to the
challenges presented by each unique domain.

### Deep learning in scientific inquiry

Given the intrinsic flexibility of deep learning, it is important to consider
the values and goals that matter most in scientific inquiry.

First, in scientific contexts understanding the patterns in the data may be just
as important as fitting the data. For this reason, interpretability can be more
important here than in other domains. Scientific work often aims to understand
the underlying principles behind the data we see, and architectures and
techniques that expose the non-obvious patterns in the data are particularly
important and a very active area of research [cite examples from all sections].

Second, there are important and pressing questions about how to build networks
that can efficiently represent the underlying logic of the data. This concern of
"representability" is important because it gives insight into the structure of
scientific data and, when understood, can guide the design of more efficient and
effective networks. For example, one particularly important study demonstrated
that a simple neural network can very efficiently and accurately model (i.e.,
represent) a theoretically important quantum mechanical system
[@doi:10.1126/science.aag2302].

Third, science is full of domain expertise, where there are deep traditions of
thought stretching back decades and even centuries. Deep learning will always be
in dialogue with this expertise: to understand the key problems, encode the most
salient prior knowledge, and understand how to judge success or failure. There
is a great deal of excitement about deep learning, but in most scientific
corners careful thought needs to be put into bringing deep learning alongside
existing experts and efforts.

Fourth, data availability and complexity are unevenly distributed across
science. Some areas, like genomics and particle physics, are swamped in
petabytes and exabytes of high-quality data. Others, like chemistry, are
comparatively data poor but have well-developed, domain-specific, and effective
algorithms. These differences become consequential and define the most
successful approaches. For example, the convergence of lower amounts of data and
important nuances of the domain might favor lower-parameter networks that
incorporate domain-specific knowledge and fuse data of multiple different types.
This flexibility, remember, is one of the most striking strengths of neural
networks. In the long run, it is an open question what the most effective
strategies will be, but in this time of creative experimentation, optimism is
justified.

None of these scientific concerns should dampen enthusiasm about deep learning.
Rather, because of the approach's flexibility, there is good reason to believe
that carefully designed networks might enable important scientific advances.

### Will deep learning transform the study of human disease?