Incorporated changes from my proofread of the study section #493
Conversation
I'll review this Monday morning
LGTM, some typos and suggestions
sections/04_study.md
Outdated
@@ -56,7 +56,7 @@ approaches applied to gene expression data are powerful methods for
identifying gene signatures that may otherwise be overlooked.
An additional benefit of unsupervised approaches is that
ground truth labels, which are often difficult to acquire or are incorrect, are
nonessential. However, careful interpretation must be performed regarding how
nonessential. However, careful interpretation must be performed when
"careful interpretation must be performed" sounds way awkward to me. "interpretation must be careful when"?
"the genes that have been aggregated into features must be interpreted carefully"?
reworded 👍
its links to complex disease, which will lead to novel diagnostics and
therapeutics.
therapies to correct splicing defects. However, to achieve this we expect that
methods to interpret the "black box" of deep neural networks and integrate
"integrate this with"?
I like it as is. The "integrate" refers to multiple data sources.
sections/04_study.md
Outdated
would be very time consuming in a lab setting but was easy to simulate using
their model. As we learn to better visualize and analyze the hidden nodes within
base pairs in a sequence and see how the model changed its prediction. Though
time consuming to assay in a lab, this was easy to simulate the computational
"this was easy to simulate the computational": word missing?
👍
sections/04_study.md
Outdated
million base pairs upstream or downstream from the affected promoter, on either
strand, even within the introns of other genes [@doi:10.1038/nrg3458]. They do
million base pairs upstream or downstream from the affected promoter on either
strand even within the introns of other genes [@doi:10.1038/nrg3458]. They do
"and even" / "or even"
👍
sections/04_study.md
Outdated
insights.

### Single-cell data

Single-cell methods are generating extreme excitement as biologists recognize
Single-cell methods are generating excitement as biologists recognize
the vast heterogeneity within unicellular species and between cells of the same
tissue type in the same organism [@tag:Gawad2016_singlecell]. For instance,
tumor cells and neurons can both harbor extensive somatic variation
[@tag:Lodato2015_neurons]. Understanding single-cell diversity in all its
dimensions — genetic, epigenetic, transcriptomic, proteomic, morphologic, and |
long dash or double dash? or will either work?
went with double dash. I think that's what we've done elsewhere 👍
sections/04_study.md
Outdated
specific individual, but also to specific pathological subsets of cells.
Single-cell methods also promise to uncover a wealth of new biological
knowledge. A sufficiently large population of single cells will have enough
representative "snapshots" to recreate timelines of dynamic biological processes.
If tracking processes over time is not the limiting factor, single-cell
techniques can provide maximal resolution compared to averaging across all cells
in bulk tissue, enabling the study of transcriptional bursting with single-cell
FISH or the heterogeneity of epigenetic patterns with single-cell Hi-C or
fluorescence in situ hybridization or the heterogeneity of epigenetic patterns with single-cell Hi-C or
italicise in situ?
Agreed, "in situ"
👍
sections/04_study.md
Outdated
@@ -586,23 +576,23 @@ for dealing with batch effects [@tag:Shaham2016_batch_effects].

Examining populations of single cells can reveal biologically meaningful subsets
of cells as well as their underlying gene regulatory networks
[@tag:Gaublomme2015_th17]. Unfortunately, machine learning generally struggles
[@tag:Gaublomme2015_th17]. Unfortunately, machine learning methods generally struggle
with imbalanced data — when there are many more examples of class 1 than class 2 — |
Looks like a single hyphen not a long dash. Suggest this could all be cleaned up near end with a simple search and replace.
👍
sections/04_study.md
Outdated
[@tag:Abe]. Then, researchers began to use techniques that could estimate
relative abundances from an entire sample, which is much faster than classifying
relative abundances from an entire sample more quickly than classifying
I think "faster" reads better than "more quickly"
👍
sections/04_study.md
Outdated
[@tag:Word2Vec] in natural language processing) for protein family
classification have been introduced and classified with a skip-gram neural
network [@tag:Asgari]. Recurrent neural networks show good performance for
homology and protein family identification [@tag:Hochreiter @tag:Sonderby].
Interestingly, Hochreiter, who invented Long Short Term Memory (LSTM), delved
into homology/protein family classification in 2007, and therefore, deep
learning is deeply rooted in functional classification methods.

One of the first techniques of *de novo* genome binning used self-organizing
maps, a type of neural network [@tag:Abe]. Essinger et al. used Adaptive Resonance Theory
Shift citation to just after Essinger et al.?
👍
Only minor comments from me and @agapow, then looks good to me.
outperforming logistic regression and distance-based outlier detection methods.
However, they did not benchmark against random forests, which tend to work better
for imbalanced data, and their data was
relatively low dimensional. Future work is needed to establish the utility of
In light of #495, I don't see how improvements in image classification tell us anything about cell subset identification. Can we stop the sentence after "cell subset identification."?
This build is based on c0cbf63. This commit was created by the following Travis CI build and job: https://travis-ci.org/greenelab/deep-review/builds/234828498 https://travis-ci.org/greenelab/deep-review/jobs/234828499 [ci skip] The full commit message that triggered this build is copied below: Incorporated changes from my proofread of the study section (#493) * initial proofreads up to metagenomics * finish proofread * address comments * address build failure
No description provided.