
Initial draft of splicing #288

Merged
merged 12 commits into greenelab:master on Apr 8, 2017
Conversation

@bdo311 (Contributor) commented Apr 4, 2017

Would welcome comments and suggestions! I think it's also a little long.

Used papers #238, #7

I can't seem to request a reviewer from the right sidebar... @agitter @cgreene

@agitter (Collaborator) commented Apr 4, 2017

Thanks! @cgreene @gwaygenomics or I will check it out and provide comments. I'm not worried about the length.

I did see that the Travis CI build failed and reported a problem with @tag:Xiong2011_bayesian. There may be some leftover text in the tags.tsv file from a merge as well.

@agitter (Collaborator) commented Apr 7, 2017

@bdo311 I fixed the problems with the references and will have time to give more feedback today or tomorrow.

@agitter (Collaborator) left a review:

This section is very well written. We could merge as-is. I did leave a couple of questions, but they are minor.

#4 was listed in our issues but not included here. That is perfectly fine with me if we have nothing new to say about it, but I wanted to point it out.

interact in complex, incompletely characterized ways. With new tools to
interpret these meta-features, a major focus of current deep learning research,
we will soon have the ability to extract a more nuanced biological understanding
of splicing — perhaps by building different deep neural networks for different
@agitter (Collaborator) commented:
I'm curious about the idea of building different networks for different conditions. Is there something condition-specific that could not be learned from other types of input data? Does the cell line embedding idea used in #258 have any relevance here?

I'm not disagreeing with the sentiment, but I found it thought-provoking and was hoping you could elaborate.

@bdo311 (Contributor, author) replied:

I was thinking that the hidden nodes learned by a network trained on brain data might tell you something about the brain when compared to, for example, hidden nodes learned by a network trained on heart data.

Yeah, #4 and #258 use a similar approach where they encode cell type in the input. One of the benefits of this multi-task learning is that learning is more efficient and that you can exploit similarities (rather than having to learn them individually) -- there are some papers (https://arxiv.org/pdf/1701.08734.pdf) that take this idea to its extreme. The downside is that building a jack-of-all-trades model may reduce its single-task prediction accuracy. Maybe I'll drop in a sentence about this.
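The one-hot condition encoding being discussed could be sketched roughly like this (tissue names and feature values are invented for illustration; this is not the actual encoding from #4 or #258):

```python
import numpy as np

# Hypothetical sketch of the multi-task idea: one shared model sees the
# same sequence features plus a one-hot vector identifying the tissue,
# instead of training a separate network per tissue.
TISSUES = ["brain", "heart", "liver"]

def encode_input(seq_features, tissue):
    """Concatenate sequence-derived features with a one-hot tissue code."""
    one_hot = np.zeros(len(TISSUES))
    one_hot[TISSUES.index(tissue)] = 1.0
    return np.concatenate([seq_features, one_hot])

seq = np.array([0.2, 0.7, 0.1])        # e.g. made-up exon-level features
x_brain = encode_input(seq, "brain")   # same sequence, brain context
x_heart = encode_input(seq, "heart")   # same sequence, heart context
```

A single network trained on such inputs can share what it learns across tissues, while the one-hot code lets it specialize where tissues differ.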

@bdo311 (Contributor, author) replied:

#4 and #7 (at least to my reading) use similar if not identical deep learning networks; they were developed by the same group (Frey's lab at UofT). So I chose just to cite the 2015 Science paper instead.

analysis using SNPs from three genes, it predicted exon skipping with three
times the accuracy of Xiong et al.’s framework. This case is instructive in that
clever sources of data, not just more powerful models, can lead to novel
insights.
@agitter (Collaborator) commented:

What type of model is used in Rosenberg2015? Is it a neural network or something else? Or is it the simple linear model described in the next sentence?

I'm wondering whether the main conclusion is that

  1. the algorithms (neural networks) haven't been the bottleneck in exon skipping prediction but rather the data
  2. better data provides future opportunities to train even better algorithms (e.g. neural networks)
  3. I missed the point entirely.

@bdo311 (Contributor, author) replied:

I feel like it's both. Different/complementary kinds of data provide opportunities to build more informative models, and better algorithms let you do more with integrating more diverse sources of data. I don't know if anyone has a definite answer on this, or whether there should be a definite answer.

@bdo311 (Contributor, author) replied:

The Rosenberg model is a simple linear model.
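For contrast with the deep networks discussed above, a "simple linear model" in this setting could be sketched as ordinary least squares over binary sequence features (purely a hypothetical illustration on simulated data, not Rosenberg et al.'s actual features or pipeline):

```python
import numpy as np

# Hypothetical sketch: a linear model fit by least squares over binary
# sequence features (e.g. k-mer presence/absence). All data are simulated;
# this is not Rosenberg et al.'s model or data.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 50)).astype(float)  # 200 variants x 50 features
true_w = rng.normal(size=50)
y = X @ true_w + rng.normal(scale=0.1, size=200)      # simulated splicing scores

w, *_ = np.linalg.lstsq(X, y, rcond=None)  # fit linear weights
pred = X @ w
mse = float(np.mean((pred - y) ** 2))
```

With enough well-designed training examples, even such a model can be competitive, which is the "clever sources of data" point made in the draft.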

@agitter (Collaborator) commented Apr 8, 2017

Thanks for the responses to my comments. I'm merging this. When we make a second pass for further editing we can see if we want to say anything about the new paper #293.

@bdo311 some of your commits were authored with an email address that is not associated with your GitHub account. I think it should be okay, and I'll fix authorship after I squash and merge if needed.

@agitter agitter merged commit 34a5739 into greenelab:master Apr 8, 2017
dhimmel pushed a commit that referenced this pull request Apr 8, 2017
This build is based on 34a5739.

This commit was created by the following Travis CI build and job:
https://travis-ci.org/greenelab/deep-review/builds/220005126
https://travis-ci.org/greenelab/deep-review/jobs/220005127

[ci skip]

The full commit message that triggered this build is copied below:

Initial draft of splicing (#288)

* initial draft, merged correctly

* Fix DOI in tags

* added discussion about one-hot tissue encoding; other minor edits

* Remove duplicate tag
@agitter (Collaborator) commented Apr 8, 2017

Confirming that the squashed commit was correctly attributed to @bdo311
