Update data sets on the MGI FTP site #14
Comments
Hello, I'd be interested in accessing the full data as well, if that is possible, especially since the ERCC data seems heavily downsampled and most comparisons cannot be made due to lack of data.
Hi @ialbert, thanks for your interest. We should be able to do this. @kcotto and @zlskidmore, can we find the original full raw datasets described here: http://genomedata.org/rnaseq-tutorial/testdata/bams/brain_vs_uhr_w_ercc/instrument_data.tsv and place them in a new sub-folder that marks them as the full data without downsampling?
Hi @malachig, thanks for the response. Another possible solution would be to submit the data to SRA (both the subsampled and the complete sets). That way you would not need to distribute or maintain it yourselves. An added benefit is that students could also practice obtaining data (and metadata) from SRA using the command line, a skill that is becoming increasingly important.
This would indeed be a great addition. I think the original data may be lost, but there is a comparable dataset already available in SRA that we could switch over to.
Migrated from rnabio.org proposed improvements page
Use a larger data set that has not been so heavily downsampled
Put the original UHR and HBR instrument data on the FTP site
Put the instrument data for the new hcc1395 data on the FTP site
MGI notes on this data will be here: https://confluence.gsc.wustl.edu/display/CI/Cancer+Informatics+Test+Data