-
Notifications
You must be signed in to change notification settings - Fork 198
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new BigBed adapter #944
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't tested BigBed, but BigWig still works fine. Nice catch on the bytesLength.
@@ -97,7 +97,7 @@ return declare([ SeqFeatureStore, DeferredFeaturesMixin, DeferredStatsMixin ], | |||
}, | |||
|
|||
_load: function() { | |||
this._read( 0, 512, lang.hitch( this, function( bytes ) { | |||
this._read( 0, 2000, lang.hitch( this, function( bytes ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be 2048? Not sure it would make any difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bigbed file I tested had a larger header than 512 so I arbitrarily increased it a little. There might be a better value to use but it doesn't hurt unless maybe there was a empty bigwig/bigbed file less than 2000bytes in length
Updated to add ability to render genes encoded in bigbed e.g. gencode Used https://github.com/dasmoth/gtf2bed to convert gencode annotations to bigbed Basically the blockstart/blockend defines exons and thickStart and thickEnd intersection with the blocks defines CDS that makes the gene glyph :) |
8c92c6e
to
e1309ab
Compare
The autosql parsing is now "robustified" and a sample track on the volvox sample data is available |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code-wise, this looks great, just a couple of little nits to pick. I assume you have tested this on a variety of bigbed files?
tests/js_tests/spec/BigBed.spec.js
Outdated
expect( features[edenIndex].get('subfeatures').length ).toEqual( 8 ); | ||
}); | ||
}).call(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the expect() calls above should be in an it()
call, are you sure they are actually running?
have you tested this with larger bigbed files, like human ones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will be awesome to finally get bigbed support!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think expect() is the proper way to run assertions inside of the larger it() calls
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main things I tested the bigbed code on were ENCODE BigBed files (simple box features) and I manually converted gencode GTF files into BigBed (gene features e.g. mRNA with exons and the thickStart/thickEnd attribute representing boundaries of CDS)
I could make those files available somewhere if there's interest but being human data they are fairly large
tests/js_tests/spec/BigBed.spec.js
Outdated
return false; | ||
}); | ||
console.log(edenIndex); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you remove this log statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed :)
Would be easy for you to make `xit` test cases for them, with the download
URL in a comment?
…On Tue, Apr 3, 2018 at 10:09 AM Colin Diesh ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In tests/js_tests/spec/BigBed.spec.js
<#944 (comment)>:
> + expect( features.length ).toEqual( 4 );
+ var edenIndex;
+ array.some( features, function(f,i) {
+ if( f.get('name') == 'EDEN.1' ) {
+ edenIndex = i;
+ return true;
+ }
+ return false;
+ });
+ console.log(edenIndex);
+
+ expect( edenIndex ).toBe( 0 );
+ expect( features[edenIndex].get('subfeatures').length ).toEqual( 8 );
+ });
+ }).call();
+
The main things I tested the bigbed code on were ENCODE BigBed files
(simple box features) and I manually converted gencode GTF files into
BigBed (gene features e.g. mRNA with exons and the thickStart/thickEnd
attribute representing boundaries of CDS)
I could make those files available somewhere if there's interest but being
human data they are fairly large
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#944 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAEgFX2y5Iepk9HPZsbFThrCS-pddVTrks5tk6ytgaJpZM4QxCkH>
.
|
That should be fine to add some xit tests. I can try and find a place to out the test files, maybe jbrowse.org server? |
Eh I think if you just link to wherever out on the web they are, it will be
fine for now. The links will eventually break, but I’m sure they can be
updated then. Won’t block the automation cause they are xit
…On Tue, Apr 3, 2018 at 6:34 PM Colin Diesh ***@***.***> wrote:
That should be fine to add some xit tests. I can try and find a place to
out the test files, maybe jbrowse.org server?
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#944 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAEgFZGIfxPZCK_PzkGO1JRqXkNG7h2nks5tlCMpgaJpZM4QxCkH>
.
|
Found some existing gencode bigbed on trackhubregistry that could work. Example ftp://ngs.sanger.ac.uk/production/gencode/update_trackhub/mm10/trackDb.txt |
Also, we can't have hardcoded 'mRNA' feature types. Need to have some kind of configuration scheme for that. |
RE: hardcoded feature type I guess if a autoSql contained a "type" field it could set the type of the feature implicitly (in the bed12 it does currently hardcode it mRNA anyways, but that could be reverted to allow otherwise). In the wild (the gencode data for ex) it has geneBioType which doesn't necessarily map cleanly onto sequenceontology types |
Yeah we just need to make sure our track configuration system is powerful enough that people can achieve good results for each specific dataset. For gencode, maybe a user would set something like:
|
@cmdcolin how did you generate docs/tutorial/data_files/volvox.bb? it's got the gencode autoSql schema, which isn't right for what it actually contains |
and i can't get |
I used gtf2bed from https://github.com/dasmoth/gtf2bed and used that on the volvox gtf |
Then i used bedToBigBed with the resulting bed file and the gencode.as in the gtf2bed repo (obviously that was wrong though!) |
Is it actually incorrect though? It seems like there are 17 columns in the bed and 17 in the gencode.as. Is some field mismatching? |
…rt/thickend natively
…n BigBed and feature detail dialogs
Looks like EDEN.1 not showing up in sample data, I think due to an "id collision" |
This PR adds a new class that can access BigBed files.
Luckily, basically all of the logic for parsing BigBed was already there in the bigwig code, so this simply adds a small class that adapts to that (e.g. BigBed doesn't use zoom levels) and fixes a couple things in the existing logic (e.g. "byteLength" instead of "length" for arraybuffer)
Here's a screenshot with a simple bed file
Potentially there are some more intricacies to bigbed that need to be handled but for very simple bed files this sees to works ok