File _rgdict.avro does not exist #1150
Comments
Hi @ooliynyk! The
Hi @fnothaft - Thank you for your reply. I'm working with @ooliynyk on this project. We have installed ADAM as part of Sequencing.com's Altruist Database, a free, open-data initiative. The database contains human genome VCF files as well as those files converted to ADAM format.

Our goal is to let users of the Altruist Database apply ADAM to any type of analysis they want to perform on one or more genotypic files within the database. For example, they can choose to analyze all Altruist records that are female, or all Altruist records that are carriers of a Cystic Fibrosis variant in the CFTR gene.

Using the Altruist UI (which is still in development), users will be able to analyze data within the Altruist Database by uploading or entering their own commands, by programming ADAM to their own specs, or by using the commands and programs created and shared by other users. We were testing the count_kmers command to make sure it worked on our dataset in case a user entered it.

I hope this info is helpful. Any advice and guidance you can provide will be much appreciated, so that we can make sure ADAM can be used to its full potential.
Very interesting use case, @ooliynyk @BrandonColbyMD! As @fnothaft mentioned, What do you think?
Hi @heuermh and @fnothaft - I wanted to follow up on @plexteq's question above. We are in the process of implementing operations that users can run on a user-defined subset of ADAM files in the Altruist Database. All ADAM files are converted from gVCF files using vcf2adam, so they don't come from Avro files. Please let us know which operations you recommend for analyzing one or more ADAM files within the Altruist Database. Thank you!
@BrandonColbyMD ADAM is kind of like the Swiss Army knife for getting data in traditional bioinformatics file formats ready for analysis on Spark; most of the interesting bits can be found in downstream repositories or in workbooks. I'll let others chime in with specific examples.
Closing as this was a version change issue.
I converted a VCF file to ADAM format using the command:
# adam-submit vcf2adam file:///tmp/A7VAGPU.vcf.gz file:///tmp/a7.adam
When I tried to run
# adam-submit count_kmers file:///tmp/a7.adam/ file:///tmp/kmers.adam 10
I got an error:

Files in the .adam directory:
A7VAGPU.vcf.gz
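
The error in the issue title names `_rgdict.avro`, so when reporting a problem like this it can help to check whether that file is actually present in the output directory. Below is a minimal Python sketch of such a check. The helper name `missing_files` and its `expected` parameter are hypothetical and not part of ADAM; the only file name taken from this issue is `_rgdict.avro`, and which files a given ADAM command really requires is not something this sketch knows.

```python
from pathlib import Path


def missing_files(adam_dir, expected=("_rgdict.avro",)):
    """Return the names in `expected` that are not regular files directly inside adam_dir.

    `expected` defaults to the single file named in the issue title; extend it
    only with names you have confirmed for your ADAM version (assumption).
    """
    root = Path(adam_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Path taken from the commands in this issue; adjust for your setup.
    print(missing_files("/tmp/a7.adam"))
```

An empty list means every expected file was found; a non-empty list gives the names to mention in the bug report. A directory that does not exist is treated the same as one where every expected file is missing.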