Arabidopsis lyrata
The data set includes 84 runs of A. lyrata, which yields ~45X coverage of the diploid (2n) A. lyrata genome. The library was prepared from ~24 µg of genomic DNA. The sheared DNA was split into two parts, which were size-selected on a BluePippin at 7 kb and 15 kb. The two resulting libraries were run with the P5 chemistry.
Running the data through HGAP and the Celera Assembler yields a diploid assembly of ~353 Mb with an N50 of 252 kb.
The directory "9-terminator" contains the output files from the Celera Assembler. We typically consider asm.ctg.fasta + asm.deg.fasta as the major draft assembly. The asm.deg.fasta file holds the contigs that the Celera Assembler considers degenerate; that is, they are likely to be collapsed repeats.
The directory "Quiver_Polished" contains the final assembly after one round of polishing with the Quiver software (quiver.1.xml). This step should decrease the error rate in the assembly.
The dataset can be downloaded here: http://datasets.pacb.com.s3.amazonaws.com/2014/Arabidopsis-lyrata/list.html