Index splitting and specifying index for kb count #18

Open
retropc66 opened this issue Apr 29, 2022 · 0 comments

Hi Basil,

I'm trying to generate a loom file for RNA velocity using kb-python using the method you describe in your tutorial.

While writing up a detailed description of the problems I was having with kb count after generating a new reference from GENCODE vM25, I found a new troubleshooting option and the solution to my problem.

I ran the following command to build the reference; fasta.fa and genes.gtf are gunzipped copies of the GENCODE vM25 reference FASTA and GTF files:

kb ref -i indeces/index.idx -g t2g.txt -f1 cdna.fa -f2 intron.fa -c1 cdna_t2c.txt -c2 intron_t2c.txt --workflow lamanno -n 4 /lscratch/slurm-job-5124079/fasta.fa /lscratch/slurm-job-5124079/genes.gtf

Running this command generates a warning that the index-splitting (-n) flag will be deprecated in the next major release. That led me to check the kb-python release notes back to v0.25.0, where index splitting was introduced.

Using -n 4 in the kb ref command generates four index files in the indeces directory:

  • index.idx_cdna
  • index.idx_intron.0
  • index.idx_intron.1
  • index.idx_intron.2

The v0.25.0 release notes state:

When -n is used the built indices must be passed in as a comma-delimited list to kb count

I made that change in my kb count command, and it seems to have done the trick. I don't see any loom files yet, though, so I may have to do some further tweaking.
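
For anyone hitting the same problem, here is a minimal sketch of how the four index files listed above can be joined into the comma-delimited list that the release notes describe. The index, t2g, and t2c file names come from the kb ref command above. The technology (10xv2), the output directory (kb_out), and the FASTQ names are hypothetical placeholders. The --loom flag is included on the assumption that it is what asks kb count to write a loom file; check kb count --help for your version. The sketch only prints the command so it can be reviewed before running.

```shell
# Directory and file names as produced by the kb ref command above.
index_dir="indeces"
parts="index.idx_cdna index.idx_intron.0 index.idx_intron.1 index.idx_intron.2"

# Join the split index files into one comma-delimited string.
index_list=""
for part in $parts; do
  # Prepend "<current list>," only when the list is already non-empty.
  index_list="${index_list:+$index_list,}$index_dir/$part"
done

# Print the command instead of running it.
# Placeholders (not from the original report): 10xv2, kb_out, R1/R2 FASTQ names.
# Assumption: --loom requests loom output from kb count.
echo "kb count -i $index_list -g t2g.txt -c1 cdna_t2c.txt -c2 intron_t2c.txt --workflow lamanno -x 10xv2 -o kb_out --loom R1.fastq.gz R2.fastq.gz"
```

Building the list in a loop rather than typing it by hand keeps the -i argument in step with whatever number of pieces -n produced.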

I'd suggest a couple of updates to your tutorial: (1) remove index splitting from the kb ref command (or note that it will be deprecated), and/or (2) clarify how the index file(s) should be specified in the kb count command. Your kb count command has -i transcriptome.idx, which doesn't match the indices generated by the kb ref command two lines above it.

Thanks,

Chris
