--length-max ignored or toned down? #15

Open
FiniDG opened this issue Dec 4, 2018 · 1 comment

Comments

@FiniDG
FiniDG commented Dec 4, 2018

I have a question. I ran the following two commands:
pbsim --data-type CLR --depth 40 --length-min 500 --length-mean 12000 --length-max 40000 --accuracy-min 0.85 --accuracy-mean 0.87 --model_qc /PBSIM-PacBio-Simulator/data/model_qc_clr /path/to/ref.fasta --prefix /path/to/output/
and
pbsim --data-type CLR --depth 40 --length-min 500 --length-mean 12000 --length-max 70000 --accuracy-min 0.85 --accuracy-mean 0.87 --model_qc /PBSIM-PacBio-Simulator/data/model_qc_clr /path/to/ref.fasta --prefix /path/to/output/

Still, whatever I set --length-max to, my largest read is around 26000.

What is the reason for this? The ref.fasta I used is only 14 Mb in size, so might that be the reason? Or does PBSIM aim for a specific read length distribution?

@FiniDG
Author

FiniDG commented Dec 11, 2018

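After further investigation I think I found the answer. I am putting my thoughts here for future users.
There is also a parameter called --length-sd, which defaults to 2300 for the CLR data type. I believe this is the reason: with a mean of 12000 and a standard deviation of 2300, lengths near 40000 or 70000 lie so far out in the tail that they are very unlikely to be drawn, so --length-max only acts as an upper bound that is rarely reached.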
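
To check whether --length-sd can plausibly explain a longest read of around 26000, here is a rough back-of-the-envelope sketch in Python. It assumes read lengths follow a log-normal distribution with the given mean and standard deviation (an assumption on my part, not confirmed in this thread), ignores the --length-min/--length-max cut-offs, and approximates the number of reads as genome size times depth divided by mean length. The function names are made up for illustration and are not part of PBSIM.

```python
import math
from statistics import NormalDist


def lognormal_params(mean, sd):
    """Convert a mean and sd on the length scale into log-normal mu and sigma."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


def typical_longest_read(mean, sd, n_reads):
    """Length exceeded by about one read in n_reads (the 1 - 1/n quantile).

    Assumes a log-normal length distribution; this is a rough estimate of
    the longest read, not a simulation of what PBSIM actually does.
    """
    mu, sigma = lognormal_params(mean, sd)
    z = NormalDist().inv_cdf(1.0 - 1.0 / n_reads)
    return math.exp(mu + sigma * z)


# Values taken from the commands in this issue.
genome_size = 14_000_000   # "only 14 Mb in size"
depth = 40                 # --depth 40
mean_len = 12_000          # --length-mean 12000

# Approximate read count needed to reach the requested depth.
n_reads = genome_size * depth // mean_len

# 2300 is the CLR default mentioned above; 6000 is an arbitrary larger value.
for sd in (2300, 6000):
    longest = typical_longest_read(mean_len, sd, n_reads)
    print(f"--length-sd {sd}: longest read expected around {longest:.0f}")
```

Under these assumptions the estimate for the default --length-sd of 2300 comes out at roughly 25000 to 26000, which is close to what I observed and well below either --length-max value. A larger --length-sd pushes the estimate up, so raising it is probably the setting to try if longer reads are needed.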
