"OverflowError: FASTA/FASTQ record does not fit into buffer" when trimming ONT reads #783
Comments
Hi, that’s interesting. By default, the largest FASTQ record may have 4 million bytes. This includes the quality values, so the maximum read length is about 2 Mbp. I thought this was enough ... There is actually a hidden (and I believe undocumented) command-line option |
Ah, fantastic! I had found the corresponding line in your code and was about to edit it, but this is much more convenient. Reads of a few megabases are not rare with the ultra-long protocols, so it might be good to eventually increase the default for this buffer. I think a maximum read size of ~8 megabases should be pretty safe. Thanks a lot!
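For reference, the arithmetic behind the "about 2 Mbp" figure and the ~8 megabase suggestion can be sketched as follows. This is only a back-of-the-envelope estimate, not code from cutadapt or dnaio: the function names and the 100-byte header length are assumptions made for illustration, and the 4,000,000-byte figure is taken from the comment above rather than from the source code.

```python
def min_buffer_size(read_length: int, header_length: int = 100) -> int:
    """Bytes needed to hold one uncompressed four-line FASTQ record.

    header_length is a hypothetical typical header size (without the "@").
    """
    header_line = 1 + header_length + 1   # "@" + header + newline
    sequence_line = read_length + 1       # bases + newline
    plus_line = 2                         # "+" + newline
    quality_line = read_length + 1        # one quality character per base + newline
    return header_line + sequence_line + plus_line + quality_line


def max_read_length(buffer_size: int, header_length: int = 100) -> int:
    """Longest read whose complete record still fits into buffer_size bytes."""
    overhead = min_buffer_size(0, header_length)
    # Sequence and qualities each take one byte per base, hence the division by 2.
    return (buffer_size - overhead) // 2


if __name__ == "__main__":
    print(max_read_length(4_000_000))   # roughly 2 million bases
    print(min_buffer_size(8_000_000))   # roughly 16 million bytes
```

Under these assumptions, supporting reads of ~8 megabases would need a buffer of a little over 16 million bytes.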
I can confirm that |
Awesome! Let me re-open this until I’ve found a more permanent solution. Maybe I can make the buffer size dynamic or so.
You could try the following pattern:

    while True:
        try:
            for chunk in dnaio.read_chunks(files[0], self.buffer_size):
                pass
        except OverflowError:
            # The record did not fit: double the buffer and start over.
            self.buffer_size *= 2
            logging.warning("Keep some RAM sticks at the ready!")
            continue
        else:
            break  # or return to escape the loop
The strategy is good, but just ignoring the exception and re-trying will lose the contents of the buffer. This would have to be done within |
Whoops, you are right. I incorrectly assumed blocks were passed rather than files.
Hi @marcelm
I'm using cutadapt 4.4 with Python 3.10.12 and I'm running into this error when trimming the ultra long ULK114 adapters from a specific ONT PromethION flow cell. I'm wondering whether it is related to the file containing a few megabase-sized reads.
This is a description of the content of the file:
This is the command:
This is the output:
Many thanks!