
Lowered barcode recognition of bonito basecalled data with Bonito 0.4.0 #176

Open
menickname opened this issue Aug 30, 2021 · 9 comments

@menickname

Dear @iiSeymour

I am experiencing a similar issue to the one described earlier in issue #26. After training a new model with the Bonito 0.4.0 software, demultiplexing with qcat assigns only 40-70% of the reads to the correct barcode. I am using a subset of my dataset that was already selected for a single barcode within the initial fast5 files, so I would expect a significantly higher proportion (>80%) of reads assigned to that barcode.

Thank you in advance.
Regards,
Nick

@menickname
Author

Dear @iiSeymour

Any update on this yet?

Thanks a lot!

@iiSeymour iiSeymour self-assigned this Sep 14, 2021
@iiSeymour
Member

Hey @menickname

See #175 - can you try with --ctc-min-coverage 0.99? Filtering out any lower-quality reads should also help.
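A minimal sketch of the quality filtering suggested above, assuming plain 4-line FASTQ records and a Phred+33 quality encoding (the threshold of 9 is an arbitrary example, not a value from this thread):

```python
import math

def mean_quality(qual: str) -> float:
    """Mean Phred quality, averaged in error-probability space
    (a plain arithmetic mean of Phred scores overestimates quality)."""
    probs = [10 ** ((ord(c) - 33) / -10) for c in qual]
    return -10 * math.log10(sum(probs) / len(probs))

def filter_fastq(lines, min_q=9.0):
    """Yield 4-line FASTQ records whose mean read quality is >= min_q."""
    for i in range(0, len(lines) - 3, 4):
        record = lines[i:i + 4]
        if mean_quality(record[3].strip()) >= min_q:
            yield record
```

In practice, tools such as NanoFilt or filtlong do this (and more) off the shelf.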

@menickname
Author

Dear @iiSeymour

Unfortunately, this does not result in better demultiplexing. Only the highest-quality and longest reads were used for model training. One of my datasets has only 33.84% of the reads classified (versus 70-80% for the original Guppy-basecalled dataset), with or without the --ctc-min-coverage 0.99 option. The issue does not seem to be solved this way.

I am also surprised that Bonito basecalling would reduce the number of high-quality reads. I validated my model on separate (single-isolate, non-multiplexed) files by generating more accurate genomes of my species of interest, and it gave a significant improvement, so I am rather surprised this is happening during demultiplexing.

Any other thoughts?
Thank you in advance.

@mbhall88

mbhall88 commented May 3, 2022

I have seen a similar issue when using a bonito-trained model with guppy. I lose a HUGE number of reads to the dreaded "unclassified" bin.

Have you managed to find any way of recovering these lost reads @menickname?

@CWYuan08

Hi @mbhall88 and @menickname, I am experiencing exactly the same problem. Do you have any update on this issue? Any progress/experience will be greatly appreciated. Thank you very much!

@mbhall88

Hi @CWYuan08, sadly no. I tried a lot of different things - e.g., chopping raw signal off the start and end before training - but to no avail.

I basically had to abandon the project, as I couldn't justify losing so many reads to demultiplexing.
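For context, the signal-chopping experiment described above could look like the following sketch (the trim lengths are arbitrary placeholders, not values from this thread):

```python
def trim_signal(signal, trim_start=200, trim_end=200):
    """Drop samples from both ends of a raw nanopore signal, e.g. to
    exclude adapter/barcode regions before building a CTC training set.
    Returns an empty slice if the read is too short to trim."""
    end = len(signal) - trim_end
    return signal[trim_start:end] if end > trim_start else signal[:0]
```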

@CWYuan08

Thank you @mbhall88 for sharing your update, sorry to hear you had to stop there.

@menickname
Author

Hi @CWYuan08 and @mbhall88, I have indeed not found a solution for the Bonito demultiplexing itself. To still make use of Bonito, I use demultiplexed files from MinKNOW as input. Since we are using a GridION sequencing device, we perform real-time super-accuracy basecalling and demultiplexing with Guppy (within MinKNOW), which generates both fastq and fast5 files per barcode. I then simply use the demultiplexed fast5 files as input for Bonito. Not the most efficient solution, but it lets me keep using Bonito for basecalling with custom models.
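The workaround above could be sketched as a small driver that runs Bonito once per MinKNOW barcode directory (the directory layout and exact command shape are assumptions; adjust to your setup):

```python
from pathlib import Path

def bonito_commands(demux_root, model_dir, out_dir):
    """Build one `bonito basecaller` invocation per barcode directory
    produced by MinKNOW/Guppy live demultiplexing, paired with the
    FASTQ path its stdout should be redirected to."""
    cmds = []
    for barcode_dir in sorted(Path(demux_root).glob("barcode*")):
        fastq = Path(out_dir) / f"{barcode_dir.name}.fastq"
        cmds.append((["bonito", "basecaller", str(model_dir), str(barcode_dir)], fastq))
    return cmds
```

Each (command, output path) pair could then be run with subprocess, redirecting stdout into the per-barcode FASTQ file.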

@fergsc

fergsc commented Jun 2, 2023

Would using an existing model and improving it (--pretrained) for our species of interest be a better strategy?
