TRUST4 config for 10X Genomics #272

Open
khoidnyds opened this issue May 20, 2024 · 6 comments

Comments

@khoidnyds

khoidnyds commented May 20, 2024

Hello,
I have two 10X Genomics datasets (BCR and GEX). What are the appropriate parameter settings for TRUST4?
I'm using
f"run-trust4 -1 {f1} -2 {f2} --barcode {f1} --UMI {f1} --readFormat bc:0:15,um:16:25,r1:26:-1 --barcodeWhitelist {whitelist_barcodes} --od {out_dir} -f {hg38_bcrtcr_path} --ref {human_IMGT_path} -t 24 --repseq" for BCR
and
f"run-trust4 -1 {f1} -2 {f2} --barcode {f1} --UMI {f1} --readFormat bc:0:15,um:16:25,r1:26:-1 --barcodeWhitelist {whitelist_barcodes} --od {out_dir} -f {hg38_bcrtcr_path} --ref {human_IMGT_path} -t 24" for GEX
whitelist_barcodes = cellranger-8.0.0/lib/python/cellranger/barcodes/3M-5pgex-jan-2023.txt

but TRUST4 reports far fewer VDJs than Cell Ranger. I have also attached the first few lines of {f1} and {f2}, and the index {f1}, here.
Thank you
Screenshot 2024-05-14 at 11 31 06 AM

@mourisl
Collaborator

mourisl commented May 20, 2024

Which version of TRUST4 are you using? If it is v1.1.1, you can run TRUST4 on the BCR-seq portion without the "--repseq" option. How many reads in the toassemble_bc.fq file have the "missing_barcode" value? This is useful for checking whether the barcode portion is being extracted correctly. Thank you.
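A minimal sketch of the check suggested above, counting how many records carry "missing_barcode". The function name is hypothetical, and the record layout is an assumption: the thread does not say whether the value sits in the header or the sequence line, so the sketch searches every line of a record and assumes records are not line-wrapped.

```python
from itertools import islice


def missing_barcode_fraction(lines):
    """Return (missing, total) record counts for FASTA/FASTQ text lines.

    Assumptions (not confirmed in the thread): records are unwrapped,
    i.e. 2 lines per FASTA record or 4 lines per FASTQ record, and the
    literal string "missing_barcode" appears somewhere in the record.
    """
    stream = (ln.rstrip("\n") for ln in lines if ln.strip())
    first = next(stream, None)
    if first is None:
        return 0, 0

    # '@' on the first line means FASTQ; anything else is treated as FASTA.
    per_record = 4 if first.startswith("@") else 2

    missing = 0
    total = 0
    record = [first] + list(islice(stream, per_record - 1))
    while record:
        total += 1
        if any("missing_barcode" in field for field in record):
            missing += 1
        record = list(islice(stream, per_record))
    return missing, total
```

To run it on the real file, pass an open text handle, for example `missing_barcode_fraction(open(path))`, where `path` points at the toassemble_bc file in the output directory.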

@khoidnyds
Author

khoidnyds commented May 20, 2024

I'm using TRUST4 v1.1.1-r505.
I checked the toassemble_bc.fa file, and most of the reads have missing_barcode values. Shouldn't the barcode always be the first 16 bases of R1?
My datasets have mixed read lengths: one set has a 26 bp R1 and a 91 bp R2; another has R1 = R2 = 151 bp. What do you recommend for the --readFormat parameter? Thank you
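To make the read-length question concrete, here is a small sketch that works out how many bases each field of the --readFormat string from the original post would select on a read of a given length. The helper names are hypothetical, and the coordinate convention is an assumption inferred from the thread (0-based, inclusive end, -1 meaning "to the end of the read", so that bc:0:15 covers 16 bases); it is not taken from TRUST4's code.

```python
def parse_read_format(fmt):
    """Parse 'bc:0:15,um:16:25,r1:26:-1' into {'bc': (0, 15), ...}."""
    fields = {}
    for part in fmt.split(","):
        name, start, end = part.split(":")
        fields[name] = (int(start), int(end))
    return fields


def extract(seq, start, end):
    """Slice one field out of a read.

    Assumption: 0-based coordinates with an inclusive end,
    and end == -1 meaning "through the last base".
    """
    return seq[start:] if end == -1 else seq[start:end + 1]


def field_lengths(fmt, read_length):
    """Number of bases each field would select on a read of this length."""
    read = "N" * read_length
    return {
        name: len(extract(read, start, end))
        for name, (start, end) in parse_read_format(fmt).items()
    }


fmt = "bc:0:15,um:16:25,r1:26:-1"
print(field_lengths(fmt, 26))   # the 26 bp R1 dataset
print(field_lengths(fmt, 151))  # the 151 bp R1 dataset
```

Under this reading, the 26 bp R1 is used up entirely by the barcode and UMI, leaving no bases for r1, while the 151 bp R1 leaves 125 bases after position 26.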

@khoidnyds
Author

Also, should I include the UMI information when running TRUST4, or just ignore it?

@mourisl
Collaborator

mourisl commented May 20, 2024

The UMI information has little effect on the final results, so you can include it. Do you see the same fraction of "missing_barcode" reads on the GEX side? If you have some of the Cell Ranger VDJ results, you can also check whether the barcode resides in the first 16 bp of R1.
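One way to run that check, as a rough sketch: take barcodes from the Cell Ranger VDJ output and measure what fraction of R1 reads begin with one of them. The function names are hypothetical, and the sketch assumes Cell Ranger barcodes may carry a "-1" style suffix that has to be stripped before comparison.

```python
from itertools import islice


def fastq_sequences(lines, limit=None):
    """Yield the sequence line of each 4-line FASTQ record (at most `limit`)."""
    seqs = (ln.strip() for i, ln in enumerate(lines) if i % 4 == 1)
    return islice(seqs, limit)


def barcode_prefix_hit_rate(r1_sequences, known_barcodes, start=0, length=16):
    """Fraction of reads whose bases [start, start + length) match a known barcode."""
    # Drop a trailing GEM-well suffix such as "-1" if present.
    known = {bc.split("-")[0] for bc in known_barcodes}
    hits = 0
    total = 0
    for seq in r1_sequences:
        total += 1
        if seq[start:start + length] in known:
            hits += 1
    return hits / total if total else 0.0
```

For a gzipped R1 file, something like `barcode_prefix_hit_rate(fastq_sequences(gzip.open(r1_path, "rt"), limit=100000), barcodes)` would sample the first 100,000 reads. A hit rate near zero would suggest the barcode is not where the --readFormat string expects it.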

@khoidnyds
Author

khoidnyds commented May 22, 2024

Hi mourisl,
Thanks for helping.
I checked the cell_id values in _barcode_airr.tsv. Some are located in the first 16 bp of R1 (as expected), but some don't appear anywhere in R1.fastq.gz. I can't explain what happened here. Could you give me some hints?
Thanks

@mourisl
Collaborator

mourisl commented May 22, 2024

Those could be error-corrected barcodes, which would not match the raw sequence in R1 exactly.
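A hedged way to test this explanation: for each reported barcode that is absent from R1, look for an observed R1 prefix that differs from it by a single base. The thread does not describe how the correction is actually performed, so the single-substitution model and the function names below are assumptions for illustration only.

```python
def one_mismatch_neighbors(barcode, alphabet="ACGTN"):
    """All sequences that differ from `barcode` at exactly one position."""
    neighbors = set()
    for i, base in enumerate(barcode):
        for alt in alphabet:
            if alt != base:
                neighbors.add(barcode[:i] + alt + barcode[i + 1:])
    return neighbors


def explain_barcode(barcode, observed_prefixes):
    """Classify a reported barcode against the raw 16 bp prefixes seen in R1.

    Returns "exact", "one_mismatch", or "not_found".
    Assumption: a corrected barcode sits one substitution away from a raw one.
    """
    observed = set(observed_prefixes)
    if barcode in observed:
        return "exact"
    if one_mismatch_neighbors(barcode) & observed:
        return "one_mismatch"
    return "not_found"
```

If most of the unexplained cell_id values come back as "one_mismatch", that is consistent with barcode error correction; many "not_found" results would point to something else.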
