
Size of the *_report.tsv files vary a lot across different samples #312

Open
binbinZhao2017 opened this issue Sep 26, 2024 · 1 comment

@binbinZhao2017

Thanks for the wonderful tools.
I ran the tool on our bulk tumor RNA-seq dataset and found that the sizes of the *_report.tsv files vary a lot. Some samples have many more TCRs (in both distinct types and read counts), while others have only 1-2. I know this may be because those samples contain fewer T cells, but there are other possible explanations, for example that those samples have fewer reads overall. I ran some deconvolution tools on the same data to estimate the immune cell type fractions and found that samples with almost the same T cell fraction can still differ in their TCR read counts and types.
So my question is: do you apply any QC before analyzing the TCRs and BCRs, so that samples with too few reads do not proceed to the next step? Or do you have any suggestions for making the different samples comparable?
Thanks.

@mourisl
Collaborator

mourisl commented Sep 27, 2024

It depends on the application. For example, if you are focusing on tracking clonotypes, even a sample with only a few clonotypes can be informative. If you are calculating diversity, a sample with only 1-2 TCRs may make the estimate highly skewed. In that case, I usually consider only samples with at least 10 distinct clonotypes (other thresholds should also work).
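
A minimal sketch of that filter, in case it helps. It assumes each *_report.tsv is a tab-separated file with a header row and one row per clonotype, and that a column (hypothetically named `CDR3nt` here) identifies the clonotype; adjust `key_column` to match your files. The function names and the toy sequences are made up for illustration.

```python
import csv
import io

# Threshold mentioned above; other values should also work.
MIN_CLONOTYPES = 10


def count_distinct_clonotypes(tsv_text, key_column="CDR3nt"):
    """Count distinct non-empty values of key_column in a tab-separated report.

    key_column is an assumed column name, not confirmed by this thread.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return len({row[key_column] for row in reader if row.get(key_column)})


def samples_for_diversity(reports, min_clonotypes=MIN_CLONOTYPES):
    """Return names of samples with at least min_clonotypes distinct clonotypes.

    reports maps a sample name to the text of its report file.
    """
    return [
        name
        for name, text in reports.items()
        if count_distinct_clonotypes(text) >= min_clonotypes
    ]


def make_toy_report(n):
    """Build a fake report with n distinct placeholder clonotypes."""
    lines = ["count\tCDR3nt"]
    lines += [f"{i + 1}\tSEQ{i}" for i in range(n)]
    return "\n".join(lines) + "\n"


reports = {"sample_A": make_toy_report(12), "sample_B": make_toy_report(2)}
print(samples_for_diversity(reports))  # ['sample_A']
```

With real data you would read each file's text (for example with `pathlib.Path(path).read_text()`) into the `reports` dictionary. Samples that fail the threshold are only dropped from the diversity calculation; they can still be used for clonotype tracking.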
