Hello,
I downloaded the FASTQ files for sample GSM4339771 (SRR11181956) from SRA in the original format (https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR11181956), so I ended up with these two files:
C143_R1.fastq.gz.1
C143_R2.fastq.gz.1
I was able to identify the cell barcodes with umi_tools:
umi_tools whitelist --stdin C143_R1_test.fastq.gz \
  --bc-pattern=CCCCCCCCCCCCCCCCNNNNNNNNNN \
  --set-cell-number=100 \
  --log2stderr > whitelist.txt;
However, when I tried the next step, extracting the barcodes and UMIs and adding them to the read names:
umi_tools extract --bc-pattern=CCCCCCCCCCCCCCCCNNNNNNNNNN \
  --stdin C143_R1.fastq.gz \
  --stdout C143_R1_extracted.fastq.gz \
  --read2-in C143_R2.fastq.gz \
  --read2-out=C143_R2_extracted.fastq.gz \
  --filter-cell-barcode \
  --whitelist=whitelist.txt;
I get the following error message:
ValueError: Read pairs do not match CL200152206L1C001R001_0/1 != CL200152206L1C001R001_0/2
What am I doing wrong?
Thanks in advance for your time and attention,
Yered
Hi @yeredh - Could you please confirm which version of umi_tools you are using? Thanks.
Hi @TomSmithCGAT,
I am using
UMI-tools version: 1.0.1
Also, I noticed that the files were generated on a BGISEQ sequencer, not Illumina, so I guess the headers have a different format.
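One way to check the header format is to print the first read name of each FASTQ file and compare them. The snippet below is only a sketch: `first_read_name` is a hypothetical helper, and the two tiny files are stand-ins built from the read name in the error message so that the example runs on its own.

```shell
# Hypothetical helper: print the first header line of a gzipped FASTQ file.
first_read_name() {
  gzip -dc "$1" | head -n 1
}

# Stand-in files so the sketch is self-contained; with real data, call
# first_read_name on C143_R1.fastq.gz and C143_R2.fastq.gz instead.
tmp=$(mktemp -d)
printf '@CL200152206L1C001R001_0/1\nACGT\n+\nFFFF\n' | gzip > "$tmp/R1.fastq.gz"
printf '@CL200152206L1C001R001_0/2\nTGCA\n+\nFFFF\n' | gzip > "$tmp/R2.fastq.gz"

# Print the first read name from each mate file for a side-by-side look.
first_read_name "$tmp/R1.fastq.gz"
first_read_name "$tmp/R2.fastq.gz"
```

If the two names differ only in a trailing /1 versus /2 attached directly to the read ID, as in the error message, the mismatch comes from the mate suffix rather than from reads being out of order.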
Hi,
I was directed to the solution here: #325
The following does work:
umi_tools extract --bc-pattern=CCCCCCCCCCCCCCCCNNNNNNNNNN \
  --stdin C143_R1.fastq.gz \
  --stdout C143_R1_extracted.fastq.gz \
  --read2-in C143_R2.fastq.gz \
  --read2-out=C143_R2_extracted.fastq.gz \
  --filter-cell-barcode \
  --read-name-suffix-strip \
  --whitelist=whitelist.txt;
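To illustrate why the original run failed, here is a small sketch of the general idea, not umi_tools's actual implementation. The two read names are taken from the error message; `strip_mate_suffix` is a hypothetical helper, and treating a trailing /1 or /2 as the mate suffix is an assumption about these headers.

```shell
# Hypothetical helper: drop a trailing /1 or /2 mate suffix from a read name.
strip_mate_suffix() {
  printf '%s\n' "$1" | sed -E 's#/[12]$##'
}

# Read names from the reported error message.
r1='CL200152206L1C001R001_0/1'
r2='CL200152206L1C001R001_0/2'

# The raw names differ only in the mate suffix, so an exact comparison fails.
if [ "$r1" = "$r2" ]; then
  echo "raw names: match"
else
  echo "raw names: mismatch"
fi

# Once the suffix is removed, both mates carry the same read ID.
if [ "$(strip_mate_suffix "$r1")" = "$(strip_mate_suffix "$r2")" ]; then
  echo "stripped names: match"
else
  echo "stripped names: mismatch"
fi
```

An exact string comparison of the two names cannot succeed while the suffixes are attached, which is consistent with the error going away once the suffix-stripping option was added.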