Cannot use single-end reads. Error: Missing required value: fastq_2 #394

Open
melissamlwong18 opened this issue Nov 13, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@melissamlwong18

melissamlwong18 commented Nov 13, 2024

Description of feature

Hi, I couldn't run the pipeline on some smart-seq datasets that contain only single-end fastq files. I tried two different samplesheets and both gave an error saying that fastq_2 is not specified. Is there a way to fix this or work around it (e.g. by creating a fake fastq_2 file)?

I also want to run a Drop-seq dataset which contains 4 read files. The usage of this pipeline is a bit limited if it only accepts paired-end fastq files.

Samplesheet 1:
sample,fastq_1
SRR5907280,SRR5907280trim_S1_L001_R1_001.fastq.gz

Samplesheet 2:
sample,fastq_1,fastq_2
SRR5907280,SRR5907280trim_S1_L001_R1_001.fastq.gz,

Error:
ERROR ~ ERROR: Validation of 'input' file failed!
-- Check '.nextflow.log' file for details
The following errors have been detected:

  • -- Entry 1: Missing required value: fastq_2
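Both samplesheets fail the same way, which can be reproduced outside the pipeline. The sketch below is only an illustration, assuming (from the error message alone) that `sample`, `fastq_1` and `fastq_2` are all treated as required; `REQUIRED_COLUMNS` and `find_missing_values` are hypothetical names, not part of the pipeline.

```python
import csv
import io

# Assumption inferred from the error message: all three columns are required.
REQUIRED_COLUMNS = ("sample", "fastq_1", "fastq_2")


def find_missing_values(samplesheet_text, required=REQUIRED_COLUMNS):
    """Return (entry_number, column) pairs for every empty or absent required value."""
    problems = []
    reader = csv.DictReader(io.StringIO(samplesheet_text))
    for entry_number, row in enumerate(reader, start=1):
        for column in required:
            value = row.get(column)  # None when the column is absent from the header
            if value is None or not value.strip():
                problems.append((entry_number, column))
    return problems


# The two samplesheets quoted in this issue.
samplesheet_1 = "sample,fastq_1\nSRR5907280,SRR5907280trim_S1_L001_R1_001.fastq.gz\n"
samplesheet_2 = "sample,fastq_1,fastq_2\nSRR5907280,SRR5907280trim_S1_L001_R1_001.fastq.gz,\n"

for name, text in (("Samplesheet 1", samplesheet_1), ("Samplesheet 2", samplesheet_2)):
    for entry_number, column in find_missing_values(text):
        print(f"{name}, entry {entry_number}: missing required value: {column}")
```

Under that assumption, leaving the column out and leaving the value empty are equivalent, so neither samplesheet can pass validation.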

Cheers,
Melissa

@melissamlwong18 melissamlwong18 added the enhancement New feature or request label Nov 13, 2024
@grst
Member

grst commented Nov 13, 2024

Hi @melissamlwong18 ,

Please use the rnaseq pipeline for smartseq. It doesn't need any demultiplexing and can be processed like bulk RNA-seq data.
Regarding drop-seq, what are the four read files? I'd need to double-check for drop-seq, but often the index reads are not required.

@melissamlwong18
Author

melissamlwong18 commented Nov 13, 2024

What I want is a count table covering all the smart-seq samples (one fastq file per sample), generated by kallisto, which can assign a fake cell barcode to each sample. I am able to run kallisto or kb locally on single-end fastq files. I want to combine this data with other scRNA-seq datasets (10X, Drop-seq, etc.) in the downstream analysis, so it would be nice to process all the datasets with the same nextflow pipeline.

inDrops reads:
R1 - 61 bases - read
R2 - 8 bases - cellular barcode
R3 - 14 bases - cellular barcode and umi
R4 - 8 bases

Results from kb --list
INDROPSV3 inDrops version 3 yes 0,0,8 1,0,8 1,8,14 2,None,None

I have to pass the fastq files to kb count in the order R2, R3, R1.
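The file order follows from the INDROPSV3 row above, if each triplet is read as file index, start, stop and the three groups as barcode, UMI and cDNA. That reading is an assumption on my part rather than something stated in this thread. A small sketch that decodes the row (`parse_positions` is a hypothetical helper, not part of kb):

```python
def parse_positions(field):
    """Parse one group such as '0,0,8 1,0,8' into (file_index, start, stop) tuples."""
    positions = []
    for triplet in field.split():
        file_index, start, stop = triplet.split(",")
        positions.append((
            int(file_index),
            None if start == "None" else int(start),
            None if stop == "None" else int(stop),
        ))
    return positions


# Values copied from the INDROPSV3 row of `kb --list` quoted above.
barcode = parse_positions("0,0,8 1,0,8")
umi = parse_positions("1,8,14")
cdna = parse_positions("2,None,None")

# Assumption: file index i refers to the i-th fastq file given to kb count.
fastq_order = ["R2", "R3", "R1"]
for label, positions in (("barcode", barcode), ("UMI", umi), ("cDNA", cdna)):
    for file_index, start, stop in positions:
        print(label, fastq_order[file_index], start, stop)
```

Read this way, the barcode comes from all 8 bases of R2 plus the first 8 bases of R3, the UMI from bases 8 to 14 of R3, and the cDNA from the whole of R1, which matches the read lengths listed above.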

I found some discussions about running STARsolo on drop-seq data, which involves concatenating the reads into two read files. I haven't tried it yet. See alexdobin/STAR#825
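For reference, the concatenation idea could be sketched as below. This is untested against STARsolo and is not taken from the linked discussion: it assumes R2 and R3 are gzipped, list the same reads in the same order, and should simply be joined base by base. `read_fastq` and `merge_barcode_reads` are hypothetical helper names.

```python
import gzip
from itertools import islice


def read_fastq(path):
    """Yield (header, sequence, quality) tuples from a gzipped fastq file."""
    with gzip.open(path, "rt") as handle:
        while True:
            record = list(islice(handle, 4))
            if len(record) < 4:  # end of file (or a truncated final record)
                break
            header, sequence, _, quality = (line.rstrip("\n") for line in record)
            yield header, sequence, quality


def merge_barcode_reads(r2_path, r3_path, out_path):
    """Write one merged read per pair: R2 bases followed by R3 bases."""
    written = 0
    with gzip.open(out_path, "wt") as out:
        # Assumption: both files hold the same reads in the same order.
        # zip() stops at the shorter file without warning.
        pairs = zip(read_fastq(r2_path), read_fastq(r3_path))
        for (header, seq2, qual2), (_, seq3, qual3) in pairs:
            out.write(f"{header}\n{seq2}{seq3}\n+\n{qual2}{qual3}\n")
            written += 1
    return written
```

With the read lengths listed above, each merged read would be 22 bases long (8 from R2 plus 14 from R3), and it would be used together with R1 as the second file.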

@grst
Member

grst commented Nov 13, 2024

> I found some discussions about running STARsolo on drop-seq data, which involves concatenating the reads into two read files. I haven't tried it yet. See alexdobin/STAR#825

This would very likely work.

Regarding smartseq2, I can see why you would want to do this, but you could still achieve something very similar with the rnaseq pipeline and the kallisto alignment route. I'm also not sure whether any of the other alignment tools would support something like 'fake barcodes', and we strive for universal solutions here.

Regarding indrops v3, I can also see that it would be more convenient to specify fastq_1,_2,_3,_4 directly. I'd consider adding this, but for me it's currently not a priority. Happy to accept a PR.

@melissamlwong18
Author

I'm using the nac index with kallisto or kb. It has been demonstrated that including introns increases sensitivity, and some of the datasets I want to combine are from single-nucleus experiments. I don't think the rnaseq pipeline offers a nac workflow.

@grst
Member

grst commented Nov 14, 2024

ok, fair enough


2 participants