A USER ERROR has occurred: Bad input: Sample $name is not in BAM header: [...] #1655

Open
VivianRobin opened this issue Sep 18, 2024 · 6 comments
Labels
bug Something isn't working

@VivianRobin

VivianRobin commented Sep 18, 2024

Description of the bug

A USER ERROR has occurred: Bad input: Sample P-*_P-*-REF-100X is not in BAM header: [_P--REF-100X,_-TMN-150X]
I have the same error as in issue https://github.com/nf-core/sarek/issues/7322. I added the following to my .config:

 withName: 'MUTECT2_PAIRED' {
     ext.args = { "<copy ext.args from modules.config to avoid overwriting>  --normal-sample ${meta.normal_id}" }
 }

However, nothing changes.

Command used and terminal output

Error executing process > 'NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_SOMATIC_ALL:BAM_VARIANT_CALLING_SOMATIC_MUTECT2:MUTECT2_PAIRED (P-*-TMN-150X_vs_P-*-REF-100X)'

Caused by:
  Process `NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_SOMATIC_ALL:BAM_VARIANT_CALLING_SOMATIC_MUTECT2:MUTECT2_PAIRED (P-*-TMN-150X_vs_*-REF-100X)` terminated with an error exit status (2)


Command executed:

  gatk --java-options "-Xmx32768M -XX:-UsePerfData" \
      Mutect2 \
      --input P-*.recal.cram --input P-*.recal.cram \
      --output P-*-TMN-150X_vs_P-*-REF-100X.mutect2.vcf.gz \
      --reference Homo_sapiens_assembly38.fasta \
      --panel-of-normals 1000g_pon.hg38.vcf.gz \
      --germline-resource af-only-gnomad.hg38.vcf.gz \
      --intervals chr1_69066-70033.bed \
      --tmp-dir . \
      --f1r2-tar-gz P-NI62112-TMN-150X_vs_P-NI62112-REF-100X.mutect2.f1r2.tar.gz --normal-sample P-*_P-*-REF-100X
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_SOMATIC_ALL:BAM_VARIANT_CALLING_SOMATIC_MUTECT2:MUTECT2_PAIRED":
      gatk4: $(echo $(gatk --version 2>&1) | sed 's/^.*(GATK) v//; s/ .*$//')
  END_VERSIONS

------------------------------------------------
nextflow run nf-core/sarek \
    -c config/nf_core_sarek.config \
    -profile singularity \
    --input  config/samplesheet_wes_mapped.csv \
    --outdir results/nfcore_sarek/WES/ \
    --intervals resources/Homo_sapiens/target_files/twist_human_core_exome_and_integragen_custom_v2.bed \
    --wes \
    --monochrome_logs \
    -ansi-log false \
    -params-file config/nf_core_sarek_params.yaml  \
    -resume \
    -with-report  results/nfcore_sarek/WES/report.html/ \
    -with-timeline results/nfcore_sarek/WES/timeline.html \
    -process.errorStrategy 'retry' \
    -process.maxRetries 5

Relevant files

Using GATK jar /usr/local/share/gatk4-4.5.0.0-0/gatk-package-4.5.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx32768M -XX:-UsePerfData -jar /usr/local/share/gatk4-4.5.0.0-0/gatk-package-4.5.0.0-local.jar Mutect2 --input P--REF-100X.recal.cram --input P-MA2121-TMN-150X.recal.cram --output P--TMN-150X_vs_P--REF-100X.mutect2.vcf.gz --reference Homo_sapiens_assembly38.fasta --panel-of-normals 1000g_pon.hg38.vcf.gz --germline-resource af-only-gnomad.hg38.vcf.gz --intervals chr1_69066-70033.bed --tmp-dir . --f1r2-tar-gz P--TMN-150X_vs_P--REF-100X.mutect2.f1r2.tar.gz --normal-sample P-_P-*-REF-100X
10:17:51.899 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/usr/local/share/gatk4-4.5.0.0-0/gatk-package-4.5.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
10:17:52.259 INFO Mutect2 - ------------------------------------------------------------
10:17:52.267 INFO Mutect2 - The Genome Analysis Toolkit (GATK) v4.5.0.0
10:17:52.267 INFO Mutect2 - For support and documentation go to https://software.broadinstitute.org/gatk/
10:17:52.268 INFO Mutect2 - Executing as v_robin@n11 on Linux v3.10.0-957.27.2.el7.x86_64 amd64
10:17:52.268 INFO Mutect2 - Java runtime: OpenJDK 64-Bit Server VM v17.0.10-internal+0-adhoc..src
10:17:52.268 INFO Mutect2 - Start Date/Time: September 18, 2024 at 10:17:51 AM GMT
10:17:52.268 INFO Mutect2 - ------------------------------------------------------------
10:17:52.269 INFO Mutect2 - ------------------------------------------------------------
10:17:52.270 INFO Mutect2 - HTSJDK Version: 4.1.0
10:17:52.270 INFO Mutect2 - Picard Version: 3.1.1
10:17:52.270 INFO Mutect2 - Built for Spark Version: 3.5.0
10:17:52.271 INFO Mutect2 - HTSJDK Defaults.COMPRESSION_LEVEL : 2
10:17:52.271 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
10:17:52.272 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
10:17:52.272 INFO Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
10:17:52.272 INFO Mutect2 - Deflater: IntelDeflater
10:17:52.273 INFO Mutect2 - Inflater: IntelInflater
10:17:52.273 INFO Mutect2 - GCS max retries/reopens: 20
10:17:52.273 INFO Mutect2 - Requester pays: disabled
10:17:52.274 INFO Mutect2 - Initializing engine
10:17:54.653 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/beegfs/scratch/v_robin/.tmp/nxf.PVfwj5aNd0/1000g_pon.hg38.vcf.gz
10:17:55.230 INFO FeatureManager - Using codec VCFCodec to read file file:///mnt/beegfs/scratch/v_robin/.tmp/nxf.PVfwj5aNd0/af-only-gnomad.hg38.vcf.gz
10:17:55.540 INFO FeatureManager - Using codec BEDCodec to read file file:///mnt/beegfs/scratch/v_robin/.tmp/nxf.PVfwj5aNd0/chr1_69066-70033.bed
10:17:56.168 INFO IntervalArgumentCollection - Processing 43826111 bp from intervals
10:17:56.400 INFO Mutect2 - Done initializing engine
10:17:56.519 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/usr/local/share/gatk4-4.5.0.0-0/gatk-package-4.5.0.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
10:17:56.548 INFO NativeLibraryLoader - Loading libgkl_smithwaterman.so from jar:file:/usr/local/share/gatk4-4.5.0.0-0/gatk-package-4.5.0.0-local.jar!/com/intel/gkl/native/libgkl_smithwaterman.so
10:17:56.553 INFO SmithWatermanAligner - Using AVX accelerated SmithWaterman implementation
10:17:56.577 INFO Mutect2 - Shutting down engine
[September 18, 2024 at 10:17:56 AM GMT] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 0.08 minutes.
Runtime.totalMemory()=1157627904


A USER ERROR has occurred: Bad input: Sample P-_P--REF-100X is not in BAM header: [_P--REF-100X, _P--TMN-150X]

System information

Version nf-core sarek 3.4.3
Container Singularity
Executor slurm
Hardware HPC
version nextflow 24.04.4

VivianRobin added the bug label Sep 18, 2024
@FriederikeHanssen
Contributor

Hey! Looks like the samples in your BAM have an additional prefix, MA2121_, but the custom args you set resolve it to P-NI62112_P-NI62112-REF-100X. How are you setting up the samplesheet? The normal_id is retrieved from the sample column and needs to match the normal sample name in the BAM file for this to work.
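
To see which sample names the BAM/CRAM header actually contains, one option is to list the SM field of its @RG lines. The helper below is only a sketch: `rg_samples` is a hypothetical name, and the usage line assumes samtools is installed and that `normal.recal.cram` stands in for your own file.

```shell
# Print the unique SM (sample) values found on the @RG lines of a SAM header
# read from stdin. rg_samples is a hypothetical helper name.
rg_samples() {
    awk -F'\t' '
        /^@RG/ {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SM:/) print substr($i, 4)   # drop the "SM:" prefix
        }
    ' | sort -u
}

# Assumed usage (the file name is a placeholder):
#   samtools view -H normal.recal.cram | rg_samples
```

The value passed to `--normal-sample` has to be one of the names this prints, character for character.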

@VivianRobin
Author

VivianRobin commented Sep 18, 2024

Hey, all my mutect2_paired jobs are stopped. The name mismatch is a bad copy error.
My samplesheet:

patient,sex,status,sample,cram,crai
P-XX,XX,0,P-XX-REF-100X,results/nfcore_sarek/WES//preprocessing/recalibrated/P-XX-REF-100X/P-XX-REF-100X.recal.cram,results/nfcore_sarek/WES//preprocessing/recalibrated/P-XX-REF-100X/P-XX-REF-100X.recal.cram.crai
P-XX,XX,1,P-XX-TMN-150X,results/nfcore_sarek/WES//preprocessing/recalibrated/P-XX-TMN-150X/P-XX-TMN-150X.recal.cram,results/nfcore_sarek/WES//preprocessing/recalibrated/P-XX-TMN-150X/P-XX-TMN-150X.recal.cram.crai

@FriederikeHanssen
Contributor

OK, so the sample column should exactly match the name of the normal sample in the input file. Based on the error that you posted, the names do not match. Can you update the sample column to that name and try again?
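
A quick way to test a candidate name before rerunning the pipeline could look like the sketch below. `sample_in_header` is a hypothetical helper, and the usage lines assume samtools is installed and use a placeholder file name.

```shell
# Exit 0 if the name given as $1 is an SM value on an @RG line of the
# SAM header read from stdin, and exit 1 otherwise.
# sample_in_header is a hypothetical helper name.
sample_in_header() {
    awk -F'\t' -v want="$1" '
        /^@RG/ {
            for (i = 2; i <= NF; i++)
                if ($i == "SM:" want) found = 1   # exact, case-sensitive match
        }
        END { exit (found ? 0 : 1) }
    '
}

# Assumed usage (the file name is a placeholder):
#   samtools view -H normal.recal.cram | sample_in_header "XX_P-XX-REF-100X" \
#       && echo "name found in header" || echo "name NOT in header"
```

The comparison is exact on purpose: a name that differs only by a prefix is reported as missing, which is the same situation the Mutect2 error describes.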

@VivianRobin
Author

I have tried to start from the csv generated by sarek and to start at the variant calling stage but nothing changes.

@FriederikeHanssen
Contributor

When you look at the error message, does the sample name printed here match what is in the sample column of the samplesheet:

A USER ERROR has occurred: Bad input: Sample xx is not in BAM header: [xx,xx]

?

@VivianRobin
Author

VivianRobin commented Sep 18, 2024

A USER ERROR has occurred: Bad input: Sample P-XX_P-XX-REF-100X is not in BAM header: [XX_P-XX-REF-100X, XX_P-F-XX-TMN-150X]

That name is not in the samplesheet. In the sample column of recalibrated.csv, the samples are XX_P-XX-REF-100X and XX_P-F-XX-TMN-150X.
