error in RSEM: can't find resource file #83

Closed
slsevilla opened this issue Jan 8, 2024 · 10 comments · Fixed by #87

@slsevilla
Contributor

slsevilla commented Jan 8, 2024

Running RENEE from command line:

./renee run --input /data/sevillas2/v2.5.11/.tests/*.gz --output /data/sevillas2/output_2.5.11/ --genome hg38_30 --mode slurm

Ran into the following error:

Cannot open /data/CCBR_Pipeliner/Pipelines/RENEE/resources/hg38/30/rsemref/hg38.grp! It may not exist.
[Fri Jan  5 19:11:57 2024]
Error in rule rsem:
    jobid: 0
    input: /data/sevillas2/output_2.5.11/bams/KO_S4.p2.Aligned.toTranscriptome.out.bam, /data/sevillas2/output_2.5.11/RSeQC/KO_S4.strand.info
    output: /data/sevillas2/output_2.5.11/DEG_ALL/KO_S4.RSEM.genes.results, /data/sevillas2/output_2.5.11/DEG_ALL/KO_S4.RSEM.isoforms.results
    shell:
        
    # Setups temporary directory for
    # intermediate files with built-in
    # mechanism for deletion on exit
    if [ ! -d "/lscratch/$SLURM_JOBID/" ]; then mkdir -p "/lscratch/$SLURM_JOBID/"; fi
    tmp=$(mktemp -d -p "/lscratch/$SLURM_JOBID/")
    trap 'rm -rf "${tmp}"' EXIT

    # Get strandedness to calculate Forward Probability
    fp=$(tail -n1 /data/sevillas2/output_2.5.11/RSeQC/KO_S4.strand.info | awk '{if($NF > 0.75) print "0.0"; else if ($NF<0.25) print "1.0"; else print "0.5";}')

    echo "Forward Probability Passed to RSEM: $fp"
    rsem-calculate-expression --no-bam-output --calc-ci --seed 12345          --bam --paired-end -p 16  /data/sevillas2/output_2.5.11/bams/KO_S4.p2.Aligned.toTranscriptome.out.bam /data/CCBR_Pipeliner/Pipelines/RENEE/resources/hg38/30/rsemref/hg38 /data/sevillas2/output_2.5.11/DEG_ALL/KO_S4.RSEM --time         --temporary-folder ${tmp} --keep-intermediate-files --forward-prob=${fp} --estimate-rspd
    
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Checking the file:

/data/CCBR_Pipeliner/Pipelines/RENEE/resources/hg38/30/rsemref/hg38.grp

This file does not exist. Instead, it's:

/data/CCBR_Pipeliner/Pipelines/RENEE/resources/hg38/30/rsemref/hg38_30.grp

Unsure why this file is labeled differently than expected. Looking at the other files, they are all labeled the same so this is a really odd error.
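
A quick way to confirm which reference prefixes are actually present is to list the `.grp` files in the `rsemref` directory. The following is a minimal sketch, not part of RENEE; `rsem_prefixes` and `check_prefix` are hypothetical helper names, and it assumes every RSEM reference in the directory has a `<prefix>.grp` file, as the error message above implies.

```python
from pathlib import Path


def rsem_prefixes(ref_dir):
    """Return the reference prefixes in ref_dir, inferred from its *.grp files."""
    return sorted(p.stem for p in Path(ref_dir).glob("*.grp"))


def check_prefix(ref_dir, expected):
    """Report whether <ref_dir>/<expected>.grp exists and which prefixes do."""
    found = rsem_prefixes(ref_dir)
    return {"expected": expected, "exists": expected in found, "found": found}


if __name__ == "__main__":
    # Directory taken from the error message above; "hg38" is the prefix
    # that rsem-calculate-expression was given.
    ref_dir = "/data/CCBR_Pipeliner/Pipelines/RENEE/resources/hg38/30/rsemref"
    print(check_prefix(ref_dir, "hg38"))
```

If `exists` comes back false while `found` lists a different prefix, the mismatch is in the path handed to RSEM rather than in the reference files themselves.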

@slsevilla slsevilla added the bug Something isn't working label Jan 8, 2024
@kopardev kopardev added the RENEE RepoName label Jan 8, 2024
@kelly-sovacool
Member

I wonder why it worked for me with v2.5.10 last week. I don't think we've moved any resource files since then.

@kopardev
Member

kopardev commented Jan 8, 2024

@slsevilla are you running on FRCE or Biowulf?
It looks like you are running on Biowulf but ending up reading the FRCE JSON, which is even weirder!
The Biowulf JSON has the correct prefix, "hg38_30".
The FRCE JSON has the incorrect prefix, "hg38", which is the one that cannot be found.
Can you retry by providing the full path to the hg38_30 JSON, not just "hg38_30"?

@slsevilla
Contributor Author

Oh weird!!! I am on Biowulf... I'll test by adding the full json path.

@slsevilla
Contributor Author

slsevilla commented Jan 8, 2024

Running two tests

  1. Run pipeline from command line with genome information
./renee run --input /data/sevillas2/v2.5.11/.tests/*.gz --output /data/sevillas2/renee_2.5.11_genome/ --genome hg38_30 --mode slurm

Submission output as expected

RENEE (v2.5.3)
Thank you for running RENEE on BIOWULF!

Generating config file in '/data/sevillas2/renee_2.5.11_genome/config.json'... Done!
/data/sevillas2/renee_2.5.11_genome/resources/runner slurm -j pl:renee -b /gpfs/gsfs10/users/CCBR_Pipeliner,/data/CCBR_Pipeliner,/gpfs/gsfs12/users/sevillas2/v2.5.11/.tests,/data/sevillas2/renee_2.5.11_genome,/lscratch -o /data/sevillas2/renee_2.5.11_genome -c /data/sevillas2/renee_2.5.11_genome/.singularity -t /lscratch/$SLURM_JOBID/ -n biowulf
Successfully submitted master job: TRIGGEROPTIONS:--rerun-triggers mtime
16693264
  2. Run pipeline from command line with json information
./renee run --input /data/sevillas2/v2.5.11/.tests/*.gz --output /data/sevillas2/renee_2.5.11_json/ --genome /data/sevillas2/v2.5.11/config/genomes/biowulf/hg38_30.json --mode slurm

Submission output as expected

Thank you for running RENEE on BIOWULF!

Generating config file in '/data/sevillas2/renee_2.5.11_json/config.json'... Done!
/data/sevillas2/renee_2.5.11_json/resources/runner slurm -j pl:renee -b /data/CCBR_Pipeliner,/gpfs/gsfs10/users/CCBR_Pipeliner,/gpfs/gsfs12/users/sevillas2/v2.5.11/.tests,/data/sevillas2/renee_2.5.11_json,/lscratch -o /data/sevillas2/renee_2.5.11_json -c /data/sevillas2/renee_2.5.11_json/.singularity -t /lscratch/$SLURM_JOBID/ -n biowulf
Successfully submitted master job: TRIGGEROPTIONS:--rerun-triggers mtime
16693269

kelly-sovacool added a commit that referenced this issue Jan 9, 2024
@slsevilla
Contributor Author

Both submissions failed with the same error:

Activating singularity image /gpfs/gsfs12/users/sevillas2/renee_2.5.11_genome/.snakemake/singularity/c06d00499469a82e355b3fb78cc49e84.simg
Cannot open /data/CCBR_Pipeliner/Pipelines/RENEE/resources/hg38/30/rsemref/hg38.grp! It may not exist.

@kelly-sovacool
Member

I think I figured it out!

The renee script is using a resources dir outside the git repo (see #54) and in that dir is a json file with the incorrect rsemref path: /data/CCBR_Pipeliner/Pipelines/RENEE/resources/hg38/30/config/genomes/biowulf/hg38_30.json.

We should stop using this shared resources dir for anything except large files that cannot be version controlled (e.g. the kraken db).
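
To see exactly where the shared copy has drifted from the version-controlled one, the two JSON files can be compared key by key. This is a rough sketch under stated assumptions: `flatten` and `diff_configs` are hypothetical helpers rather than RENEE functions, and nothing is assumed about the key names inside the genome JSON beyond it being ordinary nested JSON.

```python
import json

MISSING = "<missing>"  # placeholder for a key present in only one file


def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into a {"a.b.0": value} mapping."""
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = enumerate(obj)
    else:
        return {prefix: obj}
    flat = {}
    for key, value in items:
        child = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, child))
    return flat


def diff_configs(path_a, path_b):
    """Return {key: (value_in_a, value_in_b)} for every entry that differs."""
    with open(path_a) as fa, open(path_b) as fb:
        a, b = flatten(json.load(fa)), flatten(json.load(fb))
    return {
        key: (a.get(key, MISSING), b.get(key, MISSING))
        for key in sorted(a.keys() | b.keys())
        if a.get(key, MISSING) != b.get(key, MISSING)
    }
```

Calling `diff_configs` with the repo copy (for example `/data/sevillas2/v2.5.11/config/genomes/biowulf/hg38_30.json`) and the shared copy named above should surface any entry, such as the rsemref path, that differs between the two.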

@slsevilla
Contributor Author

I attempted to run from @kelly-sovacool's dev folder but ran into this error:

[sevillas2@biowulf renee-dev-sovacool]$ ./renee run --input /data/sevillas2/v2.5.11/.tests/*.gz --output /data/sevillas2/renee_2.5.11_kelly/ --genome hg38_30 --mode slurm
RENEE (v2.5.10)
Thank you for running RENEE on BIOWULF!

Generating config file in '/data/sevillas2/renee_2.5.11_kelly/config.json'... Done!
/data/sevillas2/renee_2.5.11_kelly/resources/runner slurm -j pl:renee -b /gpfs/gsfs10/users/CCBR_Pipeliner,/data/CCBR_Pipeliner,/gpfs/gsfs12/users/sevillas2/v2.5.11/.tests,/data/sevillas2/renee_2.5.11_kelly,/lscratch -o /data/sevillas2/renee_2.5.11_kelly -c /data/sevillas2/renee_2.5.11_kelly/.singularity -t /lscratch/$SLURM_JOBID/ -n biowulf
Traceback (most recent call last):
  File "/gpfs/gsfs10/users/CCBR_Pipeliner/Pipelines/RENEE/renee-dev-sovacool/./renee", line 2475, in <module>
    main()
  File "/gpfs/gsfs10/users/CCBR_Pipeliner/Pipelines/RENEE/renee-dev-sovacool/./renee", line 2471, in main
    args.func(args)
  File "/gpfs/gsfs10/users/CCBR_Pipeliner/Pipelines/RENEE/renee-dev-sovacool/./renee", line 1157, in run
    open(os.path.join(sub_args.output, "logfiles", "mjobid.log")).read().strip()
FileNotFoundError: [Errno 2] No such file or directory: '/data/sevillas2/renee_2.5.11_kelly/logfiles/mjobid.log'

@kelly-sovacool
Member

Weird, does that file really not exist? Wonder if it's a latency error?

@slsevilla
Contributor Author

Good guess - ran again and this time it submitted the job.

@kelly-sovacool
Member

Vishal was also dealing with latency errors in another pipeline. Seems Biowulf is having issues.
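
If filesystem latency is the cause, one possible mitigation is to retry the read of `mjobid.log` a few times before giving up, instead of opening it once as in the traceback above. The sketch below is only an illustration: `read_when_ready` is a hypothetical helper, not existing RENEE code, and the attempt count and delay are arbitrary assumptions.

```python
import time


def read_when_ready(path, attempts=5, delay=1.0):
    """Read a text file, retrying if it is not yet visible on the filesystem."""
    for attempt in range(attempts):
        try:
            with open(path) as handle:
                return handle.read().strip()
        except FileNotFoundError:
            if attempt == attempts - 1:
                raise  # still missing after the last attempt
            # Wait longer after each failure (exponential backoff).
            time.sleep(delay * 2 ** attempt)
```

With something like this, the single `open(...).read().strip()` call could become `read_when_ready(os.path.join(sub_args.output, "logfiles", "mjobid.log"))`, so a file that shows up a few seconds late no longer crashes the submission step.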

@kopardev kopardev modified the milestones: Jan 2024, Jan 2025 Jan 24, 2024
@kelly-sovacool kelly-sovacool modified the milestones: Jan 2025, Jan 2024 Jan 24, 2024
@kopardev kopardev modified the milestones: Jan 2024, 2024-01 Jan 26, 2024