Fix test profiles #6

Closed
nschcolnicov opened this issue Mar 18, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@nschcolnicov
Contributor

Description of the bug

Test profiles currently contain local paths, e.g. test_full.config
image

The test profiles that need correcting are:
test_full.config
test_panelprep.config
test_sim.config
test.config

Command used and terminal output

No response

Relevant files

No response

System information

dev

nschcolnicov added the bug label Mar 18, 2024
@atrigila
Collaborator

Example of the issue: the CSVs point to local files and cannot be used for testing, e.g.
/groups/dog/llenezet/test-datasets/data/panel/21/panel_2020-08-05_chr21.phased.vcf.gz
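
To make this kind of problem easy to spot, here is a minimal sketch of a check that flags CSV cells holding an absolute local path. The function name `find_local_paths`, the column names, and the `example.org` URL are assumptions made for illustration; only the `/groups/...` path comes from this thread. It treats any cell starting with `/` as local, which is a simplification.

```python
import csv
import io


def find_local_paths(csv_text):
    """Return (line_number, column, value) for each cell that holds an absolute local path."""
    hits = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # The header is line 1, so the first data row is line 2.
    for line_number, row in enumerate(reader, start=2):
        for column, value in row.items():
            # Skip missing cells (None) and overflow cells (lists under the None key).
            if isinstance(value, str) and value.strip().startswith("/"):
                hits.append((line_number, column, value.strip()))
    return hits


# Hypothetical samplesheet: the column names and the second row are made up.
sample = (
    "panel,chr,vcf\n"
    "1000GP,chr21,/groups/dog/llenezet/test-datasets/data/panel/21/panel_2020-08-05_chr21.phased.vcf.gz\n"
    "1000GP,chr22,https://example.org/panel_chr22.vcf.gz\n"
)

hits = find_local_paths(sample)
for line_number, column, value in hits:
    print(f"line {line_number}, column '{column}': {value}")
```

Remote locations (https, s3, and similar) pass the check because they start with a scheme rather than a slash.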

@LouisLeNezet
Collaborator

Sorry, I haven't implemented a full-size test yet, as I first needed a reliable test dataset.
We should look at how other pipelines handle this to find out where large files are stored.

LouisLeNezet self-assigned this Mar 18, 2024
@LouisLeNezet
Collaborator

Hi,
Normally, running nextflow run main.nf -profile test,singularity --outdir results should now work without any problem.

@atrigila
Collaborator

Hi @LouisLeNezet, here are some ideas for full-sized datasets. I implemented the 1000G S3 data in the quilt pipeline.

@LouisLeNezet
Collaborator

The fasta is ready, and so is the reference panel with PR #18.
For the sample, NA12878 is easily accessible, but the problem is that this individual and its parents are also present in the reference panel. A full test would mean duplicating the huge files in order to remove them, so as not to overestimate the performance of the imputation.
The best option would be an unrelated high-coverage bam file from outside the 1000 Genomes Project.
The GATK resources seem interesting, but only the NA12878 individual is available...
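
As a rough sketch of the exclusion step discussed above, the snippet below builds the list of panel samples to keep once the target individual and its recorded parents are dropped. The helper names and the inputs are hypothetical: the pedigree entry uses NA12891 and NA12892, the IDs commonly cited as the parents of NA12878, and should be checked against the panel's own pedigree file. Only parents are handled here, not more distant relatives.

```python
def samples_to_exclude(target, pedigree):
    """Return the target sample plus any parents recorded for it in the pedigree."""
    father, mother = pedigree.get(target, (None, None))
    return {sample for sample in (target, father, mother) if sample}


def filter_panel(panel_samples, target, pedigree):
    """Keep only the panel samples that are neither the target nor its parents."""
    excluded = samples_to_exclude(target, pedigree)
    return [sample for sample in panel_samples if sample not in excluded]


# Hypothetical inputs for illustration; a real panel has far more samples.
pedigree = {"NA12878": ("NA12891", "NA12892")}
panel = ["NA12878", "NA12891", "NA12892", "HG00096", "HG00097"]

kept = filter_panel(panel, "NA12878", pedigree)
print("\n".join(kept))
```

The printed list could be saved to a file and passed to a subsetting tool, for example the samples-file option of bcftools view, but the large files still have to be rewritten once.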


3 participants