
Not all shards generated #776

Closed
ErinKinghorn opened this issue Feb 23, 2024 · 5 comments
@ErinKinghorn

Hi

I have used the following script to run DeepVariant (v1.6.0) on WGS samples.

singularity exec -H $(pwd) docker://google/deepvariant:1.6.0 \
  /opt/deepvariant/bin/run_deepvariant \
  --model_type=${6} \
  --ref=./human_g1k_v37_decoy.fasta \
  --reads=./${2}_md.recal.cram \
  --output_vcf=./${2}_hg37.dv.vcf.gz \
  --output_gvcf=./${2}_hg37.dv.g.vcf.gz \
  --make_examples_extra_args="min_mapping_quality=1,keep_legacy_allele_counter_behavior=true,normalize_reads=true" \
  --num_shards=32

Of the 30 samples I have, 4 have not completed. I believe this is because the 32nd shard was not generated in the temporary directory. All four of the samples that did not complete show the same error in the .log regarding the 32nd shard. The error is as follows:

***** Running the command:*****
time /opt/deepvariant/bin/call_variants --outfile "/scratch3/users/kngeri004/b37/deepvar/tmp/tmp6uy3ir10/call_variants_output.tfrecord.gz" --examples "/scratch3/users/kngeri004/b37/deepvar/tmp/tmp6uy3ir10/make_examples.tfrecord@32.gz" --checkpoint "/opt/models/wgs"

/usr/local/lib/python3.8/dist-packages/tensorflow_addons/utils/tfa_eol_msg.py:23: UserWarning: 

TensorFlow Addons (TFA) has ended development and introduction of new features.
TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024.
Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP). 

For more information see: https://github.com/tensorflow/addons/issues/2807 

  warnings.warn(
I0219 07:48:12.876999 139989302617920 call_variants.py:471] Total 1 writing processes started.
W0219 07:48:12.885284 139989302617920 call_variants.py:482] Unable to read any records from /scratch3/users/kngeri004/b37/deepvar/tmp/tmp6uy3ir10/make_examples.tfrecord@32.gz. Output will contain zero records.
I0219 07:48:12.885881 139989302617920 call_variants.py:623] Complete: call_variants.

While the run has not errored out, I do believe there is an issue here, and I would appreciate it if anyone has any insight.

Regards
Erin

@danielecook
Collaborator

@ErinKinghorn somewhat confusingly, shard specifiers are 0-based for the first number (the shard index) and 1-based for the second (the shard count).

So your first shard is 00000-of-00032 and your last shard is 00031-of-00032. Can you confirm that you are indeed observing only 31 output files?

The message you report here looks normal - and the warning should not make a difference.
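For reference, the naming convention above can be checked mechanically: with --num_shards=32, the sharded spec make_examples.tfrecord@32.gz expands to files make_examples.tfrecord-00000-of-00032.gz through make_examples.tfrecord-00031-of-00032.gz. A minimal sketch that lists the expected names and flags any that are missing (TMP_DIR is a placeholder for the run's temporary directory, not a path from this issue):

```shell
# Enumerate the 32 expected shard filenames for --num_shards=32 and
# report any that are absent from the (placeholder) tmp directory.
TMP_DIR="${TMP_DIR:-./tmp_example}"
mkdir -p "$TMP_DIR"
for i in $(seq 0 31); do
  f=$(printf "make_examples.tfrecord-%05d-of-00032.gz" "$i")
  if [ ! -e "$TMP_DIR/$f" ]; then
    echo "missing: $f"
  fi
done
```

Note there is no file ending in 00032-of-00032; seeing the last file named 00031-of-00032 is the expected, complete state.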

@kishwarshafin
Collaborator

@ErinKinghorn ,

First, follow @danielecook's suggestion and check whether the files look good. If you do find that the files look normal, can you please test this docker:

docker pull google/deepvariant:CL602468145
docker pull google/deepvariant:CL602468145-gpu

It seems like this issue is related to #769.

@ErinKinghorn
Author

Thank you for the feedback! @danielecook I can confirm that the files in the tmp directory do look normal, as you described above. @kishwarshafin, I tested the docker that you suggested, and now it seems that DeepVariant did not run at all (the VCFs and gVCFs are empty).
deepvarrun_b37_MND_G33.1kei.log

I have attached a log file for one of the samples, so you can see what happened.

@ErinKinghorn
Author

I have found an error in the original CRAM files for the 4 samples that did not work. Thank you for the help!
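For anyone who lands here with the same symptom: a truncated or corrupt input CRAM can plausibly surface as an empty shard rather than a hard failure. One quick way to screen the inputs, assuming samtools is available (the *_md.recal.cram pattern simply mirrors the command at the top of this issue):

```shell
# Hypothetical pre-flight check on the input CRAMs before running DeepVariant.
# samtools quickcheck verifies the file header and the end-of-file marker.
for cram in ./*_md.recal.cram; do
  if samtools quickcheck -v "$cram"; then
    echo "ok: $cram"
  else
    echo "FAILED: $cram"
  fi
done
```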

@pichuan
Collaborator

pichuan commented Mar 12, 2024

Hi @ErinKinghorn , if I understand your latest comment correctly, you mean that you were able to get them to work now?
If so, I'll close this. (But if I misunderstood, please reopen with more questions!)

@kishwarshafin will plan to do a 1.6.1 release to fix the issue above (and will officially publish a Docker). Thanks for helping us test!

pichuan closed this as completed Mar 12, 2024