postprocess_variants step is erroring out in quickstart example #901

Open
sushruta opened this issue Oct 31, 2024 · 3 comments

Comments

@sushruta

Have you checked the FAQ? https://github.com/google/deepvariant/blob/r1.6.1/docs/FAQ.md:
Yes

Describe the issue:
run_deepvariant is erroring out in the postprocess_variants step.

Setup

  • Operating system: Running inside docker image - google/deepvariant:1.6.0-gpu
  • DeepVariant version: 1.6.0
  • Installation method (Docker, built from source, etc.): Docker image - google/deepvariant:1.6.0-gpu
  • Type of data: (sequencing instrument, reference genome, anything special that is unlike the case studies?)

Steps to reproduce:

  • Command: Running the quickstart cmd --
/opt/deepvariant/bin/run_deepvariant --model_type=WGS --ref=/opt/deepvariant/quickstart-testdata/ucsc.hg19.chr20.unittest.fasta --reads=/opt/deepvariant/quickstart-testdata/NA12878_S1.chr20.10_10p1mb.bam --regions "chr20:10,000,000-10,010,000" --output_vcf=/opt/deepvariant/quickstart-output/output.vcf.gz --output_gvcf=/opt/deepvariant/quickstart-output/output.g.vcf.gz --intermediate_results_dir /opt/deepvariant/quickstart-output/intermediate_results_dir --num_shards=1 --verbosity=2
  • Error trace: (if applicable) In the postprocess_variants step
***** Running the command:*****
time /opt/deepvariant/bin/postprocess_variants --ref "/opt/deepvariant/quickstart-testdata/ucsc.hg19.chr20.unittest.fasta" --infile "/opt/deepvariant/quickstart-output/intermediate_results_dir/call_variants_output.tfrecord.gz" --outfile "/opt/deepvariant/quickstart-output/output.vcf.gz" --cpus "1" --gvcf_outfile "/opt/deepvariant/quickstart-output/output.g.vcf.gz" --nonvariant_site_tfrecord_path "/opt/deepvariant/quickstart-output/intermediate_results_dir/gvcf.tfrecord@1.gz"

2024-10-31 20:36:34.101345: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libcublas.so.12: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
2024-10-31 20:36:34.101375: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2024-10-31 20:36:35.010025: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2027] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 9.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
I1031 20:36:35.011695 132485076334400 postprocess_variants.py:1211] Using sample name from call_variants output. Sample name: NA12878
I1031 20:36:35.013445 132485076334400 postprocess_variants.py:1313] CVO sorting took 1.1885166168212891e-05 minutes
I1031 20:36:35.013573 132485076334400 postprocess_variants.py:1316] Transforming call_variants_output to variants.
I1031 20:36:35.014770 132485076334400 postprocess_variants.py:1211] Using sample name from call_variants output. Sample name: NA12878
Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_in6znu90/runfiles/com_google_deepvariant/deepvariant/postprocess_variants.py", line 1419, in <module>
    app.run(main)
  File "/tmp/Bazel.runfiles_in6znu90/runfiles/absl_py/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/tmp/Bazel.runfiles_in6znu90/runfiles/absl_py/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/tmp/Bazel.runfiles_in6znu90/runfiles/com_google_deepvariant/deepvariant/postprocess_variants.py", line 1385, in main
    tmp_variant_file = dump_variants_to_temp_file(variant_generator)
  File "/tmp/Bazel.runfiles_in6znu90/runfiles/com_google_deepvariant/deepvariant/postprocess_variants.py", line 1067, in dump_variants_to_temp_file
    tfrecord.write_tfrecords(variant_protos, temp.name)
  File "/tmp/Bazel.runfiles_in6znu90/runfiles/com_google_deepvariant/third_party/nucleus/io/tfrecord.py", line 190, in write_tfrecords
    for proto in protos:
  File "/tmp/Bazel.runfiles_in6znu90/runfiles/com_google_deepvariant/deepvariant/haplotypes.py", line 91, in maybe_resolve_conflicting_variants
    for overlapping_candidates in _group_overlapping_variants(sorted_variants):
  File "/tmp/Bazel.runfiles_in6znu90/runfiles/com_google_deepvariant/deepvariant/haplotypes.py", line 111, in _group_overlapping_variants
    for variant in sorted_variants:
  File "/tmp/Bazel.runfiles_in6znu90/runfiles/com_google_deepvariant/deepvariant/postprocess_variants.py", line 1062, in _transform_call_variants_output_to_variants
    yield _transform_call_variant_group_to_output_variant(**cvo_group_kwargs)
  File "/tmp/Bazel.runfiles_in6znu90/runfiles/com_google_deepvariant/deepvariant/postprocess_variants.py", line 1036, in _transform_call_variant_group_to_output_variant
    return add_call_to_variant(
  File "/tmp/Bazel.runfiles_in6znu90/runfiles/com_google_deepvariant/deepvariant/postprocess_variants.py", line 434, in add_call_to_variant
    gq, variant.quality = compute_quals(predictions, index)
  File "/tmp/Bazel.runfiles_in6znu90/runfiles/com_google_deepvariant/deepvariant/postprocess_variants.py", line 469, in compute_quals
    genomics_math.ptrue_to_bounded_phred(predictions[prediction_index])
  File "/tmp/Bazel.runfiles_in6znu90/runfiles/com_google_deepvariant/third_party/nucleus/util/genomics_math.py", line 143, in ptrue_to_bounded_phred
    raise ValueError('ptrue must be between zero and one: {}'.format(ptrue))
ValueError: ptrue must be between zero and one: nan
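
The final frame of the trace can be illustrated in isolation. The block below is a minimal sketch, not the actual nucleus code: the function name `ptrue_to_bounded_phred_sketch`, the `max_prob` default, and the Phred formula are assumptions made for illustration, and only the error message is taken from the trace above. It shows why a `nan` prediction trips the range check: NaN compares false against every number, so `0 <= nan <= 1` does not hold.

```python
import math


def ptrue_to_bounded_phred_sketch(ptrue, max_prob=1.0 - 1e-15):
    """Hypothetical stand-in for the range check seen in the traceback.

    Converts the probability of a call being correct into a Phred-scaled
    quality, capping ptrue at max_prob so the result stays finite.
    """
    # Every comparison involving NaN is False, so a NaN probability
    # fails this check exactly like an out-of-range value would.
    if not 0.0 <= ptrue <= 1.0:
        raise ValueError('ptrue must be between zero and one: {}'.format(ptrue))
    perror = 1.0 - min(ptrue, max_prob)
    return -10.0 * math.log10(perror)


def contains_nan(predictions):
    """Return True if any genotype probability is NaN (hypothetical helper)."""
    return any(math.isnan(p) for p in predictions)


if __name__ == '__main__':
    print(ptrue_to_bounded_phred_sketch(0.99))
    print(contains_nan([float('nan'), 0.1, 0.2]))
    try:
        ptrue_to_bounded_phred_sketch(float('nan'))
    except ValueError as err:
        print(err)
```

If this reading is right, postprocess_variants is only where the problem surfaces; the `nan` values would already be present in the call_variants output it consumes.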

Does the quick start test work on your system?
Please test with https://github.com/google/deepvariant/blob/r1.6/docs/deepvariant-quick-start.md.
Is there any way to reproduce the issue by using the quick start?

Any additional context:

I'm running this on an H100 GPU with NVIDIA driver 535.183.06 and CUDA 12.2.

@kishwarshafin
Collaborator

@sushruta can you please post the full command you are running?

@sushruta
Author

sushruta commented Nov 4, 2024

Here's the full command --

/opt/deepvariant/bin/run_deepvariant --model_type=WGS \
  --ref=/opt/deepvariant/quickstart-testdata/ucsc.hg19.chr20.unittest.fasta \
  --reads=/opt/deepvariant/quickstart-testdata/NA12878_S1.chr20.10_10p1mb.bam \
  --regions "chr20:10,000,000-10,010,000" \
  --output_vcf=/opt/deepvariant/quickstart-output/output.vcf.gz \
  --output_gvcf=/opt/deepvariant/quickstart-output/output.g.vcf.gz \
  --intermediate_results_dir /opt/deepvariant/quickstart-output/intermediate_results_dir \
  --num_shards=1 --verbosity=2

@pichuan
Collaborator

pichuan commented Nov 7, 2024

Hi @sushruta ,
I wonder if this could be related to #849.

If so, the best solution might be to wait for our next release, which should hopefully be out not too long from now.
