WARN VariantContextConverter:924 - Ran into Array Out of Bounds when accessing indices 0,1,2 of genotype . #2024

rezacsedu · 2018-08-01T14:38:44Z

Hi there,

I'm trying to extract genomics data from VCF files. However, I'm experiencing the following warning.

2018-08-01 16:31:25 WARN VariantContextConverter:924 - Ran into Array Out of Bounds when accessing indices 0,1,2 of genotype [NA21126 A* GQ 99 PL 0,677 {CN=1, CNL=-1000,0,-67.74, CNP=-1000,0,-70.53, CNQ=99, GP=0,-70.53}].

I'm seeing thousands of warnings like this. Is this a serious problem, or will it just result in fewer quality features being returned? Please note that I'm using the following software versions:

Apache Spark: v2.3.0
H2O: v3.14.0.1
Sparkling Water: v1.2.5
ADAM: v0.22.0
Scala: 2.11.8

heuermh · 2018-08-01T15:01:10Z

Generally, you can adjust validation for flat file formats using ValidationStringency:

val genotypes = sc.loadGenotypes("sample0.vcf", stringency = ValidationStringency.SILENT)

As to that specific warning, the line number on the 0.22.0 release branch doesn't look like what I would expect; perhaps you are on a different version?

In any case, if you could extract a short bit of VCF that demonstrates the warning, we could investigate further.

rezacsedu · 2018-08-01T15:07:45Z

@heuermh sorry for the wrong versions but here are the exact versions I'm using:

	<properties>
		<spark.version>2.2.1</spark.version>
		<scala.version>2.11.12</scala.version>
		<h2o.version>3.16.0.2</h2o.version>
		<sparklingwater.version>2.2.6</sparklingwater.version>
		<adam.version>0.23.0</adam.version>
	</properties>

heuermh · 2018-08-01T15:14:54Z

Yep, this line looks better https://github.com/bigdatagenomics/adam/blob/maint_spark2_2.11-0.23.0/adam-core/src/main/scala/org/bdgenomics/adam/converters/VariantContextConverter.scala#L924

It appears your data might have the wrong cardinality for the PL Number=G VCF FORMAT field
https://github.com/samtools/hts-specs/blob/master/VCFv4.3.tex#L399

Or perhaps we're doing something wrong. 😄
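
For reference, a field declared Number=G carries one value per possible genotype, so the expected count depends on ploidy and on the number of alleles at the site. The following is a minimal sketch of that count, not ADAM's code; the names `GenotypeCardinality`, `choose`, and `expectedNumberG` are hypothetical. The genotype in the warning above lists a single allele and two PL values (0,677), which would match a haploid call at a biallelic site rather than the three values a diploid call would carry. That reading is an inference from the log line and has not been confirmed against the source VCF.

```scala
// Sketch only: counts the values expected for a VCF FORMAT field declared Number=G.
// All names here are hypothetical and are not part of ADAM or htsjdk.
object GenotypeCardinality {

  // Binomial coefficient C(n, k), built up iteratively so every
  // intermediate result is an exact integer.
  def choose(n: Int, k: Int): BigInt = {
    require(n >= 0 && k >= 0 && k <= n, s"invalid arguments: n=$n, k=$k")
    (1 to k).foldLeft(BigInt(1)) { (acc, i) => acc * (n - k + i) / i }
  }

  // alleles = reference allele plus alternate alleles at the site.
  // The number of unordered genotypes of the given ploidy is
  // C(alleles + ploidy - 1, ploidy).
  def expectedNumberG(alleles: Int, ploidy: Int): BigInt = {
    require(alleles >= 1 && ploidy >= 1, "alleles and ploidy must be positive")
    choose(alleles + ploidy - 1, ploidy)
  }
}
```

With these definitions, `GenotypeCardinality.expectedNumberG(alleles = 2, ploidy = 2)` evaluates to 3 (indices 0, 1, 2), while `GenotypeCardinality.expectedNumberG(alleles = 2, ploidy = 1)` evaluates to 2, so reading index 2 of a haploid PL array would run past its end.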

rezacsedu · 2018-08-01T15:18:03Z

Actually, I'm using the genetic variants data from the 1000 Genomes Project. However, when I use the following old versions, I don't experience such warnings:

    <properties>
        <spark.version>1.2.0</spark.version>
        <h2o.version>3.0.0.8</h2o.version>
        <sparklingwater.version>1.2.5</sparklingwater.version>
        <adam.version>0.16.0</adam.version>
    </properties>

heuermh · 2018-08-01T19:11:58Z

> I'm using the genetic variants data from the 1000 Genomes Project

That doesn't mean they're adhering correctly to the VCF specification. ;)

Which 1000G file(s) are you looking at, specifically?

> when I use the following old versions, I don't experience such warnings

There have been 868 commits since version 0.16.0; I'd recommend trying newer versions rather than older ones.

heuermh · 2020-01-06T20:36:24Z

Closing due to lack of context for the error. Please reopen if you can provide an example 1000G file that causes the issue.

heuermh closed this as completed Jan 6, 2020

heuermh added this to the 0.31.0 milestone Jan 6, 2020
