Beneficial to merge all processed vcfs into one prior to annotating? #2

thomasyu888 · 2020-04-27T18:58:10Z

Distribution of variant counts across the processed VCFs (line counts per file):

wc -l vcf/processed/*
2 GENIE-
2 GENIE-
5 GENIE-
1 GENIE-
2 GENIE-
3 GENIE-
2 GENIE-
...
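For a summary rather than a raw listing, the counts can be tallied into a histogram. This is a minimal sketch, not part of the pipeline. It assumes the processed files end in `.vcf` and that any header lines start with `#`; `wc -l` counts every line, so if the files carry headers the numbers above include them. The function names are made up for illustration.

```python
from collections import Counter
from pathlib import Path


def count_variant_lines(vcf_path):
    """Count non-empty lines that are not header lines (assumed to start with '#')."""
    with open(vcf_path) as handle:
        return sum(1 for line in handle if line.strip() and not line.startswith("#"))


def variant_count_distribution(vcf_dir):
    """Map 'variants per file' to 'number of files' for every *.vcf in vcf_dir."""
    paths = sorted(Path(vcf_dir).glob("*.vcf"))  # '*.vcf' is an assumed naming pattern
    return Counter(count_variant_lines(path) for path in paths)
```

Calling `variant_count_distribution("vcf/processed")` would return a `Counter` keyed by variant count, which makes it easy to see how many files hold only one or two variants.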


thomasyu888 · 2020-04-27T19:20:49Z

Angelica informed me that there isn't really much difference in speed, and that the first annotation run will take longer because the results are being cached in MongoDB.

On the first run through the site I saw roughly 400 VCFs annotated per hour. (Granted, the process probably speeds up when there are duplicated variants.)
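One way to gauge how much the cache might help is to measure how many variants repeat across files. The sketch below is only an estimate under assumptions: it treats a variant as the tuple (CHROM, POS, REF, ALT) read from the standard tab-separated VCF columns, and it assumes header lines start with `#`. Whether the annotator's cache keys on exactly these fields is not confirmed here, and the function names are hypothetical.

```python
def variant_keys(vcf_path):
    """Yield (CHROM, POS, REF, ALT) for each data line of a tab-separated VCF."""
    with open(vcf_path) as handle:
        for line in handle:
            if not line.strip() or line.startswith("#"):
                continue  # skip blank lines and header lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 5:
                continue  # skip malformed lines that lack an ALT column
            yield (fields[0], fields[1], fields[3], fields[4])


def duplicate_fraction(vcf_paths):
    """Return the share of variant records that repeat an already-seen variant."""
    total = 0
    unique = set()
    for path in vcf_paths:
        for key in variant_keys(path):
            total += 1
            unique.add(key)
    if total == 0:
        return 0.0
    return 1 - len(unique) / total
```

A result near 0 would suggest that nearly every record needs a fresh annotation, while a higher value would suggest more records could be served from cached results.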

sheridancbio · 2020-04-27T20:34:24Z

I think it would be a good idea to do some real performance comparisons: see how long a run for a center using hundreds of small VCF files takes to complete, and compare that to an annotation run for the same center using a single merged MAF (perhaps from the output of the first run) as the input.

ao508 · 2020-04-28T17:34:43Z

Based on a previous conversation with @thomasyu888, the original decision to annotate the MAFs individually before generating a "per-center" MAF was based on whether the suggested approach would be scalable. I believe it's been decided to run some performance tests first, using the currently available GENIE data. The performance tests will measure:

- total time to standardize and annotate individual MAFs from a center
- total time to standardize and merge individual MAFs into a "per-center" MAF and then run that through the annotator
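The two measurements could be collected with a small timing harness like the one below. It is a sketch under stated assumptions: `annotate_file` and `merge_files` are hypothetical callables standing in for the real standardize, merge, and annotate steps, which are not shown in this thread. Because the first pass populates the MongoDB cache mentioned above, the second strategy would look faster unless the cache is reset between runs; the sketch does not handle that.

```python
import time


def time_call(func):
    """Run func() and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start


def compare_strategies(maf_paths, annotate_file, merge_files):
    """Time per-file annotation against annotating one merged file.

    annotate_file(path) and merge_files(paths) -> merged_path are
    hypothetical callables supplied by the caller.
    """
    # Strategy 1: annotate every file on its own.
    _, per_file_seconds = time_call(lambda: [annotate_file(p) for p in maf_paths])
    # Strategy 2: merge first, then annotate the single merged file.
    _, merged_seconds = time_call(lambda: annotate_file(merge_files(maf_paths)))
    return {"per_file_seconds": per_file_seconds, "merged_seconds": merged_seconds}
```

Running this once per center, ideally several times with the cache in a known state, would give comparable totals for the two approaches.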
