Bench vs. Collapse #217
Replies: 6 comments
-
Hello,

You've made an interesting observation on something I've been considering for some time now. First, let's establish that while bench and collapse use the same comparison engine, they differ in their average use case: bench matches across VCFs holding a single sample's variants, while collapse matches within a VCF holding multiple samples' variants.

In the bench use case, bench needs to be prepared for larger differences between the two VCFs (e.g. resolution/accuracy of variant calls), and because a single sample's variants are typically somewhat sparse, it can favor sensitivity in finding matches over specificity, and thus lower thresholds. In the collapse use case, variants are assumed to be more homogeneous (e.g. the same SV caller run across multiple samples), and because SVs typically occur in hot-spots (TRs), it prefers specificity in matching, and thus higher thresholds.

Having said that, I do believe it's time to change bench's default parameters to a higher seqsim/sizesim. The 0.70 it currently has is a vestige from years ago when Truvari was first made and researchers were treating SVs as if they were all CNVs called by aCGH. SV calling now produces such better-resolved representations (i.e. accurate long reads, better short-read assembly/consensus) that we can raise the default to something like 0.85 or 0.90 for bench. However, collapse should stay at 0.95.

Eventually I will make this change, but because it'd be a breaking change for pipelines that use Truvari's defaults, it's not easy to commit to. The moment we raise bench's defaults, there will be an influx of tickets asking "Why does Truvari v5 report lower precision/recall?"

Have a great day,
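To make the thresholds concrete, here is a minimal sketch (not Truvari's actual implementation; Truvari derives sequence similarity from edit distance, while this uses Python's difflib as a stand-in, and the helper names are hypothetical) showing how a single cutoff applied to size and sequence similarity separates a 0.70-style match from a 0.95-style match:

```python
from difflib import SequenceMatcher

def size_similarity(len_a: int, len_b: int) -> float:
    """Ratio of the shorter SV length to the longer one (1.0 = identical sizes)."""
    return min(len_a, len_b) / max(len_a, len_b)

def seq_similarity(seq_a: str, seq_b: str) -> float:
    """Illustrative sequence similarity via difflib; Truvari itself uses an
    edit-distance-based measure, so treat this only as a stand-in."""
    return SequenceMatcher(None, seq_a, seq_b).ratio()

def would_match(seq_a: str, seq_b: str, threshold: float) -> bool:
    """Apply one threshold to both size and sequence similarity."""
    return (size_similarity(len(seq_a), len(seq_b)) >= threshold
            and seq_similarity(seq_a, seq_b) >= threshold)

# Two insertion alleles sharing a 40bp core but differing in a 10bp segment;
# same length (size similarity 1.0), sequence similarity 0.8 under this metric.
a = "ACGT" * 10 + "TTTTTTTTTT"
b = "ACGT" * 10 + "GGGGGGGGGG"
print(would_match(a, b, 0.70))  # True  - passes the loose historical default
print(would_match(a, b, 0.95))  # False - rejected by the stricter default
```

The same pair of calls is a match or a non-match purely depending on which default is in force, which is why changing bench's defaults would shift reported precision/recall.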
-
Thanks, Adam, for your patient answer.

I previously ran some tests, creating a gradient of "SV similarity" thresholds to merge redundant SVs across dozens of samples (the SVs were called from high-quality assemblies). I found that merging efficiency decreased rapidly as the thresholds were lowered, and a 95% similarity threshold seemed a good choice for eliminating redundant SVs.

However, the SV merging strategies employed in published papers seem to be more aggressive. As a result, the vast majority of SVs are merged, even when they only partially overlap by half or less. If you use a 95% threshold, the SV count will appear "inflated" compared to published numbers, and people may question it. So I am seeking your advice and comments on this issue, and on why you believe the 95% threshold is generally a good choice.

Best regards,
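For contrast with similarity-based merging, the "overlap by half or less" behavior described above can be sketched with a reciprocal-overlap criterion of the kind looser merging strategies often use (a generic illustration, not any specific tool's implementation):

```python
def reciprocal_overlap(start_a: int, end_a: int, start_b: int, end_b: int) -> float:
    """Smallest fraction of either interval covered by the other.
    Loose merging rules often require only ~50% reciprocal overlap."""
    inter = max(0, min(end_a, end_b) - max(start_a, start_b))
    return min(inter / (end_a - start_a), inter / (end_b - start_b))

# Two 1000bp deletions shifted by 500bp: each covers half of the other
ro = reciprocal_overlap(10_000, 11_000, 10_500, 11_500)
print(ro)  # 0.5 - merged under a 50% reciprocal-overlap rule
```

Under a 95% similarity/size criterion these two calls would instead be kept as distinct events, which is exactly why the resulting SV counts diverge so sharply between the two strategies.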
-
In the Truvari paper, the idea of "over-merging" was a central theme. In Figure 4 in particular, we were able to demonstrate that Truvari has the least over-merging, and if I were to redo that experiment with Truvari v4, I believe the amount of under-merging would be lower as well. I also believe considering this paper and this paper while thinking about over-merging makes a pretty compelling case for understanding how/why over-merging occurs.

However, I don't yet have a paper that definitively answers how many SVs a project should expect. So we're in the same boat as far as convincing people that we're not over-calling. All I can say is that I'm involved in a few large-scale projects that may help change that, but it'll take time. Hopefully your work will also help demonstrate how diverse/prevalent SVs may be.
-
Thanks, Adam, your comments give me a lot of confidence. I have to admit that, given my experimental results and the guidance in your Truvari paper, I did use 95% as the merging threshold in my previous work, but was questioned about the number of merged SVs. Technically, I do believe SVs across individuals are very diverse and may have multiple alleles at a single locus. For biological discovery, though, deciding whether SVs should be merged may depend on whether they have a similar effect; both under- and over-merging will weaken statistical power. Anyway, thanks for your comments and the case you provided.
-
Happy to help. On the statement about the number of SVs and a "biologically informed level of merging", I have a thought. While I believe it is a reasonable assumption that most SVs one would merge probably have the same impact, it is also an unproven assumption.

As a hyperbolic example, imagine a GWAS study tried to merge SNPs based on whether they occur in the same exon. That would have a negative effect on statistical power, and it is almost the level of merging most SV studies currently employ. Similarly, there's nothing preventing two insertions of equal length at the same position from having as little as a 1bp difference between them. That insertion-internal SNP likely isn't functional, but we can't be sure of that until we test it.

Truvari's hyper-specific merging attempts to balance over- and under-merging by mainly resolving SV redundancies caused by technical artifacts like alignment ambiguities. Perhaps one day we'll find that something like 90% similarity is the best balance, but until the field gives up on very loose over-merging, we'll have a hard time knowing for sure.
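The 1bp point can be made concrete: two insertions of equal length differing at a single internal base score nearly 1.0 under any reasonable sequence-similarity measure, so even a strict 0.95 threshold merges them. A minimal sketch, using difflib's ratio as a stand-in for Truvari's edit-distance-based similarity (the allele sequences are invented for illustration):

```python
from difflib import SequenceMatcher

# Two 100bp insertions at the same position, differing by one internal base
ins_a = "A" * 50 + "C" + "G" * 49
ins_b = "A" * 50 + "T" + "G" * 49

# difflib ratio as a stand-in for Truvari's edit-distance-based seqsim
sim = SequenceMatcher(None, ins_a, ins_b).ratio()
print(round(sim, 2))  # 0.99 - well above even collapse's 0.95 default
```

So a strict similarity threshold still collapses near-identical alleles like these; what it refuses to do is merge genuinely divergent sequences that only happen to overlap positionally.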
-
Wanted to follow up: results from the first of the projects I mentioned, which might help researchers set SV QC metric expectations (e.g. SV count, het/hom ratio), are in this paper: https://www.biorxiv.org/content/10.1101/2024.10.22.619642v1
-
Hi,
I am thinking about the idea behind the bench and collapse functions. Let's say there are two SVs in a dataset; if they are deemed the same, then they are redundant and should be collapsed into one. In that case, shouldn't the bench and collapse parameters, such as pctsim/pctseq, pctsize, etc., be set the same? I noticed that these parameters default to 0.7 for bench and 0.95 for collapse. Would it be better to align them?
Best regards,