Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make InfoFieldAnnotation/GenotypeAnnotation into interfaces #7041

Merged
merged 1 commit into from
Feb 23, 2021

Conversation

bbimber
Copy link
Contributor

@bbimber bbimber commented Jan 14, 2021

@jamesemery This is related to #6930

The background is that PedigreeAnnotation is special-cased in GATK, which provides better command-line argument validation, and it will also be used to inject the PedigreeFile, create the SampleDB, etc. This is currently a subclass of InfoFieldAnnotation, and therefore cant be used for GenotypeFieldAnnotation. There shouldnt be this limitation, and this PR tried to address that.

The way I propose to do this is to make InfoFieldAnnotation and GenotypeAnnotation into interfaces, with default methods where possible. The existing subclasses all switch from extending them to implementing them. This is generally a trivial difference, but it touches a lot of classes.

All existing classes that previously extended PedigreeAnnotation (formerly a subclass of InfoFieldAnnotation), now extend PedigreeAnnotation and implement InfoFieldAnnotation. This is a minimal difference, but it makes it possible for future classes to extend PedigreeAnnotation, and then implement GenotypeAnnotation.

The only part this includes that I didnt like was the fact that the existing InfoFieldAnnotation overrides toString(), which I cant do in an interface. So I created AbstractInfoFieldAnnotation, and all existing InfoFieldAnnotation classes extend that. It's not currently clear to me how critical that override of toString() is. The weakness of this PR is that classes outside the GATK project that currently extend InfoFieldAnnotation would not inherit this. I could keep InfoFieldAnnotation a class as-is, and make a differently named interface behind it.

Copy link
Collaborator

@jamesemery jamesemery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a particularly strong opinion on these changes. It looks like there was virtually no method inheritance either way so the difference seems slight. @cmnbroad do you have any opinions or do you think we can merge this?

Copy link
Collaborator

@cmnbroad cmnbroad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I think I understand the motivation, I don't think we should take this change, at least in its current form. I'm guessing that the existing toString override in VariantAnnotation was put in there to provide a canonical way to get an annotation's name from the annotation class itself, but thats just a guess. I see a lot of code that calls getClass().getSimpleName() directly on annotations, i.e., GATKAnnotationPluginDescriptor, HaplotypeCallerEngine, etc.

@ldgauthier any thoughts on any of this - do you know why that toString() override is there ? Is it common to rely on toString() to get the name of an annotation ?

If not, can we just get rid of the toString() override ? It would eliminate the need for AbstractInfoFieldAnnotation.

On the other hand, if we need to keep the override because code relies on it, we'd need to ensure that all subclasses of VariantAnnotation, i.e., GenotypeAnnotation implement it, since as the PR currently stands, that no longer overrides toString().

Ideally, the annotation interface, or VariantAnnotation at least, would define a method for getting the canonical name...

package org.broadinstitute.hellbender.tools.walkers.annotator;

/**
* Superclass of all variant annotations.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its not all variant annotations, just info field annotations.

@Override
public String toString() {
return getClass().getSimpleName();
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing this method means that other subclasses, such as GenotypeAnnotation will no longer have this override as well. See my overall comments.

@bbimber
Copy link
Contributor Author

bbimber commented Jan 25, 2021

Reasonable point. When I get to my computer I'll check this out, but what about adding an explicit new method on the interface for getSimpleName (name TBD) that serves this function, with a default implementation. Then we no longer rely on the override of toString?

@bbimber
Copy link
Contributor Author

bbimber commented Jan 26, 2021

@cmnbroad After looking more, I have two comments:

  1. I agree that override of toString() was awkward. @ldgauthier or someone else at GATK might have more comments, but I think it can be dropped.

  2. Some kind of getSimpleName() method directly on the interface, defaulting to getClass().getSimpleName() would be reasonable; however, it needs to be on Annotation, not VariantAnnotation, to be useful.

In this updated PR, I remoted the override of toString(), removed AbstractInfoFieldAnnotation, and left just the two interfaces: InfoFieldAnnotation and GenotypeAnnotation. The results in a relatively minor overall change.

@bbimber
Copy link
Contributor Author

bbimber commented Feb 5, 2021

@cmnbroad Any further thoughts on this? when removing the toString() override (which did not appear to do anything so far as I could see), this Class -> Interface change ends up being fairly minor.

@cmnbroad
Copy link
Collaborator

cmnbroad commented Feb 8, 2021

@ldgauthier - Any thoughts on whether removing toString() from VariantAnnotation is safe - see comments above ? If not, we plan to remove it since there doesn't appear to be any code in GATK that relies on it (at least, no tests fail).

@bbimber
Copy link
Contributor Author

bbimber commented Feb 17, 2021

Hello @cmnbroad @ldgauthier - just following up here. It seems like we have passing tests and general approval of this change. I believe the only question is whether removing toString() from VariantAnnotation is safe. As noted above it doesnt seem to be needed.

@cmnbroad
Copy link
Collaborator

@ldgauthier One more try - any objections to this change ? If not we're going to remove the toString default.

Copy link
Collaborator

@cmnbroad cmnbroad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now. @bbimber can you rebase on master and then I'll merge this when the tests pass.

…lows PedigreeAnnotation to be more abstract, and would allow for both InfoFieldAnnotations and GenotypeAnnotations to use it.

Remove toString() override

Missed with last commit

Remove default method
@bbimber
Copy link
Contributor Author

bbimber commented Feb 23, 2021

@cmnbroad ok, squash/rebase done

@bbimber
Copy link
Contributor Author

bbimber commented Feb 23, 2021

@cmnbroad tests seem to be passing

@cmnbroad cmnbroad merged commit 66d1614 into broadinstitute:master Feb 23, 2021
@bbimber bbimber deleted the annotations branch February 23, 2021 20:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants