-
Notifications
You must be signed in to change notification settings - Fork 110
Do we need the GA prefix on all objects? #87
Comments
One thing to consider. Having a prefix on each object reduces the chance of namespace collisions. Developers will likely build on more than just the GA4GH libraries and use others as well. Those other libraries may have an object called e.g. ReadAlignment. In that case, you will need to differentiate between the two using the full-path. |
-0, the advantage of removing the GA prefix is that you remove redundancy and simplify the class name, but the disadvantage is that you can run into overlapping class names from multiple packages the implement similar functionality (e.g., net.sourceforge.samtools.SAMRecord vs. org.bdgenomics.formats.ADAMRecord). |
I see. In proto you have to fully qualify anything not in the same package. So you'd have to type out org.ga4gh.x no matter what. |
It depends on how you're importing it in each language; if you imported the records in C++, you'd just namespace them, or in Scala you'd just rename on import:
It's not an insurmountable issue, but it's a bit frustrating to have your interop code be inconsistent with the rest of your codebase because you need to use full namespace paths/etc to refer to classes. |
btw - some of the API providers will still be outputting json - will you be able to load those into your Avro objects easily? |
@cassiedoll I believe you can create Avro directly from JSON in both Java and C++. I'm not sure about the other avro language bindings. |
excellent - thanks frank! |
-1 because of 1) auto-generated code and 2) name collisions. |
Avoiding namespace collisions is important, but wouldn't this be relatively rare? If the common case we expect is not to have conflicting namespaces in end-user code, then it seems a bit heavy-handed to force the GA prefix on everyone. How hard is it to work around namespace conflicts in any case? In Python, it would be easy to sort out using "import X as Y" or whatever. |
IMHO adding prefixes to the names when there is a namespace collision will On Mon, Jul 7, 2014 at 1:30 AM, Jerome Kelleher notifications@github.com
|
Maybe I'm just too used to Google's style - we never prefix anything :) |
IMHO collisions should rarely happen with well-written code. In any case, there is @namespace() if one would like specify the namespace. Just my $0.02 :) |
@pgrosu conceptually, I agree. However, if we're exporting data into/out of the API, we may need to interoperate with other libraries that also use the *Record name (e.g., net.sf.samtools.SAMRecord, org.bdgenomics.format.avro.ADAMRecord). |
Thanks @fnothaft for the clarification - now that makes more sense regarding the direction we want to head toward. |
@fnothaft wouldn't it be better to fully qualify the class references in the (surely relatively rare?) cases when there is a namespace conflict, than to force all code to use a prefix? I think the common case should be as easy and intuitive as possible - I certainly found the GA prefixes confusing and unnecessary when I first looked at the API. Perhaps I'm misunderstanding what the common case will be though? |
For what it is worth, I am also in favor of this (although I don't have voting authority); see e.g. pull request #94. |
+1 -- I think the benefits of a more readable API for everyone outweigh the costs of occasionally, in some programming contexts, needing to do explicit namespacing. |
+1 -- agree benefits (far) outweigh the costs. This is not C, we have well On Wed, Jul 16, 2014 at 6:28 PM, David Glazer notifications@github.com
|
We have no consensus here, so I'm closing this issue. |
After discussing this with @fnothaft, @massie and @vadimzalunin it seems we might be a bit closer to consensus on this - perhaps we could have another vote? It would be good to discuss in person in San Diego if we are still divided on the topic. I'm +1 for removing the prefix. |
I would be OK with removing the prefix if that is the consensus. So a cautious +1. Richard On 15 Oct 2014, at 01:41, Jerome Kelleher notifications@github.com wrote:
The Wellcome Trust Sanger Institute is operated by Genome Research |
@jeromekelleher - Wait, you convinced @fnothaft and @massie regarding this? :) So if we have multiple libraries with the same object names, then how do we differentiate between them? |
I mildly prefer to keep the GA prefix when Avro itself does not have a concept of namespace (if I am right). When I read code, the prefix immediately tells me the source of the APIs, which to me is useful. Nonetheless, I don't mind if the consensus is to remove the prefix. -0 |
Conceptually, I like removing the prefix. There is a concerned |
There is some advantage for keeping it in metadata, where the GA will unanimously point to our interpretation of value range in this context (see the lengthy GAsex related exchanges) I understand that this is different for reads/file formats, where the contributions are meant even more general (GA or not). Michael
|
+1
We've de-prefixed ADAM over the past few months: bigdatagenomics/adam#418, bigdatagenomics/adam#342... |
+1 for removing most prefixes, but I do agree that some of the metadata records can stand to keep it so we stay out of developing and clashing with Ontologies. |
0.1 and 0.5 proved prefixes less useful the I anticipated before, so changing mine -1 to +0. |
It seems that a lot of people are in favour of removing the GA prefix, and no one is strongly against the idea. The prefix is certainly useful in some situations, but I think that our overriding concern should be for end-users writing client code. In this case, we need to ensure that the API is as easy as possible to learn and understand, and the GA prefix is a major obstacle to this in my view. @mbaudis and @delagoya - I don't quite follow why metadata is different. Why is |
@jeromekelleher, I think it comes down to the idea that if a set of all possible attributes for a class in GA4GH becomes a superset - and its has specific properties in the way it was defined - where the majority of users might only use a subset of the same class. Then the superset sort of becomes its own class. I think this discussion ( https://github.com/cassiedoll/schemas/commit/51cd6ece5c83fc8076959a3be997190303718170#commitcomment-7663742 ) - as one of several that @mbaudis alluded to - helps to illustrate that. I think with some classes, we're working on first defining the class, attributes and their interpretation (including their relation to other classes), in order to collect all the variations under those definitions, which strives to behave as sets of controlled vocabularies. |
@pgrosu Well said. While the domain points to the technical definition of the attribute, the prefix makes it more obvious that the contained values may need some consideration. That said, the "GA" here is of a more psychological nature and technically not necessary if the context is provided through the org.ga4gh domain. I am not against removing it; there always will be trade-offs ... And we still can change the names of the attributes (org.ga4gh.geneticSexAsWeSeeIt) ;-) |
+1
|
@mbaudis, @delagoya, OK, this seems sensible. Perhaps we should make a PR without all the GAs and see how things look? We can add some back if it looks like we are trying to be too all-encompassing with the definitions. This would be a pretty disruptive change, so perhaps we should wait until some of the outstanding PRs are merged? |
Let's definitely make a PR - it looks like everybody is now on board. I think we're always going to have conflicting PRs, so we don't have to really wait. Whoever has free time can pick this up :) |
Closing this since a fix has been merged. |
All of our Avro files are namespaced under ga4gh, so a read alignment is currently
org.ga4gh.GAReadAlignment
So the GA seems a little redundant from a packaging perspective, and is making all of the code/docs a bit harder to read (and type :)
What does everyone think about removing GA from everything?
We'd then have
org.ga4gh.ReadAlignment
The text was updated successfully, but these errors were encountered: