Do we need the GA prefix on all objects? #87

cassiedoll · 2014-07-02T17:56:46Z

All of our Avro files are namespaced under ga4gh, so a read alignment is currently org.ga4gh.GAReadAlignment

So the GA seems a little redundant from a packaging perspective, and is making all of the code/docs a bit harder to read (and type :)

What does everyone think about removing GA from everything?
We'd then have org.ga4gh.ReadAlignment

The text was updated successfully, but these errors were encountered:

massie · 2014-07-02T18:01:24Z

One thing to consider. Having a prefix on each object reduces the chance of namespace collisions.

Developers will likely build on more than just the GA4GH libraries and use others as well. Those other libraries may have an object called e.g. ReadAlignment. In that case, you will need to differentiate between the two using the full-path.

fnothaft · 2014-07-02T18:03:23Z

-0, the advantage of removing the GA prefix is that you remove redundancy and simplify the class name, but the disadvantage is that you can run into overlapping class names from multiple packages the implement similar functionality (e.g., net.sourceforge.samtools.SAMRecord vs. org.bdgenomics.formats.ADAMRecord).

cassiedoll · 2014-07-02T18:13:03Z

I see. In proto you have to fully qualify anything not in the same package. So you'd have to type out org.ga4gh.x no matter what.

fnothaft · 2014-07-02T18:15:16Z

It depends on how you're importing it in each language; if you imported the records in C++, you'd just namespace them, or in Scala you'd just rename on import:

import org.ga4gh.{AlignedRead => GAAlignedRead}
import someone.elses.api.{AlignedRead => OtherAlignedRead}

It's not an insurmountable issue, but it's a bit frustrating to have your interop code be inconsistent with the rest of your codebase because you need to use full namespace paths/etc to refer to classes.

cassiedoll · 2014-07-02T18:39:45Z

btw - some of the API providers will still be outputting json - will you be able to load those into your Avro objects easily?

fnothaft · 2014-07-02T20:34:48Z

@cassiedoll I believe you can create Avro directly from JSON in both Java and C++. I'm not sure about the other avro language bindings.

cassiedoll · 2014-07-02T21:28:12Z

excellent - thanks frank!

vadimzalunin · 2014-07-03T08:25:28Z

-1 because of 1) auto-generated code and 2) name collisions.

jeromekelleher · 2014-07-07T08:30:57Z

Avoiding namespace collisions is important, but wouldn't this be relatively rare? If the common case we expect is not to have conflicting namespaces in end-user code, then it seems a bit heavy-handed to force the GA prefix on everyone. How hard is it to work around namespace conflicts in any case? In Python, it would be easy to sort out using "import X as Y" or whatever.

adamnovak · 2014-07-07T17:06:40Z

IMHO adding prefixes to the names when there is a namespace collision will
probably hinder code readability. It's relatively hard to grasp that what's
a Record everywhere else is a GARecord in this particular file, and it's
worse if some files use Record for our record and some other files in the
same project use Record for the conflicting record type.

On Mon, Jul 7, 2014 at 1:30 AM, Jerome Kelleher notifications@github.com
wrote:

Avoiding namespace collisions is important, but wouldn't this be
relatively rare? If the common case we expect is not to have conflicting
namespaces in end-user code, then it seems a bit heavy-handed to force the
GA prefix on everyone. How hard is it to work around namespace conflicts in
any case? In Python, it would be easy to sort out using "import X as Y" or
whatever.

—
Reply to this email directly or view it on GitHub
#87 (comment).

cassiedoll · 2014-07-07T17:15:58Z

Maybe I'm just too used to Google's style - we never prefix anything :)

pgrosu · 2014-07-07T17:26:19Z

IMHO collisions should rarely happen with well-written code. In any case, there is @namespace() if one would like specify the namespace. Just my $0.02 :)

fnothaft · 2014-07-07T17:35:07Z

IMHO collisions should rarely happen with well-written code. In any case, there is @namespace() if one would like specify the namespace. Just my $0.02 :)

@pgrosu conceptually, I agree. However, if we're exporting data into/out of the API, we may need to interoperate with other libraries that also use the *Record name (e.g., net.sf.samtools.SAMRecord, org.bdgenomics.format.avro.ADAMRecord).

pgrosu · 2014-07-07T18:50:01Z

Thanks @fnothaft for the clarification - now that makes more sense regarding the direction we want to head toward.

jeromekelleher · 2014-07-08T08:21:41Z

@fnothaft wouldn't it be better to fully qualify the class references in the (surely relatively rare?) cases when there is a namespace conflict, than to force all code to use a prefix? I think the common case should be as easy and intuitive as possible - I certainly found the GA prefixes confusing and unnecessary when I first looked at the API. Perhaps I'm misunderstanding what the common case will be though?

pd3 · 2014-07-08T08:34:29Z

@jeromekelleher +1

heuermh · 2014-07-16T16:41:43Z

For what it is worth, I am also in favor of this (although I don't have voting authority); see e.g. pull request #94.

dglazer · 2014-07-16T17:28:57Z

+1 -- I think the benefits of a more readable API for everyone outweigh the costs of occasionally, in some programming contexts, needing to do explicit namespacing.

benedictpaten · 2014-07-17T10:19:28Z

+1 -- agree benefits (far) outweigh the costs. This is not C, we have well
defined namespaces!

On Wed, Jul 16, 2014 at 6:28 PM, David Glazer notifications@github.com
wrote:

+1 -- I think the benefits of a more readable API for everyone outweigh
the costs of occasionally, in some programming contexts, needing to do
explicit namespacing.

—
Reply to this email directly or view it on GitHub
#87 (comment).

cassiedoll · 2014-08-20T17:18:31Z

We have no consensus here, so I'm closing this issue.
(We can always revisit it later if we like)

jeromekelleher · 2014-10-15T00:41:43Z

After discussing this with @fnothaft, @massie and @vadimzalunin it seems we might be a bit closer to consensus on this - perhaps we could have another vote? It would be good to discuss in person in San Diego if we are still divided on the topic.

I'm +1 for removing the prefix.

richarddurbin · 2014-10-15T10:54:38Z

I would be OK with removing the prefix if that is the consensus. So a cautious +1.

Richard

On 15 Oct 2014, at 01:41, Jerome Kelleher notifications@github.com wrote:

After discussing this with @fnothaft, @massie and @vadimzalunin it seems we might be a bit closer to consensus on this - perhaps we could have another vote? It would be good to discuss in person in San Diego if we are still divided on the topic.

I'm +1 for removing the prefix.

—
Reply to this email directly or view it on GitHub.

The Wellcome Trust Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.

pgrosu · 2014-10-15T13:46:19Z

@jeromekelleher - Wait, you convinced @fnothaft and @massie regarding this? :) So if we have multiple libraries with the same object names, then how do we differentiate between them?

lh3 · 2014-10-15T13:54:54Z

I mildly prefer to keep the GA prefix when Avro itself does not have a concept of namespace (if I am right). When I read code, the prefix immediately tells me the source of the APIs, which to me is useful. Nonetheless, I don't mind if the consensus is to remove the prefix. -0

diekhans · 2014-10-15T14:00:43Z

Conceptually, I like removing the prefix. There is a concerned
about how this will impact code generation for languages that
don't support namespaces, such as C.

mbaudis · 2014-10-15T14:12:14Z

There is some advantage for keeping it in metadata, where the GA will unanimously point to our interpretation of value range in this context (see the lengthy GAsex related exchanges)

I understand that this is different for reads/file formats, where the contributions are meant even more general (GA or not).

Michael

On 15 Oct 2014, at 15:54, Heng Li notifications@github.com wrote:

I mildly prefer to keep the GA prefix when Avro itself does not have a concept of namespace (if I am right). When I read code, the prefix immediately tells me the source of the APIs, which to me is useful. Nonetheless, I don't mind if the consensus is to remove the prefix. -0

—
Reply to this email directly or view it on GitHub.

fnothaft · 2014-10-15T14:23:34Z

+1

@pgrosu

@jeromekelleher - Wait, you convinced @fnothaft and @massie regarding this? :) So if we have multiple libraries with the same object names, then how do we differentiate between them?

We've de-prefixed ADAM over the past few months: bigdatagenomics/adam#418, bigdatagenomics/adam#342...

delagoya · 2014-10-15T19:05:10Z

+1 for removing most prefixes, but I do agree that some of the metadata records can stand to keep it so we stay out of developing and clashing with Ontologies.

vadimzalunin · 2014-10-18T18:16:15Z

0.1 and 0.5 proved prefixes less useful the I anticipated before, so changing mine -1 to +0.

jeromekelleher · 2014-10-22T09:10:28Z

It seems that a lot of people are in favour of removing the GA prefix, and no one is strongly against the idea. The prefix is certainly useful in some situations, but I think that our overriding concern should be for end-users writing client code. In this case, we need to ensure that the API is as easy as possible to learn and understand, and the GA prefix is a major obstacle to this in my view.

@mbaudis and @delagoya - I don't quite follow why metadata is different. Why is org.ga4gh.GASex better than org.ga4gh.Sex (for example)?

pgrosu · 2014-10-22T10:57:11Z

@jeromekelleher, I think it comes down to the idea that if a set of all possible attributes for a class in GA4GH becomes a superset - and its has specific properties in the way it was defined - where the majority of users might only use a subset of the same class. Then the superset sort of becomes its own class.

I think this discussion ( https://github.com/cassiedoll/schemas/commit/51cd6ece5c83fc8076959a3be997190303718170#commitcomment-7663742 ) - as one of several that @mbaudis alluded to - helps to illustrate that. I think with some classes, we're working on first defining the class, attributes and their interpretation (including their relation to other classes), in order to collect all the variations under those definitions, which strives to behave as sets of controlled vocabularies.

mbaudis · 2014-10-22T12:27:49Z

@pgrosu Well said. While the domain points to the technical definition of the attribute, the prefix makes it more obvious that the contained values may need some consideration.

That said, the "GA" here is of a more psychological nature and technically not necessary if the context is provided through the org.ga4gh domain. I am not against removing it; there always will be trade-offs ... And we still can change the names of the attributes (org.ga4gh.geneticSexAsWeSeeIt) ;-)

delagoya · 2014-10-22T13:46:19Z

+1
Things that can be misunderstood as "GA4GH is developing their own Ontology" should be prefixed to make it clear that this we are not attempting to do so.

On Oct 22, 2014, at 3:57 AM, Paul Grosu notifications@github.com wrote:

@jeromekelleher, I think it comes down to the idea that if a set of all possible attributes for a class in GA4GH becomes a superset - and its has specific properties in the way it was defined - where the majority of users might only use a subset of the same class. Then the superset sort of becomes its own class.

I think this discussion ( cassiedoll@51cd6ec#commitcomment-7663742 ) - as one of several that @mbaudis alluded to - helps to illustrate that. I think with some classes, we're working on first defining the class, attributes and their interpretation (including their relation to other classes), in order to collect all the variations under those definitions, which strives to behave as sets of controlled vocabularies.

—
Reply to this email directly or view it on GitHub.

pgrosu · 2014-10-22T20:25:55Z

+1, @mbaudis, Thank you and I agree :)

+1, @delagoya, I think we should be fine as a long as we mention that the focus of GA4GH is to enable a standard for sharing sequencing and associated clinical information (meta)data, which requires accommodating a large number of possibilities.

jeromekelleher · 2014-10-23T08:58:37Z

@mbaudis, @delagoya, OK, this seems sensible. Perhaps we should make a PR without all the GAs and see how things look? We can add some back if it looks like we are trying to be too all-encompassing with the definitions.

This would be a pretty disruptive change, so perhaps we should wait until some of the outstanding PRs are merged?

cassiedoll · 2014-10-28T21:15:42Z

Let's definitely make a PR - it looks like everybody is now on board. I think we're always going to have conflicting PRs, so we don't have to really wait. Whoever has free time can pick this up :)

adamnovak · 2014-11-05T18:30:52Z

Closing this since a fix has been merged.

Removed duplicate test, changed behavior of another to look for empty list in response.

Fix for ga4gh#87

cassiedoll added ReadTaskTeam labels Jul 2, 2014

jeromekelleher mentioned this issue Jul 10, 2014

Remove GA prefix from type names #94

Closed

cassiedoll mentioned this issue Jul 15, 2014

Add missing GA prefix to variant methods #98

Merged

cassiedoll added this to the vNext milestone Jul 23, 2014

cassiedoll closed this as completed Aug 20, 2014

jeromekelleher reopened this Oct 15, 2014

jeromekelleher mentioned this issue Oct 30, 2014

Removed GA prefix globally. #172

Merged

adamnovak closed this as completed Nov 5, 2014

dcolligan pushed a commit to dcolligan/ga4gh-schemas that referenced this issue Jul 20, 2016

Fix for ga4gh#87.

4634ddb

Removed duplicate test, changed behavior of another to look for empty list in response.

dcolligan pushed a commit to dcolligan/ga4gh-schemas that referenced this issue Jul 20, 2016

Merge pull request ga4gh#93 from hjellinek/compliance_redux

18698a3

Fix for ga4gh#87

Do we need the GA prefix on all objects? #87

Do we need the GA prefix on all objects? #87

Comments

cassiedoll commented Jul 2, 2014

massie commented Jul 2, 2014

fnothaft commented Jul 2, 2014

cassiedoll commented Jul 2, 2014

fnothaft commented Jul 2, 2014

cassiedoll commented Jul 2, 2014

fnothaft commented Jul 2, 2014

cassiedoll commented Jul 2, 2014

vadimzalunin commented Jul 3, 2014

jeromekelleher commented Jul 7, 2014

adamnovak commented Jul 7, 2014

cassiedoll commented Jul 7, 2014

pgrosu commented Jul 7, 2014

fnothaft commented Jul 7, 2014

pgrosu commented Jul 7, 2014

jeromekelleher commented Jul 8, 2014

pd3 commented Jul 8, 2014

heuermh commented Jul 16, 2014

dglazer commented Jul 16, 2014

benedictpaten commented Jul 17, 2014

cassiedoll commented Aug 20, 2014

jeromekelleher commented Oct 15, 2014

richarddurbin commented Oct 15, 2014

pgrosu commented Oct 15, 2014

lh3 commented Oct 15, 2014

diekhans commented Oct 15, 2014

mbaudis commented Oct 15, 2014

fnothaft commented Oct 15, 2014

delagoya commented Oct 15, 2014

vadimzalunin commented Oct 18, 2014

jeromekelleher commented Oct 22, 2014

pgrosu commented Oct 22, 2014

mbaudis commented Oct 22, 2014

delagoya commented Oct 22, 2014

pgrosu commented Oct 22, 2014

jeromekelleher commented Oct 23, 2014

cassiedoll commented Oct 28, 2014

adamnovak commented Nov 5, 2014