Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicitly declare relations between 'elements', 'nelements' and 'elements_ratios' #349

Merged

Conversation

merkys
Copy link
Member

@merkys merkys commented Jan 4, 2021

Reading the specification once more I find the relations between elements, nelements and elements_ratios lacking explicit declaration:

  1. Lists of elements and elements_ratios are correlated;
  2. nelements is equal to lengths of elements and elements_ratios.

These relations can be derived from the examples, or the general knowledge, but I believe it would be clearer to have them written explicitly.

@merkys merkys added topic/property-standardization The specification of the precise data representation of properties and entries type/minor-fix Obvious fix (e.g. typos, formatting, ...). Does not require broad discussion. PR/ready-for-review Add this flag if you are the author of the PR and you want it to be reviewed. Remove it when editing labels Jan 4, 2021
ml-evs
ml-evs previously approved these changes Jan 5, 2021
Copy link
Member

@ml-evs ml-evs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @merkys, I think this is a useful clarification.

@ml-evs
Copy link
Member

ml-evs commented Feb 22, 2021

Did anyone else want to take a look at this before 1.0.1? @rartino/@CasperWA?

Copy link
Member

@CasperWA CasperWA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two things:

  • I am not completely comfortable with the "correlated" sentences, as the "correlation" is not defined.
  • I think we should make clear that the MUST requirement between properties are only valid as long as both properties are given, i.e., not null.
    This could also be done via an introductory sentence at the top of the section, stating something like "all inter-property requirements are only valid as long as both properties are provided, i.e., neither have an unknown value"?

optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
@ml-evs ml-evs added this to the 1.0.1 release milestone Feb 26, 2021
@merkys
Copy link
Member Author

merkys commented Feb 26, 2021

  • I am not completely comfortable with the "correlated" sentences, as the "correlation" is not defined.

As @ml-evs wrote, term "correlation" was already in use in the specification of v1.0.0. In addition to use in section for filtering, it is also mentioned in Structures entries section to emphasize such relation between various lists.

The situation could be possibly improved by describing "correlated lists" in the definition of terms section. Maybe it could also cite Wikipedia page on parallel array which describes the idea pretty well IMO. Maybe we could even switch to use "parallel lists" if this term is more intuitive/used more widely.

  • I think we should make clear that the MUST requirement between properties are only valid as long as both properties are given, i.e., not null.
    This could also be done via an introductory sentence at the top of the section, stating something like "all inter-property requirements are only valid as long as both properties are provided, i.e., neither have an unknown value"?

I agree, this would bring some more clarity to the specification. Of course, first we have to make sure whether this is a truth global to the specification :)

Copy link
Member

@CasperWA CasperWA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So nelements only has to be equal length but not refer to the other fields with the same order and elements and elements_ratios does not have to have equal length to the others? 😅

@merkys
Copy link
Member Author

merkys commented Mar 3, 2021

So nelements only has to be equal length but not refer to the other fields with the same order and elements and elements_ratios does not have to have equal length to the others? sweat_smile

Oops, will change this as well!

rartino
rartino previously approved these changes Mar 16, 2021
Copy link
Contributor

@rartino rartino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this version; this way of expressing the "correlation" criteria is fully clear to me.

@CasperWA
Copy link
Member

So nelements only has to be equal length but not refer to the other fields with the same order and elements and elements_ratios does not have to have equal length to the others? sweat_smile

Oops, will change this as well!

Did you implement these changes yet? I don't think I've seen it yet. I can do it though? :)

@merkys
Copy link
Member Author

merkys commented Mar 17, 2021

So nelements only has to be equal length but not refer to the other fields with the same order and elements and elements_ratios does not have to have equal length to the others? sweat_smile

Oops, will change this as well!

Did you implement these changes yet? I don't think I've seen it yet. I can do it though? :)

I just fixed them. Sorry for the delay.

@merkys merkys requested review from rartino and CasperWA March 17, 2021 06:20
@merkys merkys requested a review from ml-evs March 17, 2021 06:20
optimade.rst Outdated
@@ -1821,6 +1821,7 @@ elements
- **Query**: MUST be a queryable property with support for all mandatory filter features.
- The strings are the chemical symbols, i.e., either a single uppercase letter or an uppercase letter followed by a number of lowercase letters.
- The order MUST be alphabetical.
- MUST be of equal length as `elements\_ratios` and refer to the elements in the same order as `elements\_ratios`_, if the latter is provided.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for doing this... Would this be shorter and even more clear (?) written as this:

Suggested change
- MUST be of equal length as `elements\_ratios` and refer to the elements in the same order as `elements\_ratios`_, if the latter is provided.
- MUST refer to the same elements in the same order as `elements\_ratios`, if the latter is provided.

Copy link
Member

@CasperWA CasperWA Mar 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is close to what he had before (i.e., before 263a303), where I asked him to add the length part as well, simply to be as explicit as possible. However, I guess it could work this way as well when same has been added. I am OK with either, but would personally prefer the more explicit edition of the two here - so I'd leave this up to @merkys to choose your personal favourite :)

Copy link
Contributor

@rartino rartino Mar 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No... @CasperWA, I think you miss the formal difference of the addition of an extra "same" in the sentence: "MUST refer to the elements in the same order as elements_ratios" vs. "MUST refer to the same elements in the same order as elements_ratios".

I'm arguing that latter implies that the lengths must be equal and is possibly even stronger. (Just saying that the lengths must be equal and the order the same could allow referring to the same element twice and leaving another one out...)

optimade.rst Outdated Show resolved Hide resolved
@merkys merkys requested a review from rartino May 6, 2021 13:58
Copy link
Member

@ml-evs ml-evs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have just fixed the conflict that I added inadvertently in #358, and I am happy with this wording.

@shyamd
Copy link

shyamd commented Jun 7, 2021

I'll throw a wrench into the discussion.
nelements should equal len(elements), this is fine.
In an experiment with known impurities, a DB might want to list those impurities but not include them in the element_ratios, so then the list length would not match between elements and element_ratios

@ml-evs
Copy link
Member

ml-evs commented Jun 7, 2021

I'll throw a wrench into the discussion.
nelements should equal len(elements), this is fine.
In an experiment with known impurities, a DB might want to list those impurities but not include them in the element_ratios, so then the list length would not match between elements and element_ratios

This is a good point that could do with some clarification (I would hope in another PR ;)). The statistical aspects of compositions are handled by species and species_at_sites, but there is currently no linkage from species_at_sites to elements, e.g. we do not require that set(symbol for species in entry.attributes.species for symbol in species.chemical_symbols) == entry.attributes.elements. Of course, impurity could mean 10% or it could mean 0.00001%...

I would suggest that given the existing interpretation (or lack thereof) of the specification, we should still merge this pending further discussion. This requirement has been implicit (and is coded up in the python-tools models, for example); if we decide to relax it based on this impurity use-case then that will not be a breaking change.

Copy link
Member

@CasperWA CasperWA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this in. It's been discussed extensively and I agree with the current edition. Cheers @merkys ! :)

Copy link
Contributor

@rartino rartino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, didn't we merge this yet? I approve the present version.

@ml-evs ml-evs merged commit e016ff9 into Materials-Consortia:develop Jun 9, 2021
@merkys merkys deleted the element-properties-correlation branch June 10, 2021 04:15
@merkys
Copy link
Member Author

merkys commented Jun 10, 2021

Thanks for approving and merging this!

@ml-evs
Copy link
Member

ml-evs commented Jun 10, 2021

I have just opened #361 to carry on @shyamd's discussion point

rartino added a commit that referenced this pull request Jul 7, 2021
…ments_ratios' (#349)

* Declaring relations between 'elements', 'nelements' and 'elements_ratios'.

* Update optimade.rst

Co-authored-by: Casper Welzel Andersen <43357585+CasperWA@users.noreply.github.com>

* elements_ratios: replacing "correlated" with more intuitive formulation.

* Describing the relation between `elements` and `elements_ratios`.

* Update optimade.rst

Co-authored-by: Rickard Armiento <gitcommits@armiento.net>

* Describing the relation between `elements` and `elements_ratios`.

* Explicitly listing the requirements on both length and order.

* Update optimade.rst

Co-authored-by: Rickard Armiento <gitcommits@armiento.net>

* Rewording according to @rartino's suggestion.

* Reverting accidentally modified definition of `nelements`.

Co-authored-by: Casper Welzel Andersen <43357585+CasperWA@users.noreply.github.com>
Co-authored-by: Rickard Armiento <gitcommits@armiento.net>
Co-authored-by: Matthew Evans <7916000+ml-evs@users.noreply.github.com>
@rartino rartino mentioned this pull request Jul 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR/ready-for-review Add this flag if you are the author of the PR and you want it to be reviewed. Remove it when editing topic/property-standardization The specification of the precise data representation of properties and entries type/minor-fix Obvious fix (e.g. typos, formatting, ...). Does not require broad discussion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants