Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow user to explicitly set the attribute "Conventions" #2457

Closed
wants to merge 6 commits into from

Conversation

morwenna-01
Copy link
Contributor

iris.save, by default, makes CF-1.5 netcdf files.
I need CF-1.6 files. I couldn't find an option to do this, so I have made the following changes to allow it.
I'm new at this, and very happy to learn, so please help me.

@pelson
Copy link
Member

pelson commented Mar 23, 2017

Ping @bjlittle - looks reasonable to me. 👍

@pelson
Copy link
Member

pelson commented Mar 23, 2017

Incidentally, should you be eager to work around the version that is written by default, the alternative would be to:

import iris.fileformats.netcdf
iris.fileformats.netcdf.CF_CONVENTIONS_VERSION = 'CF-1.6'

Before doing an iris.save.

@morwenna-01
Copy link
Contributor Author

Phil, your solution works fine. I should have asked you first!
The only advantage of mine is that it looks to see what the "Conventions" attribute is already.
thanks.

@pelson
Copy link
Member

pelson commented Mar 23, 2017

I like your approach because it preserves the conventions from the input. Let's get some feedback from others, but in principle I'm 👍 with preserving the Conventions info.

@bjlittle bjlittle self-assigned this Mar 23, 2017
@bjlittle
Copy link
Member

Hi @morwenna-01! Awesome, and thanks for taking the time to push this PR! It's very much appreciated 😄

There's a little bit of impact on the CI tests due to your changes. So, rather that talk you through the fixes that you need to make, what might be easier is for me to push a PR with those recommended changes onto your CF-1.6 branch. We can then discuss my suggested changes there, and if you're in agreement and happy with my suggestions, merge them on your branch and that'll help get this PR across the line and merged.

I'll do that now ...

@bjlittle
Copy link
Member

The only contentious side-effect of this PR is that by using the existing Conventions specified in the cube.attributes is that the generated netCDF file may now not be compliant to that standard i.e. a cube that has a Conventions = "CF-1.7" is saved to file but the actual generated netCDF output is definitely not compliant to that standard.

We're not going to actually compliance check the netCDF output generated by iris to ensure that it matches the user (input) specified Conventions ... but issuing a warning to state that the default iris Conventions is being clobbered by the cube version of Conventions may be the happy compromise.

Other @SciTools/iris-devs may have an opinion on this ...

@rhattersley
Copy link
Member

issuing a warning to state that the default iris Conventions is being clobbered by the cube version of Conventions may be the happy compromise.

Aren't warnings ideally avoidable? i.e. A warning is an indication that you're doing something the "wrong" way, and that you should modify your code to acheive the same end via some other "correct" route? What would the "correct" alternative be in this case?

@bjlittle
Copy link
Member

@rhattersley Is that a loaded question? 😜

@rhattersley
Copy link
Member

No, that wasn't the idea! Sorry if it came across as such! I'm conscious that I've not kept up to date with changes for 9+ months, so I genuinely don't know all the current/planned options!

@bjlittle
Copy link
Member

@rhattersley Just teasing ... I thought that you had an ace up your sleeve, as to the "correct" alternative.

I agree that issuing a warning smacks of doing something "wrong". The user can always circumvent the whole issue by clobbering the ...

iris.fileformats.netcdf.CF_CONVENTIONS_VERSION = 'CF-1.6'

... and then do the save as @pelson suggests.

Perhaps, there is no "correct" alternative ... I certainly can't immediately think of one other than not to adopt this PR and dodge the whole thorny issue.

@rhattersley
Copy link
Member

It's probably worth thinking about how this interface will evolve in the longer term to encompass 1.6, 1.7, ..., 2.0, etc. That might suggest what steps to take in the medium term.

But in the meantime, a couple of options come to mind (by no means intended to be exhaustive):

  1. Stick with this PR, but just don't bother with the warning.
  2. Return the user something that allows them to fiddle with things after the fact.

@morwenna-01
Copy link
Contributor Author

@bjlittle Thanks for looking at this. I can't see where you have made suggestions for me to look at.

@ajdawson
Copy link
Member

It would be nice to avoid unknowingly writing files that tell lies, if we can't guarantee we are writing CF-X then it should be an explicit decision by the user to write conventions: CF-X in an output file.

How do we end up with a conventions attribute on a cube? Do we just read conventions from file if it is there, do we add one automatically, or does it have to be put there by a user?

If merely having a conventions attribute on a cube is considered explicit enough, then we should accept this PR without the warning. If however this can cause unsuspecting users to write invalid metadata, then perhaps we should make it more explict, something like @pelson suggested or similar (but documented) could be more appropriate.

@bjlittle
Copy link
Member

bjlittle commented Mar 23, 2017

@morwenna-01 Apologies, that's because I've not done it yet ... keep getting distracted. Let me do that now (honest) 😉

Hang tight ...

@bjlittle
Copy link
Member

@morwenna-01 See morwenna-01#1

@bjlittle
Copy link
Member

bjlittle commented Mar 23, 2017

@ajdawson

How do we end up with a conventions attribute on a cube? Do we just read conventions from file if it is there, do we add one automatically, or does it have to be put there by a user?

Conventions typically come from a netCDF file as a global attribute, which we load and translate into the Cube.attributes dictionary. So, what the user sees after loading is a fair representation of that metadata in the netCDF file. At present, on saving to netCDF we always write out the default iris.fileformats.netcdf.CF_CONVENTIONS_VERSION as the Conventions global attribute in the target netCDF file - so we add a Conventions if it's not there in the cube, and we clobber it if it does exist in the cube ... either way the user has lost control.

@morwenna-01 PR addresses this, giving control back to the user. If they want the iris default, then they simply nuke the existing Conventions attribute in the cube and iris will save with the default. Otherwise, now an existing cube Conventions attribute will persist through to the netCDF file.

The upshot is that allowing the loaded "user" Conventions to persist down to netCDF is that it might misrepresent the actual compliance of the netCDF file. But at least they have a choice with this PR, whereas at present they don't. Does that help clarify?

@ajdawson
Copy link
Member

It does help, thanks. The part I'm thinking about is where a user loads from a file that contains a global conventions attribute without really knowing this, then they manipulate the cube, and then save back to netcdf. Those users may be overriding the conventions in their output file without realising, since iris is using a particular set of CF conventions that may differ from those in the input file. In this case they cannot be rescued by the default unless they know to go and remove the conventions attribute from their cube before saving. This effectively passes the explicit portion of the process onto users that do not want to override the conventions attribute, rather than those who do, which should ideally be the other way around.

To be clear, this is a minor issue to me, and I'm definitely in favour of passing control of the conventions attribute to the user as an option somehow.

@bjlittle
Copy link
Member

bjlittle commented Mar 23, 2017

@ajdawson I agree that having an option to control what happens is ideal, with a suitable default behaviour. Like you said, flip the logic to so that by default the Conventions are clobbered with the iris default Conventions, and the user explicitly opt to override this behaviour to use the Cube.attributes['Conventions'].

Seems appropriate to me.

It is a minor issue, but a pain point which I'm keen to resolve.

BTW your new gravatar is freakin' me out!

@marqh
Copy link
Member

marqh commented Mar 23, 2017

It's probably worth thinking about how this interface will evolve in the longer term to encompass 1.6, 1.7, ..., 2.0, etc. That might suggest what steps to take in the medium term.

But in the meantime, a couple of options come to mind (by no means intended to be exhaustive):

Stick with this PR, but just don't bother with the warning.
Return the user something that allows them to fiddle with things after the fact.

+1 for don't bother with the warning

in my view, this capability is asserting that the file written out is valid CFX.X
it is the user's responsibility to manage that assertion and its implications, not the software library's role to enforce this

@ajdawson
Copy link
Member

@bjlittle - I think I can sum up my feelings by first saying that overriding the default is an "exceptional" action and that in order to do this, one should have to do a little more work. For this reason I'd lean towards setting CF_CONVENTIONS_VERSION explicitly, and documenting it as a feature (or something along similar lines*). This type of approach the right effort to understanding-of-what-is-about-to-happen ratio for my taste, it isn't a lot of work to use this structure in a program, and it is clear as day what is happening.

BTW your new gravatar is freakin' me out!

My old avatar had me too young and fresh looking, this new one more accurately reflects my current state of well-being 😏

*although I can't think of anything useful right now

@marqh
Copy link
Member

marqh commented Mar 23, 2017

i am pondering

If I load a CF1.x file, then save it 'unchanged', what behaviour should I expect?

  • currently I always get CF1.5
  • in future, what do I expect to happen?
  • I get CF1.x in my result file
  • I get CF1.5 unless I explicitly set this myself, after loading the file?

I think having a default output version of CF output is not a bad pattern and that this override method seems sound. I am just fretting that loading conventions from a CF-netCDF file into the cube and always using it represents a behaviour change. I think I prefer it, but it is a change. Maybe it should be part of the next major version?

@bjlittle
Copy link
Member

bjlittle commented Mar 24, 2017

@SciTools/iris-devs Hmmm could I tempt anyone with the following approach to controlling options without polluting the public API with kwargs ...

with iris.options.netcdf(conventions_override=True):
    iris.save(cube, 'wibble.nc')

and also supporting the following pattern:

iris.options.netcdf(conventions_override=True)
iris.save(cube, 'wibble.nc')

So with @morwenna-01 concrete example in mind, netCDF saving is not changed i.e. iris will clobber the Conventions with its default value (as usual). However, the user can explicitly "opt out" of that behaviour using iris.options either as a context manager or globally setting that state. I'm not too bothered at the moment whether it's iris.options(netcdf_conventions_override=True) or iris.options.netcdf(conventions_override=True) ... I'm more interested in this capability of finer grained "opt in/out" control. I can see this approach being pretty useful very soon to control the behaviour of dask i.e. the default chunking strategy, number of workers/threads/processes, the choice of scheduler etc etc

The attraction to the iris.options.netcdf pattern is that options live in a relevant category or namespace synonymous with the functionality/behaviour that is being controlling. So that lends itself neatly to iris.options.pp, iris.options.regrid or whatever we deem relevant ... but as long as we're pretty rigorous about not bloating iris.options.

Thoughts ...

@bjlittle
Copy link
Member

bjlittle commented Mar 24, 2017

@morwenna-01 Apologies, but it's really not normal of a PR to attract such gratuitous commentary!

However, you've touched upon a really interesting and relevant point, and it's really great that there's lots of lively debate 😄

@ocefpaf
Copy link
Member

ocefpaf commented Mar 24, 2017

It is good that this is getting some attention. I closed #1769 because I thought there was no interest in implementing this.

My hack to keep the original convention from files I read is very similar to this PR, so I am 👍

PS: There are some elements of CF-1.6 in iris already, both Ocean s-coordinate, generic form 1 and
Ocean s-coordinate, generic form 2 for example, so getting the ability to save as CF-1.6 is useful to let automate tools know that the file is in fact CF-1.6 and not CF-1.5 and that the right dimensionless coordinate does exists.

@bjlittle
Copy link
Member

@morwenna-01 Okay ... so #2467 was merged on master after my initial comment to handle iris options in a more generic way.

As a result you will need to rebase you branch against the latest master branch of iris ... do you know how to do that? If not, I could give you a hand ... just let me know! Glad to help!

@morwenna-01
Copy link
Contributor Author

@bjlittle I've done a rebase, but it then failed one of the tests written by @dkillick
However my reading of #2467 was that it included my suggestion (and a lot more!) and therefore my pull request would be refused.

@morwenna-01
Copy link
Contributor Author

@bjlittle Can you help me? What should I do now? I can work to make to PR pass the tests that it failed, but I'm not sure that this PR is still relevant. thanks.

@marqh marqh added this to the v1.13.0 milestone May 11, 2017
@corinnebosley
Copy link
Member

Hello @morwenna-01, thanks for sticking with this PR, it is very much still relevant and I would like to get this to a point at which we can merge it by the middle of next week. There are a few things that need attention before we get to that point though:

  1. I notice that there has been a label added to this ticket (Pending CLA) indicating that you need to sign a Contribution License Agreement. You can find this in the form of a PDF in the SciTools documentation here: http://scitools.org.uk/governance.html#contributors

Once you have found this, you can sign it (either electronically or with a printer/scanner and an actual pen) and send it back to us at this email address: cla@metoffice.gov.uk

  1. There is one failing test still outstanding which will need a bit of thought to work out why it is failing and what this means for the PR (does the PR need tweaking, or does this failure mode indicate a genuine problem with the proposed change?).

@marqh
Copy link
Member

marqh commented May 16, 2017

in reviewing the what's new, I have just revisited
#2467

please may a reviewer compare that PR to this to ensure we are delivering functionality consistently

e.g.
https://github.com/SciTools/iris/pull/2467/files#diff-36cf2602a7c220f674c997618f233af7R173

i'm read this twice and I am not sure (sorry :/

@ajdawson
Copy link
Member

I can't see anything functionality this PR adds that is not already in #2467. The two are differing approaches to the same problem. @morwenna-01 - can you verify that #2467 meets the needs of your use case?

@morwenna-01
Copy link
Contributor Author

@ajdawson @marqh - As I commented 29 days ago, I'm not sure I totally understand #2467, but I think that it covers my PR, plus probably a lot more. However, then @corinnebosley suggested that I stick with this PR (4 days ago). That's why I started playing with it again. As you can see it fails some of the (new) tests written by @dkillick. It would be great if someone who understands the code better than me (that would be all of you!) could make a final decision as to whether I should try and work out how to make this PR pass the tests or whether this PR should be dropped.
thanks (please note that this is the first time I've ever done this, I'm learning!)

@ajdawson
Copy link
Member

Thanks for your efforts @morwenna-01 and apologies for the confusion. To be clear, this PR is no longer required, it's functionality is implemented in #2467. I wanted to offer you the opportunity to run the code in #2467 to verify it meets your needs as a user before closing this. If you don't want to do that, then no problem, this PR can be closed immediately. Thanks again.

@lbdreyer lbdreyer removed this from the v1.13.0 milestone May 17, 2017
@pelson
Copy link
Member

pelson commented Jun 10, 2017

I think we can close this now. Thanks for your contribution @morwenna-01 - and congratulations again on getting this far! 🎉
Looking forward to reviewing more issues / merge requests as they come along 😄

@pelson pelson closed this Jun 10, 2017
@morwenna-01 morwenna-01 deleted the CF1.6 branch June 13, 2017 01:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants