Clarification of time coordinate requirements #166
Comments
Hi Martin, There is no reference to David |
Hi David, I think we first need to decide whether a coordinate variable with standard name There is a related problem, in that it is also impossible to use For instance, the following CDL generates an invalid file,
The error reported is that This is, however, a side issue to the main topic I wanted to discuss here, which is the ambiguity regarding |
Hi Martin, Perhaps as I am not a consumer of such forecast data, I can't initially see the benefit in making a coordinate variable that contains time deltas a CF "time" coordinate variable, and the extra complication that that introduces. It would be great to see some use cases where this labelling provides benefit. I think (?) that there are various climate indices that have units of time deltas (e.g. relating to growing season length ...) that we would not want to automatically call "time" coordinates. What about a coordinate variable containing latitude deltas - would we also want to make that an "X" variable - perhaps not? With regard to cfunits, it does indeed think that X is equivalent to X-1, for any units X (e.g. metres, or seconds, or ...) and you are right in that this is purely inherited behaviour from the UDUNITS C library. However, I also think that this behaviour is wrong in a data analysis context (e.g. it allows cf-python to subtract data in units of "s" from data in units of "s-1") and I shall treat it as a bug in cfunits. Many thanks, |
Hi David, The specific use case that motivated me was the metadata standard used by the WCRP Climate system Historical Forecast Project (CHFP). They use Re cfunits: thanks. This will affect CMIP6 data, so it would be good to have this fixed. Note that the bug also affects height, e.g. In the case of time there is an additional ambiguity which can be illustrated by this question: is "day" a unit of time? Most people would say yes, I think, because the word "time" is usually associated with time intervals. In your comment above you are drawing a distinction between "time deltas" (for elapsed time) and "time", which you interpret as referring to a specific instance in time. Udunits is, I think, reflecting common usage when it considers both I agree that we need consistency with spatial coordinates. Again, I think a latitude delta used as a coordinate would qualify as a spatial coordinate, and if we want to restrict the use of, for example, regards, |
Satellite swath data presents another use case for a relative time coordinate. There is often an absolute time (or time relative to a distant epoch) associated with a given scan line, often referenced to the center pixel of the scan. (The scan line is approximately transverse to the direction of motion of the satellite.) Each pixel of the scan line can be given a time relative to that center time, and many satellites provide the data that way. While it is possible to convert all of those relative times to absolute times and store them in that fashion, it will likely take up quite a bit more space, and there is a good chance that users of the data will convert back to relative times as their first step after reading the data. In this particular use case, the relative times correlate to a transverse axis to the direction of travel axis. You can think of them as the Y to the center scan time X. All that to say, relative time and space coordinates can be quite useful. I can imagine that the concept of relative coordinates might be generally useful beyond the spatio-temporal domain. I think we should make room for them. It would also be necessary to connect those relative coordinates to absolute coordinates, so that you could indicate that the forecast times or sample times were relative to absolute time coordinate values and do similarly for other coordinates. As a quick digression, I'm using the word "absolute" here to refer to a coordinate variable that is relative to a fixed reference or epoch. So frequency, time since an epoch, X distance from a coordinate system zero point, temperature in Celsius, and depth below mean sea level would all be considered absolute. In order to connect a relative coordinate to an absolute coordinate, we could indicate the relationship by naming the reference variable in an attribute on the relative variable. The exact relationship between the coordinate variables and a data variable would be seen through the dimensions. 
The variables in the examples below are The union of the named dimensions of an absolute coordinate variable and a relative coordinate variable must be a subset of (or the same as) the named dimensions of the data variable they relate to. General examples:
Simple examples:
If we like, we could preclude the use of scalar coordinate variables. Any thoughts on such an addition? |
Thanks @JimBiardCics , that is very useful. I have come across this before in a discussion about swath data .. people struggling to find a CF encoding for data which they usually analyse in terms of a scan start time and a scan offset time. As you say, collapsing it to a single dimension is possible, but there are good reasons why that is not common practice in the community. To support it, I think we would need a new standard name for the relative time: do you agree? @davidhassell remarked about dealing with a coordinate variable containing latitude offsets (deltas). I replied that we would probably want to have consistency in the way we treated time, space and other coordinates: I have since realised that we have some quite complex baggage in this area which introduces some differences in approach that we may have to live with. In particular, for Although there are differences, the underlying concept is similar to that described by @JimBiardCics : if a time offset is used as a coordinate it is common practice to provide the information needed to construct the absolute time in another variable, which may have different dimensions to those of the time offset variable. In the weather forecast community it is the norm to distinguish between a "validation time", which is intended to represent the time at which the corresponding data is (most) valid, and the "forecast reference time", which is also an absolute time, but corresponds rather to the start of the forecast. Because of long usage, validation time in CF is represented by For most variables, the question as to whether it is absolute or relative has no relevance to their use as coordinate variables: the only requirements, I believe, are monotonicity and absence of missing data. Is there a good reason for making spatial and temporal coordinates different? We clearly need some guidance on how to present absolute time (i.e. chronological time) accurately, but do we need rules restricting the use of time offsets? |
@JimBiardCics In your scan line example, how would that be encoded? Is the time of the centre pixel stored as a size-1 coordinate variable and the offsets stored in a size-N coordinate variable? Are the two connected by just convention? Whilst we can clearly have a I'm wondering if dimension coordinate constructs can, or should, be characterised as belonging to a coordinate reference system, and therefore have to have a datum for their values, either explicitly or implicitly defined. In the case of It seems to me that |
@martinjuckes I think a unified approach (to the degree possible) is the best plan. Even though I used them in my previous comment the terms relative and absolute may not be the best to use, since so many measurements are values relative to some reference point or other. I think the differentiator here is whether or not the reference is (to first order) static or in some fashion unchanging. As it stands, CF doesn't deal with static reference points in a consistent fashion. The reference point is implied by the units in some cases (Celsius and Fahrenheit, for example), stated within the units string when it is for time, stated in a We can continue to deal with this on a case-by-case basis through standard name definitions, or we could handle some of the cases by recognizing a class of "subordinate coordinates" or "offset coordinates" (or whatever name you prefer) and come up with rules for them. This would provide a mechanism for cases such as forecast offsets and scan times where there is a changing reference point. These are both time-based cases, but similar situations arise with cases such as changing orientation angles of an instrument relative to a platform which itself has changing orientation angles. This particular relationship is generally thought of as implied, but we could provide a way to make such relationships clear. |
Can this be covered by
This would still require some case-by-case specifications, since the formula associated with any @davidhassell : I can understand that you might want to know the reference point for time specified through a coordinate variable that is an elapsed time, but the question is whether we want to have a specific rule for time in this regard, and if so, what is it? As @JimBiardCics has pointed out, we have a huge variety of terms in CF and in general it is going to be up to the user whether they provide a reference value or not. I feel it would be enough to recommend that |
@davidhassell In the scan line example, the reference time for each scan (might be the center time) is stored in a regular time variable as an "absolute" time. Let's call this variable scan_time. There are two use cases, one where the sample times for each scan are variable, and one where they are considered fixed. In the fixed case the sample_time variable is one-dimensional, with the same dimension as the transverse coordinate dimension (whatever that is) of the data variable. This corresponds to simple example 3 in my earlier comment. The time values in the sample_time variable are relative to each value of the scan_time variable in turn. In the variable case the sample time variable is two-dimensional. The first dimension is the same as the scan_time dimension and the second dimension is the same as the transverse coordinate dimension of the data variable. This is simple example 2. In both cases the sample_time coordinate variable is a valid coordinate for the data variable, but there is a relationship between the scan_time coordinate and the sample_time coordinate that ought to be captured. Furthermore, the scan_time coordinate cannot, according to CF Section 4.4, be considered a time variable. |
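The relationship between scan_time and sample_time described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the epoch, the numeric values and the helper name absolute_times are all hypothetical, and plain lists stand in for netCDF variables. It shows how a 1-D (fixed) or 2-D (variable) sample_time combines with scan_time to give one absolute time per pixel.

```python
from datetime import datetime, timedelta

# Hypothetical values, for illustration only.
EPOCH = datetime(2000, 1, 1)           # assumed reference time for scan_time
scan_time = [0.0, 1.5, 3.0]            # seconds since EPOCH, one value per scan line
sample_time_fixed = [-0.2, 0.0, 0.2]   # seconds relative to each scan_time (1-D case)
sample_time_variable = [               # one row of offsets per scan line (2-D case)
    [-0.20, 0.0, 0.20],
    [-0.21, 0.0, 0.19],
    [-0.20, 0.0, 0.21],
]

def absolute_times(scan_time, sample_time):
    """Return a 2-D list of datetimes: one row per scan, one column per pixel."""
    rows = []
    for i, t_scan in enumerate(scan_time):
        # 1-D offsets are shared by every scan; 2-D offsets have one row per scan.
        offsets = sample_time[i] if isinstance(sample_time[0], list) else sample_time
        rows.append([EPOCH + timedelta(seconds=t_scan + dt) for dt in offsets])
    return rows

fixed = absolute_times(scan_time, sample_time_fixed)
variable = absolute_times(scan_time, sample_time_variable)
```

In both cases the result has the dimensions (scan, pixel) of the data variable, which is the link between the two coordinates that the comment says ought to be captured in the metadata.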
@martinjuckes I tried that approach in another instance in the past (perhaps about this very thing?) and was told that formula_terms are only for use with parametric vertical coordinates. |
It is certainly true that |
@martinjuckes Thanks for pointing me to this issue. As you know my main experience is with climate indices (aka derived statistics), and less so with forecast data. As @davidhassell already mentioned in this thread there are climate indices, like the growing season length, which is a duration or "time delta" that does not fit (I think) the current context as it is unique to each location and year. However, it is simply the difference between the end of growing season and beginning of growing season, which are both durations relative to some reference time. This reference time could either be common for all years (e.g. it could be the same as for the |
There is also a potential use-case in CMIP climate simulations: the pre-industrial control simulations have an arbitrary model time which, in the CMIP5 archive, varies between 0000 and 2500 (which causes some confusion to users). In this case there is a decoupling between the model time, which progresses steadily, and the actual date which the simulation relates to, which is determined by the specified forcing (annually repeating, representative of a fixed year). |
There are 45 standard name terms which have units of time (e.g. When used with a data variable, the cf-checker accepts all these variables with units of When used as a coordinate, the cf-checker only accepts these variables with absolute date units, which is generally nonsense. This effectively excludes, for entirely arbitrary and irrational reasons, 43 parameters from use as coordinates. |
To be absolutely clear, are you suggesting that sea_surface_wind_wave_mean_period might be used as a coordinate (e.g., for a histogram showing the distribution of periods)? If so, then I agree that units of "day", "second", "hour" etc. should all be acceptable. I don't think CF forbids this, so this is a problem with cf-checker. For the CMIP use case, are you suggesting that a scalar time dimension might be defined indicating an appropriate (approx.) date that applies to the forcing imposed in these experiments? That seems to me to be more a description of the experiment design as opposed to a property of a variable being written, but I suppose it is something to consider. |
@taylor13 Part of the issue may have to do with the interaction of standard_name and units. CF does state that a coordinate with a |
For reference, the current situation is that the sole necessary condition for a coordinate variable being a "time coordinate variable" is the presence of Is the aim of this issue just to stop the checker complaining about a (perfectly valid) coordinate variable with a standard_name of .... or is there a need to have Thanks, |
@davidhassell , unfortunately your first statement is currently untrue ... but if we can amend things so that it becomes true, that would solve the problem (option 2 in my initial post). The objective is indeed to have a valid coordinate variable with a standard name of A NetCDF file generated from the following CDL is not currently passed as valid:
The error message, The convention is not entirely explicit about what constitutes a "CF Time Axis" (though I can now see where Jim gets his interpretation from). It states, for instance, that:
Udunits does not require If, as @JimBiardCics says, it is intended that a "CF Time Axis" is identified by a unit string of the form I find it unsatisfactory that a variable with standard name |
Sorry coming to this a bit late..... I made a fix in February (cedadev/cf-checker#49) to correct identification of time coordinate variables so that it only identifies a variable as a time coordinate if one or more of the following are true:
Cheers, |
Hi @RosalynHatcher : thanks .. I was using an older version, and can't at the moment get version 3.1.1 to work .. but I'll follow that up on the cf-checker list. Is it still the case that a data variable with standard name @taylor13 , @JimBiardCics : do you agree with the interpretation of what constitutes a time axis given by Ros above? If so, can we update section 4.4 of the convention to say this? People outside CF often use a measure of elapsed time as a time axis: if we are excluding this, we need to be transparent about it. I can't see how this approach is going to lead to anything other than confusion. |
Hello All, I'd like to add a rider to my comment above, concerning the units we use for time and the standard name canonical units. The CF Convention says that Can we treat the suggestion that |
Some related discussion on |
From my reading, it appears that the discussion in this ticket arrived at conclusion (2) of @martinjuckes's original proposal: The convention should make it clear that a coordinate variable with To remedy this, I propose the following changes:
I propose that we append a second paragraph to this definition:
to
I note that discussion 304 mentions various other clarifications that are needed in Sect 4.4, and I hope that the above won't be inconsistent with the issue arising from that discussion. The above changes remedy a defect, rather than changing the intention of the convention, I believe. Nonetheless they're quite substantial, so it's safer to leave this labelled as What do you think, @martinjuckes, @davidhassell and other interested parties? (Jim Biard is unfortunately no longer among the CF community.) Best wishes Jonathan |
Overall I think these suggestions are good. However a canonical unit
|
Hello All, I think that providing full canonical units of I'm a bit uncomfortable with creating an Perhaps we shouldn't pre-empt a use case, but instead when one arises create a new name for each Thanks, |
Well, according to the discussion in https://github.com/orgs/cf-convention/discussions/304 Those are NOT elapsed times, but rather an encoding of timestamps. I'm not sure there is total consensus on that -- but it's close, it's either an encoding of timestamps, or an encoding of particular points on the time continuum (usually both), but not an elapsed time in any case. That is, the "since" part is critical. e.g. units of "seconds" is an elapsed time (timedelta), and units of "seconds since a_timestamp" is a timestamp (datetime) Not that this won't be very confusing to CF users .... |
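The distinction drawn above maps directly onto Python's timedelta and datetime types. The sketch below is a simplified illustration, not a UDUNITS or CF implementation: the function name decode is hypothetical, only three unit names are handled, the reference timestamp is assumed to be in ISO 8601 form, and calendars other than the one built into Python's datetime are ignored.

```python
from datetime import datetime, timedelta

def decode(value, units):
    """Decode a number using a simplified CF-style units string.

    Only 'seconds', 'hours' and 'days' are handled, and the reference
    timestamp must be ISO 8601; both are assumptions of this sketch.
    """
    seconds_per_unit = {"seconds": 1, "hours": 3600, "days": 86400}
    unit, sep, reference = units.partition(" since ")
    delta = timedelta(seconds=value * seconds_per_unit[unit.strip()])
    if not sep:
        # No "since" part: the value is an elapsed time (timedelta).
        return delta
    # With "since": the value encodes a timestamp (datetime).
    return datetime.fromisoformat(reference.strip()) + delta

elapsed = decode(36, "hours")
stamp = decode(36, "hours since 2020-01-01 00:00:00")
```

The same number, 36, decodes to a duration in the first call and to a point in time in the second, which is why the "since" part is described as critical.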
Yes, I fully agree with @ChrisBarker-NOAA. @davidhassell : |
Hi Chris and Lars,
As I've found out! I do follow and subscribe to the "timestamp" ideas of https://github.com/orgs/cf-convention/discussions/304, but have caught myself out of context and need to think about this some more! Thanks for your patience, David |
In April I proposed some changes to the convention to reflect the conclusions of previous discussion in this issue. Some further discussion followed by @larsbarring, @davidhassell and @ChrisBarker-NOAA. Here is the proposal again, revised on the basis of those comments.
to
Is this acceptable? If there are no further concerns, we can accept this three weeks from now, on 14th September. Best wishes Jonathan |
Dear Jonathan, Thank you for this suggestion, I think the changes look good. Still, I would like to give it a more careful consideration in the coming week or so, but I do not expect to have anything material to add. From my side the clock can start. Kind regards, |
I like it! Thanks! |
As anticipated I have no further comments(*) and this is a valuable clarification. Thanks Jonathan! (*) A really minor suggestion, if you think this might be an improvement:
|
I have created PR #538 to implement these changes. Thanks for your suggestion, Lars, which I have followed in a more explicit form: "The choice of reference time and date (midnight on 1st January 1958) is arbitrary and not restrictive." I hope that's OK. When I came to modify sect 4.4, it seemed to me that a different order for the sentences would be preferable from what was agreed above. In the PR, I have made it:
If there are no concerns expressed, this PR can be merged on 14th September. |
LGTM |
Thanks for your comment on the example, @ChrisBarker-NOAA. You're right that Mountain Daylight Time might not be the only name for that time zone! I suppose that Canada, Mexico, Antarctica and some Pacific islands are in the same longitude range. That text is a quotation from the man page of the |
Can we just change "i.e." to "e.g." now -- and do more later, maybe ... |
@JonathanGregory, @ChrisBarker-NOAA : I agree that there is merit in keeping with the original wording directly taken from the UDUNITS documentation. But it is not a direct citation in a very formalistic sense, and the documentation was probably written without specific consideration of an international audience, which CF should take into account. Moreover, again being formalistic, I think that the particular sentence is not included in the proposed changes. I think we have three alternatives:
I do not have a preference here, but I would really like to avoid turning this into something that delays the acceptance of the really valuable clarification otherwise made by this proposal. |
I agree -- do not delay over this! Make the very small change, or don't -- either way, time to merge. I just saw a bit of copy editing that could be done -- I did not mean to delay anything! Honestly, a lot of folks aren't quite clear on the difference between "i.e." and "e.g." -- I wasn't that clear until fairly recently. So I think this is similar to a typo. But again -- don't delay over this! |
I also agree with the new text in the PR. Thanks, Jonathan et al. |
Thanks for your comments, all. In PR #538, I have replaced the offending sentence with "The time unit specification I assume we're all still happy with merging this on 14th, in the absence of any further concerns. |
LGTM -- thanks! |
Clarification of time coordinate requirements
Moderator: (not yet)
Requirement Summary: Clarification on what can be done with forecast_period as a time coordinate.
Technical Proposal Summary: Either (1) the convention should accept forecast_period as a valid standard_name for a time coordinate, which requires modification of some statements about the units of time coordinates or (2) the convention should make it clear that a coordinate variable with standard_name set to forecast_period is not a time coordinate in the sense implied by section 4.4.
Benefits: Users encoding forecast data who wish to use a forecast_period variable as a coordinate variable.
Status Quo: If a coordinate variable has standard_name set to forecast_period, the CF checker interprets this as a time coordinate. The CF checker also insists that time coordinate variables should have units of the form <units> since <reference time>. This form of units is not valid for forecast_period, which is specifying an elapsed time and hence has no reference time. The consequence is that it is currently impossible to construct a netCDF file with a forecast_period variable used as a coordinate variable.
Detailed Proposal: My preference is option (1), which would accept that elapsed time can be a valid time coordinate. This would require modification of section 4.4 to explain different options depending on whether the time coordinate is (a) representing a date or (b) representing an elapsed time.
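The conflict described under Status Quo can be reproduced with a toy model. The sketch below is not the cf-checker code: the function name status_quo_check, the regular expression and the error text are all hypothetical, and the pattern is far simpler than real UDUNITS parsing. It only mimics the two behaviours reported in this issue, namely that a forecast_period coordinate is treated as a time coordinate and that time coordinates must have units with a reference time.

```python
import re

# Toy pattern for "<units> since <reference time>"; real UDUNITS parsing is richer.
REFERENCE_TIME_UNITS = re.compile(r"^\s*\w+\s+since\s+\S.*$")

def status_quo_check(standard_name, units):
    """Mimic the behaviour described in this issue (not the real cf-checker)."""
    # Assumption: both names are treated as marking a time coordinate.
    treated_as_time = standard_name in {"time", "forecast_period"}
    if treated_as_time and not REFERENCE_TIME_UNITS.match(units):
        return "ERROR: time coordinate units must be '<units> since <reference time>'"
    return "OK"

print(status_quo_check("time", "days since 1970-01-01"))
print(status_quo_check("forecast_period", "hours"))
```

Under these two rules a forecast_period coordinate with elapsed-time units such as hours is always rejected, while giving it a reference time would contradict its meaning; options (1) and (2) each remove one of the two rules.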