-
Notifications
You must be signed in to change notification settings - Fork 168
Minimal v4 particleset #2060
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Minimal v4 particleset #2060
Conversation
Signed-off-by: Joe Schoonover <joe@fluidnumerics.com>
VeckoTheGecko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a quick first pass - can do a more in depth review tomorrow
parcels/particleset.py
Outdated
| if isinstance(endtime, datetime): | ||
| raise NotImplementedError( | ||
| "If fieldset.time_interval is None, endtime must be a timedelta not a datetime" | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this block needs to be removed (overloading of runtime param - which we discussed wouldn't work for analytical model output).
For context, a line of thought that was being explored was that endtime could either be the datetime type in the model output or a timedelta (to signify a runtime).
This wouldn't work, however, for analytical model output which would have model output in timedelta. Hence using endtime would be ambigious. Hence we opted for going with runtime still to delineate these two modes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the field_timeinterval is None and the endtime is a datetime (line 768 immediately above this block in which this check lies), I don't see how we could envision calculating the duration of the simulation as there is no starting datetime from the fieldset to reference.
In any case, I'm reading that the suggested change here is to ensure that endtime has the same type as the fieldset. I'll go ahead an put in the bit for the runtime being a timedelta, otherwise, can't support the existing tests that are passing (stommel gyre has no time interval)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any case, I'm reading that the suggested change here is to ensure that
endtimehas the same type as the fieldset
This ends up being quite straightforward - all we need to do here is:
runtime = endtime - time_interval.leftThat way:
- if
time_intervalis None, this would error out as expected - as long as endtime and the
time_intervalare compatible types (arithmetic is defined on it) then it will produce a timedelta object. This futureproofs us as well, and means that we just trust the packages we rely on.
No further checks need to be done - except some attention to explainable error messages. The following should be sufficient.
# I'll need to add the TimeLike alias
def get_runtime(endtime: TimeLike, time_interval: None | TimeInterval) -> timedelta:
if time_interval is None:
raise ValueError(f"FieldSet does not have a time interval. Can't calculate runtime from provided {endtime=!r}")
return endtime - time_interval.left
# in pset.execute...
runtime = get_runtime(endtime, fieldset.time_interval)Let me know if you want to wrap this into this PR - or if it should be done in a different one. I have no preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if it's easier to use runtime or endtime as the single source of truth in the simulation. Whatever is chosen, endtime = get_endtime(runtime, fieldset.time_interval) would be pretty similar logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now, what I've done is to allow for either runtime or endtime to be sent in. The runtime is assumed to be a timedelta. endtime data type is either timedelta or datetime but must match the time datatype from the fieldset (as deduced from the time_interval.left data type. There's a few cases that are handled.
- If
runtimeandendtimeare both provided an error is thrown - If
runtimeis provided andendtimeis not provided, thestart_time(local variable used for controlling the forward stepping loop) is set totimedelta(seconds=0)and theend_time(local variable used for determining the lasttimevalue in the forward stepping loop) is set toruntime - If
runtimeis not provided andendtimeis provided, theendtimeis checked to match the type of thetime_interval.left. Thestart_timeis set totime_interval.leftand theend_timeis set to the minimum ofendtimeortime_interval.right - If
runtimeis not provided,endtimeis provided, and time_interval isNonethen an error is thrown.
The way I've implemented it surely is not the cleanest and tidiest, but it works. I'd say further cleanup should come in a following PR.
| time = np.array([np.datetime64(t) for t in time]) | ||
| if time.size > 0 and isinstance(time[0], np.timedelta64) and not self.time_origin: | ||
| raise NotImplementedError("If fieldset.time_origin is not a date, time of a particle must be a double") | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message in the line above is not true anymore, right? Particle time cannot be a double?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. I'll get this on my next round of edits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved in a55c852
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, now it doesn't support numpy.datetime64 or numpy.timedelta64 anymore. I fixed it in 68293ab
VeckoTheGecko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I have looked through in full now.
|
Here's what I've gathered from the comments above
Some open questions
|
This test is marked with xfail since the changes required to support particlefile will require a significant overhaul of the particlefile module which is out of scope for this PR
|
I've added a test for using the outputfile, but this is currently failing with There are some larger changes required here that I feel are getting out of scope for this PR. |
|
I believe I've addressed a good deal of the issues that were raised. To help keep this PR moving, if it looks like I've resolved your issue, please resolve the comment. Otherwise, I will assume it is not resolved and work on it in my next round of edits. |
|
@VeckoTheGecko - you may find this interesting... when MITgcm runs without a calendar, the time index refers to the iteration count in the model, which is stored as an Note that the file this comes from is produced using MITgcm's built in NetCDF IO package; this is different than what is described in XGCM documention (e.g. here ), where the NetCDF file referenced is created by converting MITgcm's native MDS format to NetCDF using |
|
This PR is good to be merged, as far as I'm concerned |
I think we should just not support int time dimensions. If a user wants to use MITgcm runs without a calendar, they have to update the time dimension to timedelta or datetime/cftime on the xarray dataset level before passing to Parcels. We can have an explainable error message denoting the problem. |
Agreed - let's xfail it and it can be handled later. Good to introduce the test here though. |
|
To summarise:
After that I think we're good to merge. |
|
I have a couple issues in the v4 tests that are failing and I'll take care of resolving these in the morning. |
erikvansebille
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While testing my own run on this branch, I found some small bugs that I've either fixed or(see comments) or propose simple changes to below
| time = np.array([np.datetime64(t) for t in time]) | ||
| if time.size > 0 and isinstance(time[0], np.timedelta64) and not self.time_origin: | ||
| raise NotImplementedError("If fieldset.time_origin is not a date, time of a particle must be a double") | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, now it doesn't support numpy.datetime64 or numpy.timedelta64 anymore. I fixed it in 68293ab
Co-authored-by: Erik van Sebille <e.vansebille@uu.nl>
…els into minimal-v4-particleset
|
Well, this is an odd thing to error out on : https://github.com/OceanParcels/Parcels/actions/runs/16056110569/job/45310801757?pr=2060#step:4:779 |
|
|
The |
|
Also, removing the |
|
@fluidnumerics-joe the problem is here: This looks to have been introduced in 7e7391d to help with dask arrays (with fe26424 being introduced so that fieldset could be None to help with debugging and to fix #867 ). I think this block of code can be safely removed. The kernel side of the code is something that we haven't touched yet.... perhaps its time we start looking at it somewhat EDIT: Correction |
|
Nice find. What's weird is this test was working just fine in 2de98a5 |
|
@VeckoTheGecko - good find. That resolved the issues on the v4 tests. Updating this branch and will re-request review |
erikvansebille
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to be merged so we can move forward!
mainfor v3 changes,v4-devfor v4 changes)This pull request provides a minimal implementation of the
ParticleSetfor execution as discussed in #2034 .Highlight of changes
repeatdtlogic is removed_pclasstype is removed for now; this currently means thatparticle.ei(encoded indices) are not stored and available for reuse. There's some discussion to be had around how this is incorporated cleanly and this will ultimately be brought in through a future PR.pset.executewith the unstructured stommel gyre generic dataset.