Minimal v4 particleset #2060

fluidnumerics-joe · 2025-06-30T15:31:47Z

Chose the correct base branch (main for v3 changes, v4-dev for v4 changes)
Added tests

This pull request provides a minimal implementation of the ParticleSet for execution, as discussed in #2034.

Highlight of changes

repeatdt logic is removed
Dynamic instantiation of the _pclass type is removed for now; this currently means that particle.ei (encoded indices) are not stored and available for reuse. There's some discussion to be had around how this is incorporated cleanly and this will ultimately be brought in through a future PR.
Particle-particle interaction is removed in this version. My understanding is that interacting particles will be brought in once the issues around non-interacting particles settle down.
Time handling is done through datetime and timedelta objects
A new test is added for pset.execute with the unstructured stommel gyre generic dataset.
Notebook tutorial for using Parcels with unstructured Stommel gyre is updated

Signed-off-by: Joe Schoonover <joe@fluidnumerics.com>

VeckoTheGecko

Did a quick first pass - can do a more in depth review tomorrow

parcels/particleset.py

VeckoTheGecko · 2025-06-30T15:55:21Z

parcels/particleset.py

+            if isinstance(endtime, datetime):
+                raise NotImplementedError(
+                    "If fieldset.time_interval is None, endtime must be a timedelta not a datetime"
+                )


I think this block needs to be removed (overloading of runtime param - which we discussed wouldn't work for analytical model output).

For context, a line of thought that was being explored was that endtime could either be the datetime type in the model output or a timedelta (to signify a runtime).

This wouldn't work, however, for analytical model output, which would have model output in timedelta, so using endtime would be ambiguous. Hence we opted to keep runtime to delineate these two modes.

If the field_timeinterval is None and the endtime is a datetime (line 768 immediately above this block in which this check lies), I don't see how we could calculate the duration of the simulation, as there is no starting datetime from the fieldset to reference.

In any case, I'm reading that the suggested change here is to ensure that endtime has the same type as the fieldset. I'll go ahead and put in the bit for the runtime being a timedelta; otherwise, we can't support the existing tests that are passing (stommel gyre has no time interval).

In any case, I'm reading that the suggested change here is to ensure that endtime has the same type as the fieldset

This ends up being quite straightforward - all we need to do here is:

runtime = endtime - time_interval.left

That way:

if time_interval is None, this would error out as expected

as long as endtime and the time_interval are compatible types (arithmetic is defined on it) then it will produce a timedelta object. This futureproofs us as well, and means that we just trust the packages we rely on.

No further checks need to be done - except some attention to explainable error messages. The following should be sufficient.

# I'll need to add the TimeLike alias
def get_runtime(endtime: TimeLike, time_interval: None | TimeInterval) -> timedelta:
    if time_interval is None:
        raise ValueError(f"FieldSet does not have a time interval. Can't calculate runtime from provided {endtime=!r}")
    return endtime - time_interval.left

# in pset.execute...
runtime = get_runtime(endtime, fieldset.time_interval)

Let me know if you want to wrap this into this PR - or if it should be done in a different one. I have no preference.

I'm not sure if it's easier to use runtime or endtime as the single source of truth in the simulation. Whatever is chosen, endtime = get_endtime(runtime, fieldset.time_interval) would be pretty similar logic
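A minimal sketch of what the inverse helper mentioned above could look like. The `TimeInterval` class here is a hypothetical stand-in that only exposes `left` and `right`, and the `TimeLike` alias is an assumption; neither is the actual Parcels definition.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Union

# Assumed alias; the thread only notes that a TimeLike alias still needs adding.
TimeLike = Union[datetime, timedelta]


@dataclass
class TimeInterval:
    """Hypothetical stand-in for the fieldset's time interval."""

    left: TimeLike
    right: TimeLike


def get_endtime(runtime: timedelta, time_interval: Optional[TimeInterval]) -> TimeLike:
    """Mirror of get_runtime: derive the end time from a run duration."""
    if time_interval is None:
        raise ValueError(
            f"FieldSet does not have a time interval. Can't calculate endtime from provided {runtime=!r}"
        )
    # datetime + timedelta -> datetime; timedelta + timedelta -> timedelta
    return time_interval.left + runtime


calendar = TimeInterval(datetime(2025, 1, 1), datetime(2025, 1, 31))
analytical = TimeInterval(timedelta(0), timedelta(days=30))
print(get_endtime(timedelta(days=2), calendar))    # 2025-01-03 00:00:00
print(get_endtime(timedelta(days=2), analytical))  # 2 days, 0:00:00
```

As with get_runtime, the arithmetic is left to the time types themselves, so the result has whatever type the fieldset's time axis uses.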

Right now, what I've done is to allow either runtime or endtime to be passed in. The runtime is assumed to be a timedelta. The endtime data type is either timedelta or datetime, but it must match the time data type of the fieldset (as deduced from the type of time_interval.left). A few cases are handled:

If runtime and endtime are both provided an error is thrown

If runtime is provided and endtime is not provided, the start_time (local variable used for controlling the forward stepping loop) is set to timedelta(seconds=0) and the end_time (local variable used for determining the last time value in the forward stepping loop) is set to runtime

If runtime is not provided and endtime is provided, the endtime is checked to match the type of the time_interval.left. The start_time is set to time_interval.left and the end_time is set to the minimum of endtime or time_interval.right

If runtime is not provided, endtime is provided, and time_interval is None then an error is thrown.

The way I've implemented it surely is not the cleanest and tidiest, but it works. I'd say further cleanup should come in a following PR.
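The cases listed above can be sketched as a single helper. This is only an illustration under stated assumptions, not the code in this PR: `resolve_time_bounds`, `TimeInterval` and `TimeLike` are hypothetical names, and the "neither runtime nor endtime" branch is an assumption, since the list above does not cover it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple, Union

TimeLike = Union[datetime, timedelta]  # assumed alias, not the Parcels definition


@dataclass
class TimeInterval:
    """Hypothetical stand-in exposing only the `left` and `right` endpoints."""

    left: TimeLike
    right: TimeLike


def resolve_time_bounds(
    runtime: Optional[timedelta],
    endtime: Optional[TimeLike],
    time_interval: Optional[TimeInterval],
) -> Tuple[TimeLike, TimeLike]:
    """Return (start_time, end_time) for the forward stepping loop."""
    if runtime is not None and endtime is not None:
        raise ValueError("Provide either runtime or endtime, not both")

    if runtime is not None:
        # runtime only: step from zero up to the requested duration
        return timedelta(seconds=0), runtime

    if endtime is None:
        # Assumption: this case is not listed explicitly in the thread
        raise ValueError("Either runtime or endtime must be provided")

    if time_interval is None:
        raise ValueError(f"FieldSet has no time interval; cannot use {endtime=!r}")

    if type(endtime) is not type(time_interval.left):
        raise TypeError(
            f"endtime must be of type {type(time_interval.left).__name__}, "
            f"got {type(endtime).__name__}"
        )

    # endtime only: start at the left edge, never run past the right edge
    return time_interval.left, min(endtime, time_interval.right)


interval = TimeInterval(datetime(2025, 1, 1), datetime(2025, 1, 10))
start, end = resolve_time_bounds(None, datetime(2025, 2, 1), interval)
print(start, end)  # 2025-01-01 00:00:00 2025-01-10 00:00:00
```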

tests/v4/test_particleset.py

parcels/particleset.py

erikvansebille · 2025-06-30T16:03:04Z

parcels/particleset.py

            time = np.array([np.datetime64(t) for t in time])
        if time.size > 0 and isinstance(time[0], np.timedelta64) and not self.time_origin:
            raise NotImplementedError("If fieldset.time_origin is not a date, time of a particle must be a double")
+


The error message in the line above is not true anymore, right? Particle time cannot be a double?

You're right. I'll get this on my next round of edits

Resolved in a55c852

No, now it doesn't support numpy.datetime64 or numpy.timedelta64 anymore. I fixed it in 68293ab

parcels/particleset.py

VeckoTheGecko

Ok, I have looked through in full now.

parcels/particleset.py

fluidnumerics-joe · 2025-07-01T14:26:21Z

Here's what I've gathered from the comments above

The endtime input must match the same type as the time dimension in the fields of the underlying fieldset. From what I can tell, the most straightforward way to get the type of the time dimension is to get the type of one of the time_interval endpoints.
We need to bring back runtime, which is required to be a timedelta object.
We need to enforce either endtime or runtime being set. I would argue that if there is a valid time_interval, we could default to using the valid time_interval.
All time handling should remain as datetime or timedelta objects; in either case, it should be consistent with the time dimension type coming from the underlying dataset.

Some open questions

What are the data types for the time dimension for idealized simulations that don't leverage a calendar system? Do they actually come in as timedelta or something else? I guess I'll find out later today with the MITgcm example I'll send across in Adding real-world circulation models datasets #2053.

This test is marked with xfail since the changes required to support particlefile will require a significant overhaul of the particlefile module which is out of scope for this PR

fluidnumerics-joe · 2025-07-01T15:23:11Z

I've added a test for using the outputfile, but this is currently failing with


tests/v4/test_particleset.py:87: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
parcels/particleset.py:648: in ParticleFile
    return ParticleFile(*args, particleset=self, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
parcels/particlefile.py:57: in __init__
    self._parcels_mesh = self.particleset.fieldset.gridset.grids[0].mesh
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <parcels.fieldset.FieldSet object at 0x7f4de06ffda0>, name = 'gridset'

    def __getattr__(self, name):
        """Get the field by name. If the field is not found, check if it's a constant."""
        if name in self.fields:
            return self.fields[name]
        elif name in self.constants:
            return self.constants[name]
        else:
>           raise AttributeError(f"FieldSet has no attribute '{name}'")
E           AttributeError: FieldSet has no attribute 'gridset'

parcels/fieldset.py:68: AttributeError
=========================================================================================================================== short test summary info ============================================================================================================================
FAILED tests/v4/test_particleset.py::test_uxstommelgyre_pset_execute_output - AttributeError: FieldSet has no attribute 'gridset'

There are some larger changes required here that I feel are getting out of scope for this PR.

fluidnumerics-joe · 2025-07-01T15:47:52Z

I believe I've addressed a good deal of the issues that were raised. To help keep this PR moving, if it looks like I've resolved your issue, please resolve the comment. Otherwise, I will assume it is not resolved and work on it in my next round of edits.

fluidnumerics-joe · 2025-07-01T16:08:24Z

@VeckoTheGecko - you may find this interesting... when MITgcm runs without a calendar, the time index refers to the iteration count in the model, which is stored as an int32 in the xarray object. See #2053 (comment)

Note that the file this comes from is produced using MITgcm's built-in NetCDF IO package; this is different from what is described in the XGCM documentation (e.g. here), where the NetCDF file referenced is created by converting MITgcm's native MDS format to NetCDF using xmitgcm. It appears we may need to coerce time to be a dimension and coordinate with knowledge of the parent simulation timestep, if reading NetCDF output straight from MITgcm.

erikvansebille · 2025-07-02T06:02:05Z

This PR is good to be merged, as far as I'm concerned

VeckoTheGecko · 2025-07-02T07:39:01Z

It appears we may need to coerce time to be a dimension and coordinate with knowledge of the parent simulation timestep, if reading NetCDF output straight from MITgcm.

I think we should just not support int time dimensions. If a user wants to use MITgcm runs without a calendar, they have to update the time dimension to timedelta or datetime/cftime on the xarray dataset level before passing to Parcels. We can have an explainable error message denoting the problem.
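As a rough illustration of the user-side conversion suggested here, an integer iteration count can be turned into a timedelta axis with NumPy before the data reaches Parcels. The helper name and the 1200-second timestep are assumptions for the example only; with xarray, the result would then be assigned back as the time coordinate (for instance via `assign_coords`).

```python
import numpy as np


def iterations_to_timedelta(iterations, timestep_seconds):
    """Convert model iteration counts to elapsed time (hypothetical helper)."""
    iterations = np.asarray(iterations)
    if not np.issubdtype(iterations.dtype, np.integer):
        raise TypeError(f"Expected an integer iteration count, got dtype {iterations.dtype}")
    # Widen first so large iteration counts do not overflow int32
    return iterations.astype("int64") * np.timedelta64(timestep_seconds, "s")


iters = np.array([0, 10, 20], dtype="int32")
time = iterations_to_timedelta(iters, timestep_seconds=1200)  # assumed model timestep
print(time.dtype)  # timedelta64[s]
```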

VeckoTheGecko · 2025-07-02T08:34:42Z

I've added a test for using the outputfile, but this is currently failing with
...
There are some larger changes required here that I feel are getting out of scope for this PR.

Agreed - let's xfail it and it can be handled later. Good to introduce the test here though.

VeckoTheGecko · 2025-07-02T09:45:12Z

To summarise:

Docstrings/param list/NotImplementedError Minimal v4 particleset #2060 (comment)
runtime/endtime (maybe just mark one as NotImplemented - or you can go with an implementation if it's straightforward) Minimal v4 particleset #2060 (comment)
xfail outputfile test Minimal v4 particleset #2060 (comment)

After that I think we're good to merge.

This reverts commit 1c1cbe3.

fluidnumerics-joe · 2025-07-03T01:14:35Z

I have a couple issues in the v4 tests that are failing and I'll take care of resolving these in the morning.

erikvansebille

While testing my own run on this branch, I found some small bugs that I've either fixed (see comments) or propose simple changes for below.

parcels/particleset.py

erikvansebille · 2025-07-03T11:15:37Z

parcels/particleset.py

            time = np.array([np.datetime64(t) for t in time])
        if time.size > 0 and isinstance(time[0], np.timedelta64) and not self.time_origin:
            raise NotImplementedError("If fieldset.time_origin is not a date, time of a particle must be a double")
+


No, now it doesn't support numpy.datetime64 or numpy.timedelta64 anymore. I fixed it in 68293ab

parcels/particleset.py

Co-authored-by: Erik van Sebille <e.vansebille@uu.nl>

…els into minimal-v4-particleset

fluidnumerics-joe · 2025-07-03T18:35:51Z

Well, this is an odd thing to error out on : https://github.com/OceanParcels/Parcels/actions/runs/16056110569/job/45310801757?pr=2060#step:4:779

VeckoTheGecko · 2025-07-04T09:10:07Z

Well, this is an odd thing to error out on : https://github.com/OceanParcels/Parcels/actions/runs/16056110569/job/45310801757?pr=2060#step:4:779

data looks to be a numpy array and not an xarray dataarray - hence doesn't have a .values attribute - could that be it?
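To illustrate the hypothesis only: a plain NumPy array has no `.values` attribute, whereas xarray-style containers do. The sketch below uses a hypothetical `FakeDataArray` stand-in rather than xarray or uxarray, and `as_numpy` is an illustrative helper, not something in the Parcels code.

```python
import numpy as np


class FakeDataArray:
    """Hypothetical stand-in for an xarray-style container exposing `.values`."""

    def __init__(self, values):
        self.values = np.asarray(values)


def as_numpy(data):
    """Return a numpy array for either a plain ndarray or a `.values` container."""
    return np.asarray(getattr(data, "values", data))


plain = np.array([1.0, 2.0])
wrapped = FakeDataArray([1.0, 2.0])

print(hasattr(plain, "values"))  # False
print(as_numpy(plain).tolist(), as_numpy(wrapped).tolist())  # [1.0, 2.0] [1.0, 2.0]
```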

fluidnumerics-joe · 2025-07-04T11:38:31Z

The field that is passed into the interpolator data attribute is a uxarray.dataarray. This hasn't changed at all in the generic dataset, and it's suddenly failing?

fluidnumerics-joe · 2025-07-04T11:44:46Z

Also, removing the .values fails other tests, since (again) field.data is a DataArray.

VeckoTheGecko · 2025-07-04T12:04:04Z

@fluidnumerics-joe the problem is here:

https://github.com/OceanParcels/Parcels/blob/d544d4c66d63287644efa83fe9fd6a16d63022d6/parcels/kernel.py#L320

This looks to have been introduced in 7e7391d to help with dask arrays (with fe26424 being introduced so that fieldset could be None to help with debugging and to fix #867). I think this block of code can be safely removed.

The kernel side of the code is something that we haven't touched yet... perhaps it's time we start looking at it somewhat.

EDIT: Correction

fluidnumerics-joe · 2025-07-05T14:14:02Z

Nice find. What's weird is this test was working just fine in 2de98a5

fluidnumerics-joe · 2025-07-07T13:03:07Z

@VeckoTheGecko - good find. That resolved the issues on the v4 tests. Updating this branch and will re-request review

erikvansebille

Happy to be merged so we can move forward!

fluidnumerics-joe added 4 commits June 30, 2025 10:57

[#2034] Remove dynamic pclass typing in particleset.init

d7ff31e

Signed-off-by: Joe Schoonover <joe@fluidnumerics.com>

[#2034] Remove repeatdt references

1c1cbe3

[#2034] Add test for particleset.execute on unstructure stommel gyre

f2531bf

Update notebook example for stommel gyre on unstructured grid

2de98a5

fluidnumerics-joe requested review from VeckoTheGecko and erikvansebille June 30, 2025 15:31

github-project-automation bot added this to Parcels development Jun 30, 2025

github-project-automation bot moved this to Backlog in Parcels development Jun 30, 2025

Reduce number of particles and simulation time for example

d3786da

VeckoTheGecko reviewed Jun 30, 2025


erikvansebille reviewed Jun 30, 2025


VeckoTheGecko reviewed Jul 1, 2025


VeckoTheGecko mentioned this pull request Jul 1, 2025

Update internal model to treat depth as positive upwards #2063

Open

fluidnumerics-joe added 3 commits July 1, 2025 10:58

Add runtime option to particleset execute

22cbf0f

Fix time update in loop

c6ea5c6

Add test with output; mark as xfail

87277c7

This test is marked with xfail since the changes required to support particlefile will require a significant overhaul of the particlefile module which is out of scope for this PR

VeckoTheGecko mentioned this pull request Jul 2, 2025

Explainable error message that Field time dimension must be timedelta/datetime #2064

Open

VeckoTheGecko mentioned this pull request Jul 2, 2025

ParticleSet execute loop #2065

Open

fluidnumerics-joe added 4 commits July 2, 2025 10:11

Revert "[#2034] Remove repeatdt references"

f979749

This reverts commit 1c1cbe3.

Add NotImplementedError for repeatdt

1734113

Fix error message; time must be datetime or date object

a55c852

Merge remote-tracking branch 'origin/v4-dev' into minimal-v4-particleset

4415a32

erikvansebille added 3 commits July 3, 2025 13:13

Fixing that fieldset.tinme_interval doesn't return a tuple

487a047

Support time to be numpy.datetime64 or numpu.timedelta64

68293ab

Using np.timedelta64 for internal time model

399ae10

erikvansebille reviewed Jul 3, 2025


VeckoTheGecko mentioned this pull request Jul 3, 2025

Remove postIterationCallbacks from ParticleSet.execute() #1911

Closed

1 task

fluidnumerics-joe and others added 5 commits July 3, 2025 12:23

Update calculation of ngrid

bb9a0c0

Co-authored-by: Erik van Sebille <e.vansebille@uu.nl>

Support time being None or timedelta object

9ebb44b

Merge branch 'minimal-v4-particleset' of github.com:OceanParcels/Parc…

6769567

…els into minimal-v4-particleset

Fix initialization for time is None

02e57bb

Set time to timedelta

0ef747e

Merge branch 'v4-dev' into minimal-v4-particleset

d544d4c

VeckoTheGecko mentioned this pull request Jul 4, 2025

Create an inventory of features to drop in v4 #1844

Open

24 tasks

fluidnumerics-joe and others added 2 commits July 7, 2025 08:07

Remove conversion of dataarray to numpy array

5dbe71d

Merge branch 'v4-dev' into minimal-v4-particleset

aac41e8

Merge branch 'v4-dev' into minimal-v4-particleset

e6067c9

fluidnumerics-joe requested review from VeckoTheGecko and erikvansebille July 7, 2025 14:18

erikvansebille approved these changes Jul 7, 2025


fluidnumerics-joe merged commit 7d2157c into v4-dev Jul 7, 2025
8 checks passed

fluidnumerics-joe deleted the minimal-v4-particleset branch July 7, 2025 14:28

github-project-automation bot moved this from Backlog to Done in Parcels development Jul 7, 2025
