Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GMTDataArrayAccessor: Fallback to default grid registration and gtype if the grid source file doesn't exist #2009

Merged
merged 15 commits into from
Feb 20, 2023

Conversation

seisman
Copy link
Member

@seisman seisman commented Jul 16, 2022

Description of proposed changes

If the netCDF source file doesn't exist, running grdinfo on the netCDF source file gives the following error:

grdinfo [ERROR]: Cannot find file /tmp/xxx/xxx.nc
grdinfo [ERROR]: Must specify one or more input files

The error is unfriendly and confusing. Actually, it makes no sense if the grid source file doesn't exist. In this case, it should fallback to registration=0 and gtype=0 unless we implement the idea in #2194 (comment).

Partially address #1984 and #524 (comment)

Please note that this PR doesn't solve the problems in #1984 and #524, because the registration and gtype information are incorrect, but at least it doesn't report confusing errors.

@seisman seisman added the bug Something isn't working label Jul 16, 2022
@seisman seisman added this to the 0.8.0 milestone Jul 16, 2022
@seisman seisman requested review from weiji14 and maxrjones July 16, 2022 08:46
@seisman
Copy link
Member Author

seisman commented Jul 25, 2022

Ping @maxrjones and @weiji14 for reviewing.

pygmt/accessors.py Outdated Show resolved Hide resolved
Co-authored-by: Wei Ji <23487320+weiji14@users.noreply.github.com>
@seisman
Copy link
Member Author

seisman commented Jul 26, 2022

I've added a unit test in 37869d4, and the test helps find a new bug.

At the end of the test, I manually set the registration and gtype information and expect the two assert statements to pass, but they don't. So there must be another bug.

    # manually set the registration and gtype
    dataset.height.gmt.registration = 1
    dataset.height.gmt.gtype = 1
    # the registration and gtype should be correct now.
    assert dataset.height.gmt.registration == 1
    assert dataset.height.gmt.gtype == 1

@seisman seisman changed the title Let GMTDataArrayAccessor check if the grid source file exists WIP: Let GMTDataArrayAccessor check if the grid source file exists Jul 27, 2022
@seisman
Copy link
Member Author

seisman commented Jul 27, 2022

I've added a unit test in 37869d4, and the test helps find a new bug.

At the end of the test, I manually set the registration and gtype information and expect the two assert statements to pass, but they don't. So there must be another bug.

    # manually set the registration and gtype
    dataset.height.gmt.registration = 1
    dataset.height.gmt.gtype = 1
    # the registration and gtype should be correct now.
    assert dataset.height.gmt.registration == 1
    assert dataset.height.gmt.gtype == 1

The documentation of xarray accessor explains why they don't work as expected:

Accessors are created once per DataArray and Dataset instance. New instances, like those created from arithmetic operations or when accessing a DataArray from a Dataset (ex. ds[var_name]), will have new accessors created.

So every time dataset.height.gmt is used, the __init__ function is called. It's the limitation of the xarray accessors and I think we can do nothing here.

@seisman seisman changed the title WIP: Let GMTDataArrayAccessor check if the grid source file exists Let GMTDataArrayAccessor check if the grid source file exists Jul 27, 2022
@seisman seisman force-pushed the check-dataarray-source branch from c769028 to 7b0520a Compare July 27, 2022 09:56
@weiji14
Copy link
Member

weiji14 commented Jul 27, 2022

The documentation of xarray accessor explains why they don't work as expected:

Accessors are created once per DataArray and Dataset instance. New instances, like those created from arithmetic operations or when accessing a DataArray from a Dataset (ex. ds[var_name]), will have new accessors created.

So every time dataset.height.gmt is used, the __init__ function is called. It's the limitation of the xarray accessors and I think we can do nothing here.

The accessor dataset.height.gmt is created once, but the variable or property should be able to be modified using a setter method. Still think this is an issue, but need time to look into it. Don't we have some tests that actually modify the accessor values?

@seisman seisman modified the milestones: 0.8.0, 0.9.0 Dec 11, 2022
@seisman seisman changed the title Let GMTDataArrayAccessor check if the grid source file exists GMTDataArrayAccessor: Fallback to default grid registration and gtype if the grid source file doesn't exist Feb 16, 2023
@seisman
Copy link
Member Author

seisman commented Feb 16, 2023

I've added a unit test in 37869d4, and the test helps find a new bug.
At the end of the test, I manually set the registration and gtype information and expect the two assert statements to pass, but they don't. So there must be another bug.

    # manually set the registration and gtype
    dataset.height.gmt.registration = 1
    dataset.height.gmt.gtype = 1
    # the registration and gtype should be correct now.
    assert dataset.height.gmt.registration == 1
    assert dataset.height.gmt.gtype == 1

The documentation of xarray accessor explains why they don't work as expected:

Accessors are created once per DataArray and Dataset instance. New instances, like those created from arithmetic operations or when accessing a DataArray from a Dataset (ex. ds[var_name]), will have new accessors created.

So every time dataset.height.gmt is used, the __init__ function is called. It's the limitation of the xarray accessors and I think we can do nothing here.

The documentation of xarray accessor explains why they don't work as expected:

Accessors are created once per DataArray and Dataset instance. New instances, like those created from arithmetic operations or when accessing a DataArray from a Dataset (ex. ds[var_name]), will have new accessors created.

So every time dataset.height.gmt is used, the __init__ function is called. It's the limitation of the xarray accessors and I think we can do nothing here.

The accessor dataset.height.gmt is created once, but the variable or property should be able to be modified using a setter method. Still think this is an issue, but need time to look into it. Don't we have some tests that actually modify the accessor values?

OK. I've removed this test and added a simpler test instead. The old complicated test was moved to #1984 so that we can work on this later.

@seisman seisman requested a review from weiji14 February 16, 2023 14:51
@seisman seisman added the needs review This PR has higher priority and needs review. label Feb 16, 2023
Comment on lines 32 to 45
try:
self._source = self._obj.encoding["source"] # filepath to NetCDF source
if not Path(self._source).exists():
raise FileNotFoundError(
f"Grid source file {self._source} doesn't exist."
)
# Get grid registration and grid type from the last two columns of
# the shortened summary information of `grdinfo`.
self._registration, self._gtype = map(
int, grdinfo(self._source, per_column="n").split()[-2:]
)
except (KeyError, ValueError):
except (KeyError, ValueError, FileNotFoundError):
self._registration = 0 # Default to Gridline registration
self._gtype = 0 # Default to Cartesian grid type
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should change these lines to the following codes, which I think are more readable:

        self._source = self._obj.encoding.get("source")
        if self._source is not None and Path(self._source).exists():
            try:
                # Get grid registration and grid type from the last two columns of
                # the shortened summary information of `grdinfo`.
                self._registration, self._gtype = map(
                    int, grdinfo(self._source, per_column="n").split()[-2:]
                )
            except ValueError:
                self._registration = 0
                self._gtype = 0
        else:
            self._registration = 0
            self._gtype = 0

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we will silently fallback to registration=0 and gtype=0 if grdinfo cannot determine the grid registration and type? Ok I guess, but please add the code comments back 🙂

@seisman
Copy link
Member Author

seisman commented Feb 20, 2023

Ping @GenericMappingTools/pygmt-maintainers for review.

pygmt/tests/test_accessor.py Outdated Show resolved Hide resolved
pygmt/tests/test_accessor.py Outdated Show resolved Hide resolved
seisman and others added 2 commits February 20, 2023 13:01
Copy link
Member

@weiji14 weiji14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, 2 more typos, otherwise ok.

pygmt/tests/test_accessor.py Outdated Show resolved Hide resolved
pygmt/tests/test_accessor.py Outdated Show resolved Hide resolved
seisman and others added 2 commits February 20, 2023 13:08
Co-authored-by: Wei Ji <23487320+weiji14@users.noreply.github.com>
@seisman seisman added final review call This PR requires final review and approval from a second reviewer and removed needs review This PR has higher priority and needs review. labels Feb 20, 2023
Copy link
Member

@michaelgrund michaelgrund left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only minor consistency changes, otherwise looks good!

pygmt/tests/test_accessor.py Outdated Show resolved Hide resolved
pygmt/tests/test_accessor.py Outdated Show resolved Hide resolved
pygmt/tests/test_accessor.py Outdated Show resolved Hide resolved
Co-authored-by: Michael Grund <23025878+michaelgrund@users.noreply.github.com>
@seisman seisman merged commit ad0d6ad into main Feb 20, 2023
@seisman seisman deleted the check-dataarray-source branch February 20, 2023 12:15
@seisman seisman removed the final review call This PR requires final review and approval from a second reviewer label Feb 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants