Support various datetime types as input #464

Merged
merged 18 commits into from
Jul 13, 2020

@seisman seisman commented May 30, 2020

Description of proposed changes

This PR allows PyGMT to accept vectors of various datetime types so that PyGMT can plot datetime axes.

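Vectors of the following datetime types are supported (as exercised in the example script below):

  • numpy.datetime64
  • pandas.DatetimeIndex
  • xarray.DataArray
  • raw datetime strings
  • the Python built-in datetime.datetime and datetime.date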

TODO:

  • Wait for the GMT 6.1.0 release
  • Add more tests
  • Add a tutorial for calendar axis
  • Test for xarray.DataArray

Known issues that won't be fixed in this PR:

  • The region argument doesn't work with numpy datetime64 objects; they have to be converted to strings using np.datetime_as_string in order to set the map frame bounds.
  • timedelta64 is not supported yet, e.g. GPS time from a certain epoch.

Fixes #242.

Example script:

import datetime
import pygmt
import numpy as np
import pandas as pd
import xarray as xr  # needed for the xarray.DataArray example below

fig = pygmt.Figure()
fig.basemap(projection="X15c/5c", region="2010-01-01/2020-01-01/0/10", frame=True)

# numpy.datetime64 types
x = np.array(
    ["2010-06-01", "2011-06-01T12", "2012-01-01T12:34:56"], dtype="datetime64"
)
y = [1.0, 2.0, 3.0]
fig.plot(x, y, style="c0.2c", pen="1p")

# pandas.DatetimeIndex
x = pd.date_range("2013", freq="YS", periods=3)
y = [4, 5, 6]
fig.plot(x, y, style="t0.2c", pen="1p")

# xarray.DataArray
x = xr.DataArray(data=pd.date_range(start="2015-03", freq="QS", periods=3))
y = [7.5, 6, 4.5]
fig.plot(x, y, style="s0.2c", pen="1p")

# raw datetime strings
x = ["2016-02-01", "2017-03-04T00:00"]
y = [7, 8]
fig.plot(x, y, style="a0.2c", pen="1p")

# the Python built-in datetime and date
x = [datetime.date(2018, 1, 1), datetime.datetime(2019, 1, 1)]
y = [8.5, 9.5]
fig.plot(x, y, style="i0.2c", pen="1p")

fig.savefig("datetime.pdf")

Output:

test_plot_datetime

Reminders

  • Run make format and make check to make sure the code follows the style guide.
  • Add tests for new features or tests that would have caught the bug that you're fixing.
  • Add new public functions/methods/classes to doc/api/index.rst.
  • Write detailed docstrings for all functions/methods.
  • If adding new functionality, add an example to docstrings or tutorials.

seisman commented May 30, 2020

@GenericMappingTools/python @GenericMappingTools/python-contributors This PR is almost done. If you have the latest GMT master branch installed, please try this branch and leave your comments.

@seisman seisman added the feature Brand new feature label May 30, 2020
@seisman seisman added this to the 0.2.x milestone May 30, 2020

weiji14 commented May 31, 2020

Some instructions on how to test this branch:

curl https://raw.githubusercontent.com/GenericMappingTools/gmt/master/ci/build-gmt-master.sh | bash
conda activate pygmt-test  # or some virtual env you have
pip install https://github.com/GenericMappingTools/pygmt/archive/datetime-input.zip

and then in python:

import os

os.environ["GMT_LIBRARY_PATH"] = "/home/username/gmt-install-dir/lib"

import pygmt

pygmt.print_clib_info()  # check that you're using GMT from master

and then try to test out plot with datetimes! I'll have a go at this in a bit with some 'real' data.

weiji14 commented Jun 1, 2020

Ok, I tried installing it in a different conda environment and it works now! Below is my convoluted real-data example from Antarctica (longitude=46.522298201524514, latitude=-73.05774246536473).

height_over_time

Code:

# %%
import os

os.environ["GMT_LIBRARY_PATH"] = "/home/username/gmt-install-dir/lib"

import numpy as np
import pygmt

pygmt.show_versions()

# %%
x = np.array(
    [
        np.datetime64(dt)
        for dt in [
            "NaT",
            "2019-01-28T07:51:16.582496785",
            "2019-04-29T03:30:57.620034382",
            "2019-07-28T23:10:36.096639410",
            "2019-10-27T18:50:32.615134843",
            "2020-01-26T14:30:18.835180975",
        ]
    ]
)
y = np.array(
    # np.nan instead of np.NaN: the capitalized alias was removed in NumPy 2.0
    [np.nan, 3126.60298909, 3126.67885045, 3126.67276984, 3126.7217197, 3126.68279204]
)

# %%
# Mask out NaT values
mask = ~np.isnan(y)
x = x[mask]
y = y[mask]

# %%
dates = np.datetime_as_string(x, unit="ns")
region = [dates[0], dates[-1], np.nanmin(y) - 0.1, np.nanmax(y) + 0.1]

# %%
fig = pygmt.Figure()
fig.basemap(
    projection="X15c/5c",
    region=region,
    frame=["WSne", "xaf+lDateTime", "yaf+lHeight(m)"],
)
fig.plot(x=x, y=y, style="t1c", pen="1p")
fig.savefig("height_over_time.png")
fig.show()

Glad to see that these work off the shelf:

  • Passing in xarray.DataArrays to plot works (thanks to NEP18's __array__ functionality).
  • ISO datetimes down to nanoseconds are supported by GMT!

Places to improve:

  • The region argument doesn't work with numpy datetime64 objects; they have to be converted to strings using np.datetime_as_string in order to set the map frame bounds.
  • Having even just one NaT value will raise an error, as mentioned in #242 (comment). Ideally this should be a warning, but it's an upstream issue. It can be tested by commenting out the 'mask' code above.
  • Support timedelta64, e.g. GPS time from a certain epoch. More of a nice-to-have really, could be done in a separate PR once this datetime one is merged in.
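The masking step used in the example above keys off NaN values in y. A minimal sketch of masking on the datetime column itself, assuming only NumPy (the variable names are illustrative and not part of the PR):

```python
import numpy as np

# Small made-up sample: the first record has a missing time and height.
x = np.array(
    ["NaT", "2019-01-28T07:51:16", "2019-04-29T03:30:57"], dtype="datetime64[s]"
)
y = np.array([np.nan, 3126.60, 3126.68])

# np.isnat flags missing datetimes; np.isnan flags missing floats.
keep = ~np.isnat(x) & ~np.isnan(y)
x_clean, y_clean = x[keep], y[keep]

print(x_clean)
print(y_clean)
```

Filtering on both columns avoids relying on NaT and NaN always occurring in the same rows.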

seisman commented Jun 1, 2020

  • Passing in xarray.DataArrays to plot works (thanks to NEP18's __array__ functionality).

I'll add it to the tests.

  • ISO datetimes down to nanoseconds are supported by GMT!

Perhaps not true. GMT converts ISO datetimes to double values internally. 2019-01-28 07:51:16.582496785 is stored as the double value 1548661876.582496881484985 (printed with %30.15lf). So GMT only supports precision down to roughly a microsecond, as limited by the precision of the double type. Maybe I'm wrong here.
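The precision limit described here can be checked without GMT. The sketch below uses only the Python standard library and assumes, as stated above, that the timestamp is held as a double count of seconds since the Unix epoch; it says nothing about GMT's internal code paths.

```python
import math

# Epoch seconds for 2019-01-28T07:51:16.582496785, written with nanosecond digits.
seconds = 1548661876.582496785

# Gap between adjacent double values at this magnitude (requires Python 3.9+).
spacing = math.ulp(seconds)
print(f"stored value: {seconds:.15f}")
print(f"double spacing: {spacing:.3e} s")

# Timestamps 100 ns apart collapse onto the same double ...
same_at_100ns = 1548661876.582496785 == 1548661876.582496885
# ... while timestamps 1 microsecond apart remain distinct.
differ_at_1us = 1548661876.582496785 != 1548661876.582497785
print(same_at_100ns, differ_at_1us)
```

The spacing at this magnitude is 2**-22 seconds, a little under a quarter of a microsecond, which is consistent with the estimate that sub-microsecond digits are not preserved.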

  • The region argument doesn't work with numpy datetime64 objects, have to convert to a string using np.datetime_as_string in order to set the map frame bounds.

The decorator kwargs_to_strings converts the list [w, e, s, n] to a string w/e/s/n. We need to convert datetime64 objects to strings internally, but it's not ideal to do the conversion in the decorator. I don't have a good solution for now.
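One way the conversion could look on the user side is sketched below. The helper name region_to_string is hypothetical and is not part of PyGMT; it only illustrates joining [w, e, s, n] into w/e/s/n while turning datetime64 bounds into ISO strings with np.datetime_as_string.

```python
import numpy as np


def region_to_string(region):
    """Hypothetical helper: join [w, e, s, n] into 'w/e/s/n'.

    numpy.datetime64 bounds are converted to ISO strings first;
    all other bounds go through str() unchanged.
    """
    parts = []
    for bound in region:
        if isinstance(bound, np.datetime64):
            parts.append(str(np.datetime_as_string(bound)))
        else:
            parts.append(str(bound))
    return "/".join(parts)


region = [np.datetime64("2010-01-01"), np.datetime64("2020-01-01"), 0, 10]
region_string = region_to_string(region)
print(region_string)
```

The resulting string has the same form as the region passed to fig.basemap in the example script of this PR.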

  • Having even just one NaT value will raise an error, as mentioned in #242 (comment), should be a warning ideally, but it's an upstream issue. Can be tested by commenting out the 'mask' code above.

Just opened an issue (GenericMappingTools/gmt#3414) in the upstream repository.

  • Support timedelta64, e.g. GPS time from a certain epoch. More of a nice-to-have really, could be done in a separate PR once this datetime one is merged in.

Perhaps we need to convert timedelta64 to double values before passing to GMT. Leave it for a separate PR.
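A minimal sketch of such a conversion, assuming NumPy only and using the GPS epoch (1980-01-06) as the reference; the variable names are illustrative, and leap seconds are ignored:

```python
import numpy as np

# Reference epoch for GPS time.
gps_epoch = np.datetime64("1980-01-06T00:00:00")

times = np.array(
    ["1980-01-06T00:00:10", "1980-01-06T00:01:10"], dtype="datetime64[s]"
)

# Subtracting the epoch gives timedelta64 values ...
elapsed = times - gps_epoch
# ... and dividing by a one-second timedelta gives plain float64 seconds.
elapsed_seconds = elapsed / np.timedelta64(1, "s")

print(elapsed.dtype, elapsed_seconds.dtype)
print(elapsed_seconds)
```

The float64 array could then be passed to plotting functions like any other numeric vector.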

weiji14 commented Jun 1, 2020

  • ISO datetimes down to nanoseconds are supported by GMT!

Perhaps not true. GMT converts ISO datetimes to double values internally. 2019-01-28 07:51:16.582496785 is stored as the double value 1548661876.582496881484985 (printed with %30.15lf). So GMT only supports precision down to roughly a microsecond, as limited by the precision of the double type. Maybe I'm wrong here.

Ideally there would be a warning, but I'm not sure if it matters. The data will be plotted in pretty much the same position anyway (unless someone looks at changes every microsecond?) so it's low priority.

Just opened an issue (GenericMappingTools/gmt#3414) in the upstream repository.

Ok, I've tested the 'new' GMT master with GenericMappingTools/gmt#3415 merged in, and it's giving a warning (as expected), but the plot shows up now 🙌

height over time, handling NaTs

Code (without mask!):

import os

os.environ["GMT_LIBRARY_PATH"] = os.path.join(os.environ["HOME"], "gmt-install-dir/lib")

import numpy as np
import pygmt

pygmt.show_versions()

# %%
x = np.array(
    [
        np.datetime64(dt)
        for dt in [
            "NaT",
            "2019-01-28T07:51:16.582496785",
            "2019-04-29T03:30:57.620034382",
            "2019-07-28T23:10:36.096639410",
            "2019-10-27T18:50:32.615134843",
            "2020-01-26T14:30:18.835180975",
        ]
    ]
)
y = np.array(
    # np.nan instead of np.NaN: the capitalized alias was removed in NumPy 2.0
    [np.nan, 3126.60298909, 3126.67885045, 3126.67276984, 3126.7217197, 3126.68279204]
)

# %%
dates = np.datetime_as_string(x, unit="ns")
region = [dates[1], dates[-1], np.nanmin(y) - 0.1, np.nanmax(y) + 0.1]

# %%
fig = pygmt.Figure()
fig.basemap(
    projection="X15c/5c",
    region=region,
    frame=["WSne", "xaf+lDateTime", "yaf+lHeight(m)"],
)
fig.plot(x=x, y=y, style="t1c", pen="1p")
fig.savefig("height_over_time.png")
fig.show()

…andas.DateTimeIndex and numpy.datetime64 types

seisman commented Jun 4, 2020

Passing in xarray.DataArrays to plot works (thanks to NEP18's __array__ functionality).

@weiji14 Could you help provide a simple example for this?

And what's the best place to add some examples for calendar plots?

@seisman seisman changed the title WIP: Support various datetime types as input Support various datetime types as input Jul 12, 2020

@weiji14 weiji14 left a comment

This seems about ready, and it would be nice to merge this before we start working on #520 properly to avoid the conflicts. I'll help to fix the tests in a few minutes, seems to be just whitespace and a wrong error captured.

np.int32: "GMT_INT",
np.uint64: "GMT_ULONG",
np.uint32: "GMT_UINT",
np.datetime64: "GMT_DATETIME",

Quick question, is there a GMT dtype for string types? I can't seem to find one at https://docs.generic-mapping-tools.org/6.1/api.html#gmt-c-api.

There is a GMT_TEXT, but it may not be related to the GMT_Put_Strings function.
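To illustrate how a mapping like the one in the diff excerpt above can be used, here is a small sketch. The dictionary name GMT_TYPE_NAMES and the function gmt_type_name are hypothetical, and only the four entries visible in the excerpt are reproduced.

```python
import numpy as np

# Hypothetical name; entries copied from the diff excerpt in this review.
GMT_TYPE_NAMES = {
    np.int32: "GMT_INT",
    np.uint64: "GMT_ULONG",
    np.uint32: "GMT_UINT",
    np.datetime64: "GMT_DATETIME",
}


def gmt_type_name(array):
    """Return the GMT type name for a NumPy array, or raise if unmapped."""
    scalar_type = array.dtype.type  # e.g. np.datetime64 for any datetime64 unit
    if scalar_type not in GMT_TYPE_NAMES:
        raise ValueError(f"Unsupported dtype: {array.dtype}")
    return GMT_TYPE_NAMES[scalar_type]


dates = np.array(["2010-06-01", "2011-06-01T12"], dtype="datetime64")
counts = np.array([1, 2, 3], dtype=np.int32)
print(gmt_type_name(dates), gmt_type_name(counts))
```

Keying on dtype.type means every datetime64 resolution (days, seconds, nanoseconds) resolves to the same entry.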

seisman commented Jul 13, 2020

I'll help to fix the tests in a few minutes, seems to be just whitespace and a wrong error captured.

Thanks for helping to improve this PR. There are still some things to do (e.g., add examples, pass a datetime region as a list), but we can merge this PR and open issues for the missing features.

@weiji14 weiji14 left a comment

There are still some things to do (e.g., add examples, pass a datetime region as a list), but we can merge this PR and open issues for the missing features.

Yep, we can add those tutorials and enhancements later 🚀

@weiji14 weiji14 marked this pull request as ready for review July 13, 2020 23:22
@weiji14 weiji14 merged commit f401f85 into master Jul 13, 2020
@weiji14 weiji14 deleted the datetime-input branch July 13, 2020 23:51
weiji14 added a commit that referenced this pull request Sep 21, 2020
….DataFrame tables (#619)

Changes the backend mechanism of `info`
from using lib.virtualfile_from_matrix()
(which only supports single non-datetime dtypes)
to using lib.virtualfile_from_vectors()
(which supports datetime inputs as of #464).

* Refactor info to use virtualfile_from_vectors to support datetime inputs
* Test that xarray.Dataset inputs into pygmt.info works too
* Expect failures on test_info_*_time_column on GMT 6.1.1
* Document xarray.Datasets with 1D data_vars as allowed inputs to info

Co-authored-by: Dongdong Tian <seisman.info@gmail.com>
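
The single-dtype limitation mentioned in this commit message has a NumPy-side analogue, sketched below. This is an illustration of why per-column vectors suit mixed datetime and float tables, not a description of PyGMT's internals; all names are illustrative.

```python
import numpy as np

x = np.array(["2019-01-28", "2019-04-29"], dtype="datetime64[D]")
y = np.array([3126.6, 3126.7])

# A 2-D matrix needs one common dtype, and NumPy defines none
# for datetime64 combined with float64.
try:
    np.result_type(x.dtype, y.dtype)
    has_common_dtype = True
except TypeError:
    has_common_dtype = False

# Kept as separate vectors, each column retains its own dtype.
vectors = [x, y]
print(has_common_dtype, [v.dtype for v in vectors])
```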
michaelgrund added a commit to michaelgrund/pygmt that referenced this pull request Jan 1, 2021
Based on GenericMappingTools#464 and GenericMappingTools#549 I prepared a gallery example for plotting datetime inputs.
@weiji14 weiji14 mentioned this pull request Dec 16, 2023
9 tasks
Labels
feature Brand new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support datetime data types as input
3 participants