Slicing and Sampling in Time-Dimensions #886

Closed
mrksr opened this issue Sep 26, 2016 · 5 comments · Fixed by #3197

Comments

@mrksr
Contributor

mrksr commented Sep 26, 2016

Problem statement

It is quite common for scientific data to consist of series for which one key dimension is time. To represent specific points in time, numpy supports a dtype called datetime64. Holoviews can handle datasets which contain time information. However, slicing and sampling in these dimensions is tricky. Consider the following dataset, which contains one sample for every fifteen minutes of a day.

import numpy as np
import holoviews as hv
import pandas as pd

df = pd.DataFrame(
    {"noise": np.random.normal(size=(24*4,))},
    index=pd.date_range('2016-09-26', periods=24*4, freq='15Min')
)
df.index.name = "time"

hvset = hv.Dataset(df.reset_index(), kdims=["time"])

Using the pandas dataframe directly, time ranges can be selected using normal slicing.

df["2016-09-26 16:00":"2016-09-26 18:00"]

This requires some code specific to time stamps which converts the date strings to the correct data type.
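To illustrate what pandas does with the date strings, here is a minimal sketch comparing the string slice with an explicit pd.Timestamp slice. It rebuilds the example dataframe so that it runs on its own; the names by_string and by_timestamp are only illustrative.

```python
import numpy as np
import pandas as pd

# Same layout as the example above: one sample every fifteen minutes of a day.
df = pd.DataFrame(
    {"noise": np.random.normal(size=(24 * 4,))},
    index=pd.date_range("2016-09-26", periods=24 * 4, freq="15min"),
)

# Slicing with plain strings: pandas parses them into timestamps internally.
by_string = df.loc["2016-09-26 16:00":"2016-09-26 18:00"]

# Slicing with explicitly constructed timestamps selects the same rows.
by_timestamp = df.loc[
    pd.Timestamp("2016-09-26 16:00"):pd.Timestamp("2016-09-26 18:00")
]

print(by_string.equals(by_timestamp))
```

Both endpoints are included in the result, so the selection covers 16:00 through 18:00.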

Slicing in Holoviews

Given two numpy.datetime64-objects date_from and date_to, slicing of the Holoviews dataset works as usual.

sliced = hvset.select(time=(date_from, date_to))

There are multiple ways of creating such objects, the most convenient one being the pandas API.

Slicing using pandas

The pandas timeseries API offers the pd.Timestamp-object, which can be constructed from different time descriptions such as strings or datetime-objects from the standard library. Timestamps can then be converted to datetime64.

date_from = pd.Timestamp("2016-09-26 16:00").to_datetime64()
date_to = pd.Timestamp("2016-09-26 18:00").to_datetime64()
hv.Curve(hvset.select(time=(date_from, date_to)))

However, using Timestamp introduces a dependency on pandas.

Slicing using numpy

datetime64-objects can be constructed using numpy directly.

import datetime

date_from = np.datetime64("2016-09-26T16:00")
date_to = np.datetime64(datetime.datetime(2016, 9, 26, 18, 0))
hv.Curve(hvset.select(time=(date_from, date_to)))

The constructor however seems to be limited to ISO-8601 datestrings and datetime objects. The latter could be used to allow for more liberal date descriptions using strptime. It is worth noting that while the datetime64-objects of numpy have nanosecond accuracy, to my understanding, the datetime-objects from the standard library only have microsecond accuracy.
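As a rough sketch of the strptime route, a non-ISO date string can be parsed with the standard library first and then handed to numpy. The date format used here is just an example and not something Holoviews prescribes.

```python
import datetime

import numpy as np

# Parse a non-ISO date description with an explicit format string.
parsed = datetime.datetime.strptime("26.09.2016 18:00", "%d.%m.%Y %H:%M")

# numpy accepts the standard-library datetime object directly.
date_to = np.datetime64(parsed)

# The resulting unit reflects the precision of the datetime object.
print(date_to, date_to.dtype)
```

The value compares equal to np.datetime64("2016-09-26T18:00"), so it can be used in a selection in the same way as the objects constructed above.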

Proposed changes

Since datetime has a nice string representation, output of time dimensions works nicely out of the box. Things only become tricky when specifying a specific time to do sampling and slicing in a holoviews dataset. Ideally, the syntax would be similar to the way pandas handles things in the sense that everywhere I would need to specify a datetime64-object, I can also give some representation of a time stamp which can be parsed to a datetime64-object. So instead of

date_from = pd.Timestamp("2016-09-26 16:00").to_datetime64()
date_to = pd.Timestamp("2016-09-26 18:00").to_datetime64()
hvset.select(time=(date_from, date_to))

I would like to be able to just specify

hvset.select(time=("2016-09-26 16:00", "2016-09-26 18:00"))
@philippjfr
Member

philippjfr commented Sep 26, 2016

I believe there are three places we'd need the implementation:

  1. For the column based data interfaces the implementation would live on: holoviews.core.data.interface.Interface.select_mask

  2. For grid based data interfaces we'd need an implementation in holoviews.core.data.grid.GridInterface.key_select_mask

  3. For NdMapping types (i.e. HoloMap, GridSpace, NdOverlay and NdLayout) we'd need an implementation in holoviews.core.ndmapping.NdMapping._generate_conditions.

All three could probably share one utility however.
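A minimal sketch of what such a shared utility might look like, assuming pandas is available for the string conversion. The function name parse_datetime_selection is hypothetical and not part of Holoviews; the three methods listed above would each have to call something like it before building their masks or conditions.

```python
import numpy as np
import pandas as pd


def parse_datetime_selection(selection):
    """Hypothetical helper: convert date strings in a selection to datetime64.

    Handles scalars, tuples and slices; anything that is not a string
    (including None for open-ended ranges) is passed through unchanged.
    """
    if isinstance(selection, str):
        return pd.Timestamp(selection).to_datetime64()
    if isinstance(selection, tuple):
        return tuple(parse_datetime_selection(s) for s in selection)
    if isinstance(selection, slice):
        return slice(
            parse_datetime_selection(selection.start),
            parse_datetime_selection(selection.stop),
            selection.step,
        )
    return selection


converted = parse_datetime_selection(("2016-09-26 16:00", "2016-09-26 18:00"))
open_ended = parse_datetime_selection(slice("2016-09-26 16:00", None))
print(converted, open_ended)
```

This sketch converts every string it sees, so it would only be safe to call for dimensions that are already known to hold time values.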

@jlstevens
Contributor

jlstevens commented Sep 26, 2016

@mrksr Thanks for the nicely laid out proposal - it looks good to me!

A few quick comments:

  • Slicing on anything that supports an ordering should work, but how often is this useful when working with strings in practice?
  • Would it be worth having some sort of 'time' type to specify on the dimension to enable this slicing behavior?
  • Or would we try to detect when strings look like timestamps - or is this too much magic?

I also agree with Philipp that if this functionality can be encapsulated within a single utility, we should do that.

@philippjfr philippjfr added this to the v1.7.0 milestone Oct 25, 2016
@philippjfr philippjfr modified the milestones: v2.0, v1.7.0 Mar 15, 2017
@philippjfr
Member

Or would we try to detect when strings look like timestamps - or is this too much magic?

We can check whether the data type is of type hv.util.datetime_type and upcast strings in that case. Implementing this actually seems rather straightforward if we rely on pandas to do the string->datetime conversion. I think we already decided to recommend pandas more generally, so I'd be okay with supporting this feature only when pandas is installed. If others agree I'd suggest we move this forward to the v1.10 milestone.
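Roughly, the check could look like the sketch below. It uses the numpy dtype kind "M" as a stand-in for the hv.util.datetime_type check, which is an assumption, and upcast_if_datetime is a hypothetical name. The half-open interval in the mask is chosen only for illustration.

```python
import numpy as np
import pandas as pd


def upcast_if_datetime(values, bound):
    """Hypothetical helper: upcast a string bound only for datetime data."""
    is_datetime = np.asarray(values).dtype.kind == "M"  # stand-in type check
    if is_datetime and isinstance(bound, str):
        return pd.Timestamp(bound).to_datetime64()
    return bound


# Time values as they might be stored in a column of the dataset.
times = pd.date_range("2016-09-26", periods=24 * 4, freq="15min").values

start = upcast_if_datetime(times, "2016-09-26 16:00")
stop = upcast_if_datetime(times, "2016-09-26 18:00")

# Boolean mask over the column, start inclusive and stop exclusive.
mask = (times >= start) & (times < stop)
print(mask.sum())

# Strings selecting on non-datetime data are left untouched.
labels = np.array(["a", "b", "c"])
print(upcast_if_datetime(labels, "b"))
```

Gating the conversion on the data type avoids having to guess whether a string looks like a timestamp.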

@jlstevens
Contributor

I agree we are relying on pandas more and more so that proposal seems reasonable to me, especially if pandas can do most of the hard work for us.

@philippjfr philippjfr modified the milestones: v2.0, v1.11 Mar 19, 2018
@philippjfr philippjfr modified the milestones: v1.11.0, v1.11.x Dec 27, 2018
@philippjfr philippjfr modified the milestones: v1.11.x, v1.12.0 Mar 22, 2019
@philippjfr philippjfr modified the milestones: v1.12.0, v1.12.x Apr 22, 2019
@philippjfr philippjfr modified the milestones: v1.12.x, v1.13.0 Sep 22, 2019

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 24, 2024