Time zones / Time references #47

Closed
bmu opened this issue Apr 4, 2015 · 30 comments · Fixed by #135

Comments

@bmu
Contributor

bmu commented Apr 4, 2015

I think we should make sure that time zones and other time references are used correctly.
Time stamps can be given as UTC, tied to a real time zone (including DST), as local time without DST, as mean solar time, as true solar time, and so on. In my experience this is an everyday problem, so we should be able to handle all of these formats, at least in the future.
For now it could be OK to define that all times must be given as UTC and leave all time conversion issues to the user.

pandas has supported Julian dates since 0.14 (this is the pull request), so I was thinking about creating a pull request to implement to_solar_time in pandas. However, this is not easy to implement and I am not sure they would accept it. But this way all time conversion would live in pandas, and we could build things on top of pandas.

@bmu bmu added this to the 0.2 milestone Apr 4, 2015
@wholmgren
Member

I'm all for being helpful with timezones. This was my motivation for adding a tz attribute to the Location class. The solarposition module (with some functions in tools) has an attempt at using tz knowledge from either the DatetimeIndex or the Location object. Do these fit into your scope or do you want to do something different? I did think that it would be nice if the Location constructor could use a web service to determine an IANA timezone from a lat/lon. (Or maybe even a lat/lon/tz from a city/state/country search, but that's another topic.)

Guaranteeing that time zones are handled correctly is hard unless you're very restrictive about it. My inclination is to leave the vast majority of the work to pandas. All localized times in pandas are stored as UTC internally, so we can and should take advantage of that (as the description above does).

We could require that all times be localized, or we could throw warnings whenever we detect that people are using non-localized times.

Concerning solar time... I just rely on the zenith and azimuth calculations for a given lat/lon. Is there an application for solar time other than for the actual calculation of those quantities of interest? I don't know how pandas would reasonably incorporate something that requires a specific lat/lon. Maybe we're talking about two different quantities? I think of solar time as the local time shifted to make the minimum zenith occur at 12:00 noon.

@bmu
Contributor Author

bmu commented Apr 6, 2015

I mean true or apparent solar time, so I think we talk about the same quantities. It is often useful to plot some quantities against this time e.g. to detect the azimuth angle of a system or to detect misalignments between irradiance sensor and modules, ...
So I think it would be useful.

I would also like to rely on pandas for time conversion, as it is the best Python implementation I am aware of. However, it may be difficult to detect the exact format that a user is passing to a function.
So perhaps we should extend the tz keyword (or whatever its name should be) to accept not only real time zone names, but also something like utc+8 for a time format without daylight saving time, tst for true solar time, mst for mean solar time ...
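A minimal sketch of how a keyword value such as utc+8 could be mapped to a fixed offset, assuming only the standard library. The function name parse_time_reference and the accepted spellings are hypothetical and not part of pvlib or pandas; tst and mst are deliberately rejected here because solar time depends on longitude and cannot be expressed as a plain time zone.

```python
import re
from datetime import timedelta, timezone

# Accepts e.g. "utc+8", "UTC-3:30", "utc+0530" (illustrative spellings only).
_OFFSET = re.compile(r"^utc([+-])(\d{1,2})(?::?(\d{2}))?$", re.IGNORECASE)


def parse_time_reference(name):
    """Hypothetical parser: turn 'utc' or 'utc+H[:MM]' into a tzinfo object."""
    if name.lower() == "utc":
        return timezone.utc
    match = _OFFSET.match(name)
    if match is None:
        # Solar-time references ('tst', 'mst') end up here on purpose.
        raise ValueError(f"unsupported time reference: {name!r}")
    sign = 1 if match.group(1) == "+" else -1
    hours = int(match.group(2))
    minutes = int(match.group(3) or 0)
    return timezone(sign * timedelta(hours=hours, minutes=minutes))


plus_eight = parse_time_reference("utc+8")
```

The returned object is an ordinary tzinfo, so it could be handed to tz_localize like any other time zone.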

As a default I think utc would be better than a localized time.

@jforbess
Contributor

jforbess commented Apr 6, 2015

Agreed on using pandas and default UTC, and I think one critical component
of having a successful time implementation for others will be a tutorial
covering various conversions and manipulations.

I am still trying to wrap my head around how true solar time and mean solar
time interact with data collected from a system, but I will get there.

@wholmgren
Member

That sounds like a good use case. solarposition.ephemeris does return solar time, although I did not test this output in my recent PR. Pyephem has a sidereal_time method, so we can add solar time to solarposition.pyephem also. Hopefully we can extract this from the other SPA functions as well.

However, it may be difficult to detect the exact format that a user is passing to a function.

I'm not sure what you mean by this. My thought is that you can always do one of these:

  1. Use the timezone specified on a DatetimeIndex or a Location. (Possibly raise an error if they do not agree.)
  2. Assume UTC. (Possibly raise a warning that we're assuming UTC.)
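The two options above can be sketched as follows. This is only an illustration under stated assumptions: resolve_to_utc and its location_tz parameter are hypothetical names, not the actual tools.localize_to_utc in pvlib, and timezones are compared by name for simplicity.

```python
import warnings

import pandas as pd


def resolve_to_utc(times, location_tz=None):
    """Hypothetical helper; not pvlib's tools.localize_to_utc."""
    if times.tz is not None:
        # Option 1: the index carries a timezone; a Location tz must agree.
        # Comparing by name is a simplification made for this sketch.
        if location_tz is not None and str(times.tz) != str(location_tz):
            raise ValueError("DatetimeIndex and Location timezones disagree")
        return times.tz_convert("UTC")
    if location_tz is not None:
        # Option 1, naive index: interpret it in the Location's timezone.
        return times.tz_localize(location_tz).tz_convert("UTC")
    # Option 2: nothing specified, so assume UTC and say so.
    warnings.warn("tz-naive times and no Location tz given; assuming UTC")
    return times.tz_localize("UTC")


naive = pd.DatetimeIndex(["2015-01-01 00:00"])
# America/Phoenix is UTC-7 all year, so midnight there is 07:00 UTC.
from_location = resolve_to_utc(naive, location_tz="America/Phoenix")
```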

I don't think there can be any ambiguity because the DatetimeIndex and Location constructors raise errors if the user inputs a timezone that pytz doesn't know about. pytz.FixedOffset timezones can do something like your utc+8 suggestion:

pd.DatetimeIndex(['2015-1-1T00']).tz_localize(pytz.FixedOffset(120))

<class 'pandas.tseries.index.DatetimeIndex'>
[2015-01-01 00:00:00+02:00]
Length: 1, Freq: None, Timezone: pytz.FixedOffset(120)

We could consider making a rule that all timezones must always be specified (even if it's UTC).

tst for true solar time, mst for mean solar time

Interesting idea; a correct and consistent implementation sounds hard. (mst is Mountain Standard Time in the US.)

In any case, I will improve the Location docstring and make a tutorial. We could also make a page on the rtd docs just for timezones. I will also change the Location default tz to be UTC.

@bmu
Contributor Author

bmu commented Apr 6, 2015

I think we have misunderstood each other a little so far; my English may be the reason ;-)

My use cases so far include:

  • irradiance data provided by a national weather service using true solar time as timestamp
  • data from our own monitoring system using local time (without daylight saving)
  • data from utilities most often localized (including daylight saving)
  • data from meteonorm, local time (I think)
  • data from GeoModel, UTC
  • "SatelLight" data, local time

So I need

  1. to convert any of these to any defined standard, using tools provided by pvlib, or
  2. a keyword that tells our functions what kind of time stamp is used, so that the conversion to the standard is performed within the function (the standard should be UTC in any case, I think)

Hope this is understandable ;-)

@jforbess
Data collected from a system is usually stored with a local time stamp, in my experience.
However, sometimes I do not know the azimuth of the system, or I suspect that something is wrong with the alignment of the irradiance sensor.
If you plot your measurements, let's say power and irradiance, against local time, you won't see a clear peak at noon (if the system is oriented towards the equator), but a scatter due to the equation of time (about ±15 minutes). If you plot them against true solar time, you get a clear peak for both quantities at noon. If not, either something is wrong with the alignment of the irradiance sensor or the system is not oriented towards the equator.
This is only one use case; there are other examples where true solar time is useful.

@jforbess
Contributor

jforbess commented Apr 6, 2015

Yes, those cases are generally what I would assume, though in my experience
irradiance data from a national weather service is also in local time, not
true solar time. I am not sure I ever see anything in true solar time
except outputs from very specialized models, like the example you cited,
and in that example, I might compare daily peaks to daily model peaks in
local time, which would look for alignment at 12:35 or whenever the peak is
expected in local time.

From my past experience, using UTC internally usually ends up resulting in
the fewest conversions, since there will always be some. :)

@wholmgren
Member

These use cases are helpful and cover a pretty broad range. I think that properly used IANA timezones (the pytz/pandas convention) can cover all except for solar time. For example, here in the US we have US/Mountain (DST aware) and MST (no-DST):

pd.DatetimeIndex(['2015-4-1T00']).tz_localize('US/Mountain')

<class 'pandas.tseries.index.DatetimeIndex'>
[2015-04-01 00:00:00-06:00]
Length: 1, Freq: None, Timezone: US/Mountain


pd.DatetimeIndex(['2015-4-1T00']).tz_localize('MST')

<class 'pandas.tseries.index.DatetimeIndex'>
[2015-04-01 00:00:00-07:00]
Length: 1, Freq: None, Timezone: MST

Note the difference in the UTC offset between the two.

So long as I specify times consistent with the timezone, pandas will use the same integer to represent them:

# US/Mountain
pd.DatetimeIndex(['2015-4-1T01']).tz_localize('US/Mountain').astype(int)

array([1427871600000000000])

# MST
pd.DatetimeIndex(['2015-4-1T00']).tz_localize('MST').astype(int)

array([1427871600000000000])

# UTC
pd.DatetimeIndex(['2015-4-1T07']).tz_localize('UTC').astype(int)

array([1427871600000000000])

# No specification
pd.DatetimeIndex(['2015-4-1T07']).astype(int)

array([1427871600000000000])

So I think we already have all of the tools we need, and the Location objects and the solarposition module already make use of these tools.

On solar time... maybe I see part of the remaining problem now. We've been talking about a way to go from a UTC-based timestamp to solar time (e.g. solarposition.ephemeris), but you want to be able to do the reverse. That seems harder. Could be doable with interpolation. Maybe @alorenzo175 knows how to do it exactly.

@bmu
Contributor Author

bmu commented Apr 6, 2015

OK, I was aware of the pandas time conversion / localization, but not of the non-DST zones. I agree that in this case most of the tools are available. The conversion from true solar time to UTC needs the offset between mean solar time and UTC (which depends on longitude) and the equation of time, which should also be included in the solar position algorithms.
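A rough sketch of that conversion, assuming the Spencer (1971) series for the equation of time and the usual 4 minutes per degree of longitude (east positive). The function names are hypothetical and this is not how any pvlib solar position algorithm computes it. The inverse uses the calendar day of the solar time stamp for the equation of time, which is an approximation near midnight.

```python
import math
from datetime import datetime, timedelta, timezone


def equation_of_time_minutes(day_of_year):
    """Spencer (1971) approximation: apparent minus mean solar time, minutes."""
    b = 2.0 * math.pi * (day_of_year - 1) / 365.0
    return 229.18 * (
        0.000075
        + 0.001868 * math.cos(b)
        - 0.032077 * math.sin(b)
        - 0.014615 * math.cos(2 * b)
        - 0.040849 * math.sin(2 * b)
    )


def _solar_offset(moment, longitude_deg):
    # Longitude term (mean solar time minus UTC) plus the equation of time.
    eot = equation_of_time_minutes(moment.timetuple().tm_yday)
    return timedelta(minutes=4.0 * longitude_deg + eot)


def utc_to_true_solar(utc_time, longitude_deg):
    """Return true solar time as a naive datetime (it has no time zone)."""
    return utc_time.replace(tzinfo=None) + _solar_offset(utc_time, longitude_deg)


def true_solar_to_utc(solar_time, longitude_deg):
    """Approximate inverse of utc_to_true_solar."""
    shifted = solar_time - _solar_offset(solar_time, longitude_deg)
    return shifted.replace(tzinfo=timezone.utc)


# Illustrative values: 10 degrees east, late morning UTC.
utc_stamp = datetime(2015, 4, 6, 11, 0, tzinfo=timezone.utc)
solar_stamp = utc_to_true_solar(utc_stamp, longitude_deg=10.0)
round_trip = true_solar_to_utc(solar_stamp, longitude_deg=10.0)
```

Over a year this series swings between roughly -14 and +16 minutes, which is the scatter discussed earlier in the thread.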

What about the handling of tz-naive timestamps? Do I currently have to localize them first?

@bmu
Contributor Author

bmu commented Apr 6, 2015

@jforbess: just ask the German weather service ;-)

On the other comment: you won't get a peak at 12:35 if you plot one year of data against local time. The peak shifts from 12:20 to 12:50 due to the equation of time, and if you want to detect small sensor misalignments, this is a problem.

@bmu
Contributor Author

bmu commented Apr 6, 2015

There are other use cases (see e.g. this paper, section 5.1.).

However, I'm not sure if it should be the first priority, as you seem to see no need for true solar time, and maybe most of the users also do not need it.

@wholmgren
Member

The pandas integer representation of a tz-naive timestamp is the same as that for a UTC timestamp (see above). Is that what you meant?

You've convinced me that including solar time in the solarposition outputs is high priority, but the converter from solar time to UTC would be lower on my priority list.

@jforbess
Contributor

jforbess commented Apr 6, 2015

Yes, I'm not saying that true solar time isn't a key concept, I am just
saying that it's a lower priority to be able to convert between it and UTC.

Though I suspect that the large German operations performance community
would disagree, given @bmu's insight regarding their dataset.

@bmu
Contributor Author

bmu commented Apr 6, 2015

Ok, I'm fine with a lower priority for the reverse calculation (not sure about "the large German operations performance community", though)

@bmu
Contributor Author

bmu commented Apr 6, 2015

@wholmgren: I had a look at the code (especially tools.localize_to_utc) and now it is clearer what happens. I should have done this before, sorry.

But I think something in the documentation on this topic would be useful, as not all users will be aware of the time zone handling.

@jforbess
Contributor

jforbess commented Apr 6, 2015

Just a reminder to myself and others as to the huge install base outside my
experience, and you know, the largest installed base in the world.

@bmu
Contributor Author

bmu commented Apr 7, 2015

After some sleep, I see my initial confusion: from my point of view, the time zone is related to the data, not to the location. It may be good to have information on the (legal) time zone of a location, but this is not necessarily an indicator of the time stamp used in e.g. measurement files (and I don't know if this is clear to users in general).
That's why I was asking for a time zone keyword, e.g. for the solar position functions. Maybe this keyword could default to None or infer or ... (in which case the time index must be tz-aware). If the time index is tz-naive, the user can provide the time reference.
Maybe we could also use the time zone of the Location class, but then it should be made clearer that this is the time zone used for the time stamps.

@wholmgren
Member

My thought is to go in the other direction. Make the solar position methods only accept tz localized times, raise an error with naive times, and ignore the location object's tz attribute. This makes it so that there is only one way to do it and the user must be very explicit about it. If you have data, then it was recorded with a specific time convention (UTC, Europe/Berlin, US/Arizona) and I think the very first thing you should do when you import that data is to use tz_localize. As soon as you tz_localize, pandas starts treating it as UTC under the hood, you can convert it to whatever viewing representation you want, and there can be no mistakes.
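A minimal sketch of that workflow, assuming a hypothetical require_localized check (the name and error message are made up for illustration and are not pvlib's API): localize once at import, reject naive input, and convert only for viewing.

```python
import pandas as pd


def require_localized(times):
    """Hypothetical strict check: reject tz-naive input, return UTC."""
    if times.tz is None:
        raise ValueError("times must be tz-aware; call tz_localize first")
    return times.tz_convert("UTC")


# Localize once, right after import, using the convention of the data source.
raw = pd.DatetimeIndex(["2015-04-01 09:00", "2015-04-01 10:00"])
berlin = raw.tz_localize("Europe/Berlin")

utc = require_localized(berlin)

# US/Arizona in the comment above; America/Phoenix names the same zone.
arizona = berlin.tz_convert("America/Phoenix")

# Converting the view does not change the underlying instants.
same_instants = bool((berlin == arizona).all())
```

Because the check raises instead of guessing, a forgotten tz_localize shows up immediately rather than as a silent offset of a few hours.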

@bmu
Contributor Author

bmu commented Apr 7, 2015

I think this could be right, but I didn't use to do it this way. Time zone conversion was complicated before pandas, but it should be manageable for a user today.

@jforbess
Contributor

I just discovered that pandas doesn't seem to behave well in one situation: I have data from PVsyst (no DST) that I am trying to load into pvlib to compare to a pvlib model. I am trying to align the timestamps, but pandas doesn't seem capable of localizing timestamps without applying DST. (I'd like to just specify UTC-5, instead of 'US/Eastern'.) I asked a question about this on stackexchange because it seems to be more of a pandas issue than pvlib issue, but perhaps people here have solved this problem already? If not, it's something that we can't rely on pandas for, though perhaps support is coming in 0.17.

@wholmgren
Member

I think you want to localize with 'EST' or pytz.FixedOffset(-5*60).

I suppose I should finish writing the pvlib timezone tutorial.

@jforbess
Contributor

I checked 'EST', and it still applied the UTC-4 to the periods between March and November, but the pytz.FixedOffset worked great. I had something in mind like that, but couldn't quite figure out how pytz and tz_localize interacted. Thanks!
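The difference between a fixed offset and a DST-aware zone can be sketched as follows. The timestamps are made up, and the standard library's datetime.timezone is used here as a stand-in for pytz.FixedOffset(-5*60) so that the example does not depend on pytz; America/New_York is the canonical name behind 'US/Eastern'.

```python
from datetime import timedelta, timezone

import pandas as pd

# Made-up PVsyst-style stamps: clock time that never switches to DST.
naive = pd.DatetimeIndex(["2015-01-15 12:00", "2015-07-15 12:00"])

# Fixed offset: UTC-5 all year, which matches data recorded without DST.
fixed = naive.tz_localize(timezone(timedelta(hours=-5)))

# DST-aware zone: UTC-5 in winter, UTC-4 in summer.
eastern = naive.tz_localize("America/New_York")

fixed_utc = fixed.tz_convert("UTC")
eastern_utc = eastern.tz_convert("UTC")
```

In this sketch the July stamp ends up one hour apart between the two results, which is the misalignment described above.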

@wholmgren
Member

@bmu @jforbess I reread this thread and I'm not sure what, if anything, is needed to close this issue. Can we resolve it with better documentation or do we need to change the code?

@jforbess
Contributor

I think there were two potential actions:

  1. Require timestamped data to be localized. I don't know that we agreed
    whether this was a good requirement or not. I view it as moderately
    positive, because it will force all handling of timezones to be clearer.
    But a potential drawback is when I somehow find myself using a Central
    timezone for data in an Eastern location because of how the data was
    captured originally. (Probably because it wasn't clear whether the data was
    captured as interval beginning or interval ending, to use PVsyst's
    nomenclature.)

  2. Provide a mapping to true solar time. I believe this is a desirable
    feature, but not likely to be part of the 0.2 release.

Finally, I do think that better documentation can cover item 1 if others
don't prefer requiring localization.

@bmu
Contributor Author

bmu commented Jun 22, 2015

I agree with @jforbess on both points. But I think we can shift both of them to a later release.

@wholmgren
Member

Ok, thanks for the clarification. I am going to mark this issue as 0.3 and think about it again after this release is out.

@wholmgren wholmgren modified the milestones: 0.3, 0.2 Jun 25, 2015
@wholmgren
Member

6 months later... the solarposition.py module in PR #93 now requires that timestamped data be localized or it will be assumed to be UTC time. See the diff.

@wholmgren
Member

I think that the new code and the new documentation can close this issue. Here's the new proposed documentation:

http://wholmgren-pvlib-python-new.readthedocs.org/en/contributing/timetimezones.html

Please provide comments in #135, if you have them.

Assuming no objections, I will close this when #135 is merged. A new issue could be created for adding true solar time to more functions, if desired.

@bmu
Contributor Author

bmu commented Mar 16, 2016

Just an idea:
We could open a milestone for the Santa Clara pvlib sprint in May and move such issues to this milestone. I will be there, too.

@dacoex
Contributor

dacoex commented Mar 16, 2016

We could open a milestone for the Santa Clara pvlib sprint in May and move such issues to this milestone. I will be there, too.

Regarding the Santa Clara pvlib sprint:
I received an invitation. Thanks for that!
But I will not be able to attend.
If there is a possibility to participate in discussions and sprints remotely, I would be happy to do so.

@wholmgren
Member

@bmu a Santa Clara milestone is a good idea and it's great to hear that you'll be able to attend. Did you RSVP to Josh Stein? I did not see your name on a recent list of attendees.

@dacoex and others: We don't yet know exactly what the agenda will be, but I expect that there will be a way to contribute remotely. We'll still be using GitHub, of course.

I'll make a milestone and a new issue for further discussion.


4 participants