-
Notifications
You must be signed in to change notification settings - Fork 1.1k
add test for hour_angle, vectorize #599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pvlib/test/test_solarposition.py
Outdated
eot = np.array([-3.935172, -4.117227, -4.026295]) | ||
hours = solarposition.hour_angle(times, longitude, eot) | ||
expected = (-70.682338, 70.72118825000001, 0.000801250) | ||
#expected = (-70.699400,70.512721, 0.0) # can't quite get these, why? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E265 block comment should start with '# '
* also converting times to int before subtracting works better for older pandas versions which were not calculating the timedeltas correctly * remove comment with missing space after hash, add FIXME that explains why the expected values are slightly different than the SPA calculator output
pvlib/solarposition.py
Outdated
tz_info = times.tz | ||
timezone = tz_info.utcoffset(times).total_seconds() / 3600. | ||
hours = (times.astype(np.int64) - times.normalize().astype(np.int64)) / ( | ||
3600. * 1.e9) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E126 continuation line over-indented for hanging indent
pvlib/solarposition.py
Outdated
)).total_seconds() / 3600. for t in times]) | ||
timezone = times.tz.utcoffset(times).total_seconds() / 3600. | ||
tz_info = times.tz | ||
timezone = tz_info.utcoffset(times).total_seconds() / 3600. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this the expected behavior?
In [15]: times = pd.DatetimeIndex(start='20180101', end='20190101', freq='1s', tz='America/Phoenix')
In [16]: timezone = times.tz.utcoffset(times).total_seconds() / 3600.
In [17]: timezone
Out[17]: -7.466666666666667
I am surprised because 1. I expected -7 and 2. times
is a vector so I expected to get a result for each time (offset could change for DST).
Is this preferable?
In [27]: np.array((times.tz_localize(None).astype(int) - times.astype(int)) / 1e9 / 3600)
Out[27]: array([-7., -7., -7., ..., -7., -7., -7.])
The np.array
wrapper ensures a known type instead of a version-dependent Index/array type.
Note that tz_localize(None)
will always work if we bump min pandas to 0.15, as discussed elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it's not what I would expect, I took that approach from this SO answer.
TL;DR: let's just use your approach instead
details
Something very weird is going on with this particular America/Phoenix
timezone. PyTz is picking a historical timezone called LMT that was discontinued in 1883 because it is listed first in the transition info list.
>>> import pytz
>>> import datetime
>>> america_phoenix = pytz.timezone('America/Phoenix')
<DstTzInfo 'America/Phoenix' LMT-1 day, 16:32:00 STD>
>>> dt = datetime.datetime.now().replace(tzinfo=america_phoenix)
datetime.datetime(2018, 10, 5, 12, 58, 34, 427000,
tzinfo=<DstTzInfo 'America/Phoenix' LMT-1 day, 16:32:00 STD>)
>>> america_phoenix.utcoffset(dt)
datetime.timedelta(-1, 59520)
The source for DstTzInfo
says that _utcoffset
is returned for utcoffset(dt)
if localized already, otherwise localize first then return the _utcoffset
of the timezone.
>>> america_phoenix._utcoffset
datetime.timedelta(-1, 59520)
And if not localized, then _utcoffset
is set to whatever is the first tz info in the transition info table. In this case that's "LMT" which was UTC -7:28:18 until 1883.
>>> america_phoenix._tzinfos
{(datetime.timedelta(-1, 59520),
datetime.timedelta(0),
'LMT'): <DstTzInfo 'America/Phoenix' LMT-1 day, 16:32:00 STD>,
(datetime.timedelta(-1, 61200),
datetime.timedelta(0),
'MST'): <DstTzInfo 'America/Phoenix' MST-1 day, 17:00:00 STD>,
(datetime.timedelta(-1, 64800),
datetime.timedelta(0, 3600),
'MDT'): <DstTzInfo 'America/Phoenix' MDT-1 day, 18:00:00 DST>,
(datetime.timedelta(-1, 64800),
datetime.timedelta(0, 3600),
'MWT'): <DstTzInfo 'America/Phoenix' MWT-1 day, 18:00:00 DST>}
>>> america_phoenix._transition_info
[(datetime.timedelta(-1, 59520), datetime.timedelta(0), 'LMT'),
(datetime.timedelta(-1, 61200), datetime.timedelta(0), 'MST'),
(datetime.timedelta(-1, 64800), datetime.timedelta(0, 3600), 'MDT'),
(datetime.timedelta(-1, 61200), datetime.timedelta(0), 'MST'),
(datetime.timedelta(-1, 64800), datetime.timedelta(0, 3600), 'MDT'),
(datetime.timedelta(-1, 61200), datetime.timedelta(0), 'MST'),
(datetime.timedelta(-1, 64800), datetime.timedelta(0, 3600), 'MWT'),
(datetime.timedelta(-1, 61200), datetime.timedelta(0), 'MST'),
(datetime.timedelta(-1, 64800), datetime.timedelta(0, 3600), 'MWT'),
(datetime.timedelta(-1, 61200), datetime.timedelta(0), 'MST'),
(datetime.timedelta(-1, 64800), datetime.timedelta(0, 3600), 'MDT'),
(datetime.timedelta(-1, 61200), datetime.timedelta(0), 'MST')]
If the datetime is localized, then PyTz does some smart stuff to figure out which transition info to use, based on the values in _utc_transition_times
and returns the corresponding transition info and timezone info, and replaces tzinfo
in the datetime.
>>> america_phoenix._utc_transition_times
[datetime.datetime(1, 1, 1, 0, 0),
datetime.datetime(1901, 12, 13, 20, 45, 52),
datetime.datetime(1918, 3, 31, 9, 0),
datetime.datetime(1918, 10, 27, 8, 0),
datetime.datetime(1919, 3, 30, 9, 0),
datetime.datetime(1919, 10, 26, 8, 0),
datetime.datetime(1942, 2, 9, 9, 0),
datetime.datetime(1944, 1, 1, 6, 1),
datetime.datetime(1944, 4, 1, 7, 1),
datetime.datetime(1944, 10, 1, 6, 1),
datetime.datetime(1967, 4, 30, 9, 0),
datetime.datetime(1967, 10, 29, 8, 0)]
Since our datetime is 2018-10-05T12:58:34, that's the last index, and we get the the most recent Phoenix timezone from the transition info table.
(datetime.timedelta(-1, 61200), datetime.timedelta(0), 'MST')
which is the key using in the timezone info table and returns the correct timezone
(datetime.timedelta(-1, 61200), datetime.timedelta(0), 'MST'):
<DstTzInfo 'America/Phoenix' MST-1 day, 17:00:00 STD>
When pandas makes a datetime index, I guess it doesn't care about tzinfo
but it does do this with Timestamps
which are the items in the datetime index.
DatetimeIndex
has wrong timezone
>>> import pandas as pd
>>> times = pd.DatetimeIndex(start='20180101', end='20190101', freq='1s', tz='US/Arizona')
>>> times.tz
<DstTzInfo 'US/Arizona' LMT-1 day, 16:32:00 STD>
>>> times.tzinfo # Alias for tz attribute (in PyTz)
<DstTzInfo 'US/Arizona' LMT-1 day, 16:32:00 STD>
Timestamp
has correct timezone
>>> times[0]
Timestamp('2018-01-01 00:00:00-0700', tz='US/Arizona', offset='S')
>>> times[0].tz
<DstTzInfo 'US/Arizona' MST-1 day, 17:00:00 STD>
>>> times[0].tzinfo
<DstTzInfo 'US/Arizona' MST-1 day, 17:00:00 STD>
Also if to_pydatetime()
is called it uses the correct timezone as well.
>>> times.to_pydatetime()
array([ datetime.datetime(2018, 1, 1, 0, 0, tzinfo=<DstTzInfo 'US/Arizona' MST-1 day, 17:00:00 STD>),
datetime.datetime(2018, 1, 1, 0, 0, 1, tzinfo=<DstTzInfo 'US/Arizona' MST-1 day, 17:00:00 STD>),
datetime.datetime(2018, 1, 1, 0, 0, 2, tzinfo=<DstTzInfo 'US/Arizona' MST-1 day, 17:00:00 STD>),
...,
datetime.datetime(2018, 12, 31, 23, 59, 58, tzinfo=<DstTzInfo 'US/Arizona' MST-1 day, 17:00:00 STD>),
datetime.datetime(2018, 12, 31, 23, 59, 59, tzinfo=<DstTzInfo 'US/Arizona' MST-1 day, 17:00:00 STD>),
datetime.datetime(2019, 1, 1, 0, 0, tzinfo=<DstTzInfo 'US/Arizona' MST-1 day, 17:00:00 STD>)], dtype=object)
Anyway, not sure if this is a bug in pandas or not, but who cares? Now I have to go around and fix this everywhere I've used utcoffset()
, bummer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: it's only used 2 places:
Searching 160 files for "utcoffset"
C:\Users\mikm\Projects\pvlib-python\pvlib\clearsky.py:
664 * automatically determines sample_interval
665 * requires a reference clear sky series instead calculating one
666: from a user supplied location and UTCoffset
667 * parameters are controllable via keyword arguments
668 * option to return individual test components and clearsky scaling
C:\Users\mikm\Projects\pvlib-python\pvlib\solarposition.py:
547 # pyephem drops timezone when converting to its internal datetime
548 # format, so handle timezone explicitly here
549: obs.date = ephem.Date(thetime - thetime.utcoffset())
550 sunrise.append(_ephem_to_timezone(rising(sun), time.tz))
551 sunset.append(_ephem_to_timezone(setting(sun), time.tz))
...
1329 """
1330 tz_info = times.tz
1331: timezone = tz_info.utcoffset(times).total_seconds() / 3600.
1332 hours = 1 / (3600. * 1.e9) * (
1333 times.astype(np.int64) - times.normalize().astype(np.int64))
So utcoffset()
is also a Python builtin which is overloaded by PyTz. The difference is that if utcoffset
is called on a Python datetime then it will return the utcoffset of it's tz info called on itself. So in our example above we could have called:
>>> dt.utcoffset()
datetime.timedelta(-1, 59520) # wrong timezone b/c localize() was not used!
Also FYI: there is an inefficient loop in rise_set_transit_ephem
which uses slow to_pydatetime()
and utcoffset()
, but it's probably addressed in #588?
Other than that I think there's only 2 other instances of utcoffset
this one in hour_angle and one in #583. I will fix both. Thanks for catching this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: this is also an example of why we should never replace tzinfo manually, but always use the localize()
method so that the correct timezone from the transition info table is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking into it. I've seen this kind of funny behavior with the AZ timezones before but it didn't occur to me that it could be a problem in this function. Some other common timezones have similar behavior (though I don't remember which).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, I don't think it's a problem with the function utcoffset()
. Honestly, I think this was my mistake for using utcoffset()
with a sequence of times like pandas.DatetimeIndex
, because there could potentially be a different timezone for each item in the sequence, so it's ambiguous which timezone should be used. Therefore, I think the problem is that the tz
(aka tzinfo
) that's set in the DatetimeIndex
is unreliable and ambiguous, or the problem could be with utcoffset
which should be overloaded to return a list of timezones localized to each item in the sequence.
* ... suggested by @wholmgren (thx!) * utcoffset() is unpredictable when used with pandas datetime indices * only predictable with Python datetime objects or pandas Timestamps * instead replace tzinfo with None to get naive local times, and calculate difference from tz-aware times to get timezones
* also use asarray wrapper at return to ensure consistency * add comments to explain to future maintainers
pvlib/solarposition.py
Outdated
# hours - timezone = times - times.normalized - (naive_times - times) | ||
hrs_minus_tzs = 1 / (3600. * 1.e9) * ( | ||
2 * times.astype(np.int64) - times.normalize().astype(np.int64) | ||
- naive_times.astype(np.int64)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
W503 line break before binary operator
or maybe we just leave master in a failing state until a few PRs go through. |
okay, I removed the try-except block , which ha ha didn't work anyway, and we can just wait for pandas min-requirements to be updated, thanks! |
Signed-off-by: Mark Mikofski bwana.marko@yahoo.com
pvlib python pull request guidelines
Thank you for your contribution to pvlib python! You may delete all of these instructions except for the list below.
You may submit a pull request with your code at any stage of completion.
The following items must be addressed before the code can be merged. Please don't hesitate to ask for help if you're unsure of how to accomplish any of the items below:
docs/sphinx/source/api.rst
for API changes.docs/sphinx/source/whatsnew
file for all changes.Brief description of the problem and proposed solution (if not already fully described in the issue linked to above):