Skip to content

Conversation

mpharrigan
Copy link
Collaborator

spun off from #5152

It's super important when running experiments to keep track of when things happen.

cc @maffoo

@mpharrigan mpharrigan requested a review from maffoo April 20, 2022 18:27
@mpharrigan mpharrigan requested review from a team, cduck and vtomole as code owners April 20, 2022 18:27
@CirqBot CirqBot added the size: M 50< lines changed <250 label Apr 20, 2022
@mpharrigan mpharrigan mentioned this pull request Apr 20, 2022
Comment on lines 392 to 401
if o.tzinfo is None or o.tzinfo.utcoffset(o) is None:
# Otherwise, the deserialized object may change depending on local timezone.
raise TypeError(
"Can only serialize 'aware' datetime objects with `tzinfo`. "
"Consider using e.g. `datetime.datetime.now(tz=datetime.timezone.utc)`"
)

if o.tzinfo != datetime.timezone.utc:
# Saves us the trouble of having to serialize the timezone.
raise TypeError("Can only serialize UTC timestamps.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be necessary, because the timestamp() method takes into account the timezone (or the locale timezone for naive datatimes) to convert to a unix epoch timestamp (which itself has no timezone).

In [12]: now, now_utc = datetime.datetime.now(), datetime.datetime.now(tz=datetime.timezone.utc)

In [13]: now
Out[13]: datetime.datetime(2022, 4, 20, 11, 52, 22, 976460)

In [14]: now_utc
Out[14]: datetime.datetime(2022, 4, 20, 18, 52, 22, 976471, tzinfo=datetime.timezone.utc)

In [15]: now.timestamp()
Out[15]: 1650480742.97646

In [16]: now_utc.timestamp()
Out[16]: 1650480742.976471

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but if you assert now == now_utc it will fail

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so if you had

ts = now().in_my_local_timezone()
assert ts == read_json(to_json(ts))

the assertion would fail, even though you're still referring to the correct point in time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While round trip equality is nice, I think datetime is generally broken with how it handles naive vs tz-aware instances and so we shouldn't expect roundtrip equality (even two tz-aware datetime instances with the same timestamp in different timezones do not compare equal). So I suggest we support serializing any datetime by getting its timestamp, and then we just have to pick whether to deserialize as naive or tz-aware datetimes. I think tz-aware with utc is ok, though I would probably prefer naive for simplicity.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that the idea of specifying the tz in deserialization was to force comparison by timestamp(). In a quick test, it seemed like I could reasonable compare two tz-aware datetimes. Riffing on mpharrigan@'s example above:

>>> ts = datetime.datetime.now(tz=timezone('US/Pacific'))
>>> ts == datetime.datetime.fromtimestamp(ts.timestamp(), tz=datetime.timezone.utc)
True

Also...

>>> ts = datetime.datetime.now()
>>> ts.timestamp() == datetime.datetime.fromtimestamp(ts.timestamp(), tz=datetime.timezone.utc).timestamp()
True

That all seems reasonable to me, so I'd +1 to allowing any input datetime here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah. Thanks for investigating @wcourtney . It looks like equality does work for "aware" datetimes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

( I removed the utc check during serialization)

# unix timestamps should mean that the deserialized datetime should refer to the
# same point in time but may not satisfy o = read_json(to_json(o)) because the actual
# timezones, and hour fields will not be identical.
return datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone.utc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm conflicted about whether it's better to use naive datetime or explicit utc timezone here. I generally find that using naive datetimes is simpler, as long as you use timestamp() for any kind of comparison, but I could be convinced otherwise.

In [17]: ts = 1650480742

In [18]: dt = datetime.datetime.fromtimestamp(ts)

In [19]: dt_utc = datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc)

In [20]: dt
Out[20]: datetime.datetime(2022, 4, 20, 11, 52, 22)

In [21]: dt_utc
Out[21]: datetime.datetime(2022, 4, 20, 18, 52, 22, tzinfo=datetime.timezone.utc)

In [22]: dt.timestamp()
Out[22]: 1650480742.0

In [23]: dt_utc.timestamp()
Out[23]: 1650480742.0

One big problem with the datetime library IMO is that it has a function datetime.datetime.utcnow() which ideally would be the same as datetime.datetime.now(tz=datetime.timezone.utc) but which in fact returns a "naive" datetime with no tz but whose calendar fields (year, month, day, hour, min, second) agree with the current utc time. I find this incredibly confusing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the docs for utcnow has a red box telling you not to use it.

naive datetimes is simpler, as long as you use timestamp() for any kind of comparison

I figured it might be worth forcing folks to use the explicit thing so they can't accidentally not use timestamp(). I mean: another alternative would be to use timestamp floats instead of datetime.datetime everywhere.

the other problem is that it's really hard to use datetime.__eq__ in a meaningful way with naive datetimes. I guess that falls under "comparison".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More explicit than timestamp floats would be timestamp protos, maybe made easier by proto-plus utilities? I personally prefer that, but my hunch is that it's better for the library to support datetime natively.

For usability of datetime, +1 to forcing users to use .timestamp() by setting the explicit tz.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naive datetimes compare fine to other naive datetimes, and I think I'd lean toward using them when deserializing for simplicity. But this is definitely an area where footguns abound. We have switched over to using a Timestamp internally that is just a wrapper around unix epoch timestamps to avoid all these issues with datetime.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More explicit than timestamp floats would be timestamp protos, maybe made easier by proto-plus utilities? I personally prefer that, but my hunch is that it's better for the library to support datetime natively.

For usability of datetime, +1 to forcing users to use .timestamp() by setting the explicit tz.

You can call .timestamp() on any datetime instance, whether it has a tz or not, so I don't think that setting it forces users to do the right thing.

I think timestamp protos are going to feel out of place in most python code and we don't currently pull in protobuf as a dependency of cirq-core, so probably shouldn't rely on them unless we want to support time serialization only in cirq-google.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on the basis that:

  1. the proto fields use an analog of utc "aware" datetimes
  2. things seem generally less broken if you live entirely in "aware" datetime land

I'd like to go ahead with forcing 'aware' during serialization and always deserializing as "aware" utc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proto uses a timestamp (which can be gotten from naive or aware datetime), so I don't think that carries much weight here. But I'm fine with deserializing as aware and and accepting only aware datetimes (could decide to accept naive datetimes in the future if we want). Thanks for working through all the comments here!

Copy link
Collaborator

@wcourtney wcourtney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working through this! I'm a fan of being more explicit, especially WRT time since it's often hard enough to reason about even w/o Python's fun tz handling.

Copy link
Contributor

@maffoo maffoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment about variable names in tests, otherwise LGTM.



def test_basic_time_assertions():
naive_ts = datetime.datetime.now()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we use the suffix _dt instead for datetime instances in these tests? _ts sounds like "timestamp", and I think we should be careful about the distinction between a datetime and a timestamp.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@mpharrigan mpharrigan added the automerge Tells CirqBot to sync and merge this PR. (If it's running.) label Apr 21, 2022
@CirqBot CirqBot added the front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. label Apr 21, 2022
@CirqBot
Copy link
Collaborator

CirqBot commented Apr 21, 2022

Automerge cancelled: A required status check is not present.

Missing statuses: ['Coverage check', 'Pytest MacOS (3.7)', 'Pytest MacOS (3.8)', 'Pytest MacOS (3.9)', 'Pytest Ubuntu (3.7)', 'Pytest Ubuntu (3.8)', 'Pytest Ubuntu (3.9)', 'Pytest Windows (3.7)', 'Pytest Windows (3.8)', 'Pytest Windows (3.9)', 'Typescript lint check', 'Typescript tests', 'Typescript tests coverage']

@CirqBot CirqBot removed automerge Tells CirqBot to sync and merge this PR. (If it's running.) front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. labels Apr 21, 2022
@mpharrigan mpharrigan added the automerge Tells CirqBot to sync and merge this PR. (If it's running.) label Apr 21, 2022
@CirqBot CirqBot added the front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. label Apr 21, 2022
@CirqBot CirqBot merged commit ab5a0dc into quantumlib:master Apr 21, 2022
@CirqBot CirqBot removed automerge Tells CirqBot to sync and merge this PR. (If it's running.) front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. labels Apr 21, 2022
tonybruguier pushed a commit to tonybruguier/Cirq that referenced this pull request Apr 22, 2022
spun off from quantumlib#5152 

It's super important when running experiments to keep track of when things happen.

cc @maffoo
rht pushed a commit to rht/Cirq that referenced this pull request May 1, 2023
spun off from quantumlib#5152 

It's super important when running experiments to keep track of when things happen.

cc @maffoo
harry-phasecraft pushed a commit to PhaseCraft/Cirq that referenced this pull request Oct 31, 2024
spun off from quantumlib#5152 

It's super important when running experiments to keep track of when things happen.

cc @maffoo
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size: M 50< lines changed <250

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants