datetime module has no support for nanoseconds #59648
Comments
As computers evolve, time management becomes more precise and more granular.

print "%.9f" % time.time()

gives me an actual timestamp from the epoch with nanosecond granularity. Thus support for nanoseconds in datetime would really be appreciated. |
Vincenzo Ampolo wrote:
I would be interested in an actual use case for this. |
On 07/24/2012 01:28 PM, Marc-Andre Lemburg wrote:
Alice has a dataset with nanosecond granularity and wants to write a Python library to work with it. Many Python SQL libraries, like the one in django and the one in web2py, rely on datetime objects for timestamps. A Google search for "python datetime nanoseconds" shows more than 141k results, so this is definitively a requested feature, and the need will only grow as technology advances. I imagine something like:

import datetime

nano_time = datetime.datetime(year=2012, month=7, day=24, hour=14,
                              minute=35, second=3, microsecond=53,
                              nanosecond=27)

in case you need nanosecond granularity; if you don't need it, just skip the argument. I can write a patch if some dev can review it, before someone takes the datetime source code and starts a third-party module. Best Regards, Vincenzo Ampolo |
I believe Marc-Andre was looking for an actual real-world use case rather than a hypothetical one. We discussed this briefly on the IRC channel and we think Guido vetoed it on a YAGNI basis (we haven't checked the archives though...), so a real-world use case is probably required. |
This is a real use case I'm working with that needs nanosecond precision: most OSes let users capture network packets (using tools like tcpdump or Wireshark) with timestamps at nanosecond granularity. Another case is in stock markets, where information is timed in nanoseconds. The company I work for is in the data networking field, and we work with such timestamps in Python. Best Regards |
Are the nanosecond timestamps timestamps or strings? If they are timestamps, it's not immediately obvious why you want to convert them to datetime objects, so motivating that would probably help. On the other hand, the fact that you have an application that does so is certainly an argument for real-world applicability. |
Even if accepted this can't get fixed in 2.7, so removing that from versions. |
On 07/24/2012 04:20 PM, R. David Murray wrote:
It depends. When they are exported, for example as CSV (this can be the case), they are strings. Think about a web application: the user selects year, month, day, hour, minute, second, down to the nanosecond. It's basically the same thing you already do nowadays at the microsecond level. I agree with the YAGNI principle, and I think that we have a clear use case here. Best Regards |
See PEP 410. |
Vincenzo Ampolo wrote:
Thanks for the two use cases. You might want to look at mxDateTime and use that for your timestamps. |
On Wed, Jul 25, 2012 at 4:17 AM, Marc-Andre Lemburg <report@bugs.python.org> wrote:
> ... full C double precision for the time part of a timestamp,
> which covers nanoseconds just fine.

No, it does not:

>>> import time
>>> t = time.time()
>>> t + 5e-9 == t
True

In fact, C double precision is barely enough to cover microseconds:

>>> t + 1e-6 == t
False
>>> t + 1e-7 == t
True |
Alexander Belopolsky wrote:
>> On Wed, Jul 25, 2012 at 4:17 AM, Marc-Andre Lemburg <report@bugs.python.org> wrote:
>>> ... full C double precision for the time part of a timestamp,
>>> which covers nanoseconds just fine.
>>
>> No, it does not [...]. In fact, C double precision is barely
>> enough to cover microseconds.

I was referring to the use of a C double to store the time part in mxDateTime. mxDateTime uses the C double to store the number of seconds since midnight, so you don't run into the Unix ticks value range problem you showcased above. There's enough room to even store 1/100th of a nanosecond, which may [...] |
Marc-Andre Lemburg wrote:
> mxDateTime uses the C double to store the number of seconds since
> midnight, so you don't run into the Unix ticks value range problem
> you showcased above. There's enough room to even store 1/100th of
> a nanosecond [...]

Indeed:

>>> x = 86400.0
>>> x == x + 1e-9
False
>>> x == x + 1e-10
False
>>> x == x + 1e-11
False
>>> x == x + 1e-12
True |
[Roundup's email interface again...] |
Have a look at this python-dev mailing list thread too: http://mail.python.org/pipermail/python-dev/2012-July/121123.html |
I would like to add a real-world use case I have for nanosecond-precision support. I deal with data loggers that are controlled by GPS clocks, and I am writing some processing software in Python that requires the input of high-precision timestamps for calculating clock drifts and offsets. The addition of nanosecond-precision support in datetime would allow me to use this rather than a homebrew solution. |
I would like to add a use case. Control systems for particle accelerators. We have ns, sometimes ps precision on timestamped data acquisitions and we would like to use Python to do calculations. |
Given that struct timespec, defined as

struct timespec {
    time_t tv_sec;        /* seconds */
    long   tv_nsec;       /* nanoseconds */
};

is slowly becoming the prevailing standard for representing time in system interfaces, Python's inability to faithfully store it in a high-level object will increasingly become a handicap. People are starting to put nanoseconds in their databases not because they really need such precision, but because this is what they get from their devices, and at collection time they cannot do anything "smart". The program that collects the events may simply not have time to do anything other than store the raw data, or may not have the higher-level knowledge needed for the proper rounding. The proper rounding is best done at analysis time, by a program written in a higher-level language such as Python. |
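(A minimal sketch of the storage problem just described, using a made-up struct timespec value; the tv_sec/tv_nsec names stand in for what clock_gettime() would return:)

from datetime import datetime, timezone

tv_sec, tv_nsec = 1343235180, 123456789   # hypothetical timespec fields

# The closest a datetime can get: whole microseconds only.
dt = datetime.fromtimestamp(tv_sec, tz=timezone.utc).replace(microsecond=tv_nsec // 1000)
print(dt.microsecond * 1000)   # 123456000 -- the trailing 789 ns are lost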
For the record, numpy's datetime and timedelta types have theoretical support for attoseconds. |
numpy's datetime64 and timedelta64 types are so utterly broken that I would only recommend studying them as a negative example of how not to design a date-time library. |
A note from Guido, from about 2 years ago: https://mail.python.org/pipermail/python-dev/2012-July/121127.html """ [...] Add pickle, etc. """ |
According to a comment at the top of Include/datetime.h, the internal representation of datetime is 10 bytes, not 8:

/* Fields are packed into successive bytes, each viewed as unsigned and
 * big-endian, unless otherwise noted:
 */

(if you don't trust the comments, check the definitions a few lines below)

#define _PyDateTime_DATETIME_DATASIZE 10

AFAIK, Python objects are allocated with at least 32-bit alignment, so we have at least 2 unused bytes at the end of each datetime object. Furthermore, out of the 24 bits allocated for microseconds, only 20 are used, so nanoseconds can be accommodated by adding a single byte to DATETIME_DATASIZE. |
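(A quick check of the packing arithmetic above:)

>>> (999_999).bit_length()       # max microseconds uses 20 of the 24 bits
20
>>> (999_999_999).bit_length()   # max nanoseconds needs 30 bits: 24 bits + one more byte suffice
30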
On 14/07/2014 21:37, Alexander Belopolsky wrote:
> AFAIK, Python objects are allocated with at least 32-bit alignment

64 bits, actually, when using obmalloc.c. |
Yup, it's definitely more than 8 bytes. In addition to the comments you quoted, an in-memory datetime object also has a full Python object header, a member to cache the hash code, and a byte devoted to saying whether or not a tzinfo member is present. Guessing Guido was actually thinking about the pickle size - but that's 10 bytes (for a "naive" datetime object). |
No, pickle also comes with an overhead:

>>> from datetime import *
>>> import pickle
>>> t = datetime.now()
>>> len(pickle.dumps(t))
70

For the present discussion, DATETIME_DATASIZE is the only relevant number, because we are not going to change anything other than the payload layout in the datetime object or its pickle serialization. |
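(For the curious, the 10-byte payload can be inspected directly; this assumes the naive t from above, and relies on datetime's __reduce__ exposing the packed byte string it pickles:)

>>> len(t.__reduce__()[1][0])   # the packed datetime payload inside that pickle
10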
@pganssle - let's keep the substantive discussions in the tracker so that they are not lost on github. You wrote:

"""
It might be good for us to get an explicit "to-do" list of concerns to be addressed before this can be merged.
"""

I don't think full nanosecond support is feasible to complete in the remaining weeks, but we can try to add nanoseconds to timedelta only. The mixed datetime + timedelta ops will still truncate, but many time-related operations will be enabled. I would even argue that when nanosecond precision is required, it is more often for intervals no longer than a few days, and rarely for a specific point in time. |
This may be so, but I think the important part of that question is "what work needs to be done and what questions need to be answered?" If the answer is that we need to make 3 decisions and do the C implementation, that seems feasible to do in under a month. If the answer is that we've got 10 contentious UI issues and we probably want to go through the PEP process, I agree with your assessment of the timing. Regardless, we'll need to know what work needs to be done before we do it...
To be honest, I don't find this very compelling and I think it will only confuse people. From the use cases in this thread, I don't think there's high enough demand for nanosecond-timedelta on its own that we need to rush it out there before datetime gets it. |
Is there high enough demand for nanoseconds in datetime and time instances? How often do nanosecond timestamps contain anything other than 0s or garbage in the last three digits? In my experience, all people want to do with such timestamps is convert them to something expressed in hours, minutes and seconds, rather than just a huge number of seconds, and back, without losing the value. A timedelta is almost always a decent replacement for either datetime or time in those cases, and sometimes it is even preferable because arithmetically it is closer to numbers. |
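(A sketch of the conversion Alexander describes, with a made-up nanosecond count; plain integer arithmetic round-trips losslessly:)

ns = 1_343_235_180_123_456_789            # hypothetical timestamp in nanoseconds

seconds, frac_ns = divmod(ns, 1_000_000_000)
minutes, s = divmod(seconds, 60)
h, m = divmod(minutes, 60)

# ...and back without losing the value:
assert ((h * 60 + m) * 60 + s) * 1_000_000_000 + frac_ns == ns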
This brings me back some years. Sorry if I am not up to date; the issue as I recall from back then was that there weren't even microseconds. |
At the speed of light, a nanosecond translates to about a foot, and 5 light-hours gets you to Pluto. Telemetry is exactly an application where absolute timestamps rarely make any sense. |
In the confines of PTP / IEEE1588, it's actually quite common and can be useful. It's not so much the ns, but the <1us that is missing. |
[Alexander]
> Is there high enough demand for nanoseconds in datetime and time instances?

One need that we've encountered in real code is simply for compatibility. We have Python code that interacts with a logging web service whose timestamps include nanosecond information. Whether or not nanosecond resolution makes sense for those timestamps is a moot point: that's out of our control. When representing information retrieved from that web service in Python-land, we have a problem. If datetime.datetime had nanosecond precision, then using datetime.datetime to represent the retrieved values would be a no-brainer. As it is, we face a choice between workarounds. None of those choices is terrible, but none of them is particularly palatable compared with using a standard library solution. (FWIW, we went with returning nanoseconds since the Unix epoch as an int.) |
I also have a use case that would benefit from nanosecond resolution in Python's datetime objects, that is, representing and querying the results of clock_gettime() in a program trace. On modern Linuxes with a vDSO, clock_gettime() does not require a system call and completes within a few nanoseconds. So Python's datetime objects do not have sufficient resolution to distinguish between adjacent calls to clock_gettime(). This means that, like Mark Dickinson above, I have to choose between using datetime for queries (which would be convenient) and accepting that nearby events in the trace may be indistinguishable, or implementing my own datetime-like data structure. |
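(A sketch of the resolution gap just described; time.clock_gettime_ns is Unix-only, and the exact deltas vary by machine:)

import time
from datetime import datetime

t1 = time.clock_gettime_ns(time.CLOCK_REALTIME)
t2 = time.clock_gettime_ns(time.CLOCK_REALTIME)
print(t2 - t1)     # typically tens of nanoseconds apart

d1 = datetime.now()
d2 = datetime.now()
print(d2 - d1)     # often 0:00:00 -- adjacent calls can be indistinguishable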
More than 10 years has this been open. :-) strptime should be able to parse these; please harmonize Python's strptime with the strftime output of other implementations. Most of our REST APIs (500+) are Java Spring Boot (7 fractional digits) or Go (9 digits on Linux and 7 digits on Windows). I do this conversion almost daily:

>>> import datetime
>>> t = '2023-01-05T09:45:41.0877981+01:00'  # example taken from the JSON output of K6, a commercial Go program: https://k6.io/docs/results-output/real-time/json/
>>> datetime.datetime.strptime(t, "%Y-%m-%dT%H:%M:%S.%f%z")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Program Files\Python310\lib\_strptime.py", line 568, in _strptime_datetime
tt, fraction, gmtoff_fraction = _strptime(data_string, format)
File "C:\Program Files\Python310\lib\_strptime.py", line 349, in _strptime
raise ValueError("time data %r does not match format %r" %
ValueError: time data '2023-01-05T09:45:41.0877981+01:00' does not match format '%Y-%m-%dT%H:%M:%S.%f%z'
>>> t1 = t[:-7] + t[-6:]
>>> datetime.datetime.strptime(t1, "%Y-%m-%dT%H:%M:%S.%f%z")
datetime.datetime(2023, 1, 5, 9, 45, 41, 87798, tzinfo=datetime.timezone(datetime.timedelta(seconds=3600))) |
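(A workaround sketch, not part of the stdlib: truncate the fractional-seconds field to the six digits %f accepts before parsing:)

import re
from datetime import datetime

def parse_rfc3339_truncating(s):
    # Keep at most 6 fractional digits; %f cannot handle 7-9.
    s = re.sub(r'(\.\d{6})\d+', r'\1', s)
    return datetime.strptime(s, "%Y-%m-%dT%H:%M:%S.%f%z")

print(parse_rfc3339_truncating('2023-01-05T09:45:41.0877981+01:00'))
# 2023-01-05 09:45:41.087798+01:00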
I have this same problem. I changed the _strptime.py file around lines 188 and 424 to include a format code n for nanoseconds (1 to 9 digits):

Approximately line 188: [...]

Approximately line 424:

elif group_key == 'f': [...]

I'd appreciate it if someone could update _strptime.py with these improvements. |
I also have the same problem. My use case is a generic Timer I am implementing for profiling data-processing code for high-performance Machine Learning modeling. This is purely in Python.

For certain use cases a piece of code runs very fast (a few hundred nanoseconds) but needs to run hundreds of billions of times, for example a dict lookup when deduplicating Amazon product embeddings. So timing it effectively is useful. It's annoying that I can't just use the datetime module for this. I feel this could be implemented really easily in Python itself. |
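(What the stdlib offers today for this kind of timing, as a sketch: time.perf_counter_ns gives integer nanoseconds, but there is no timedelta to carry them in without truncation:)

import time

start = time.perf_counter_ns()
d = {'key': 42}
for _ in range(1_000_000):
    d.get('key')
elapsed_ns = time.perf_counter_ns() - start
print(elapsed_ns / 1_000_000, 'ns per lookup on average')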
Another use case: polars is currently using datetime.timedelta() objects to represent values of its nanosecond-resolution Duration type in Python. This is a situation that seems similar to @mdickinson's, where using the std lib would be a no-brainer if it did the job, but unfortunately it does not here, and there isn't any great solution, especially if this is part of the public API of a library. |
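(A small illustration of the mismatch, assuming a 1500 ns duration: timedelta bottoms out at whole microseconds:)

from datetime import timedelta

d = timedelta(microseconds=1.5)   # i.e. 1500 ns; fractional microseconds are rounded
print(d)                          # 0:00:00.000002 -- the exact value cannot be represented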
Would this be a potential solution?

import datetime
from datetime import time
from operator import index
from typing import overload

class Time(time):
    @overload
    def __new__(cls, hour=0, minute=0, second=0, microsecond=0,
                tzinfo=None, *, fold=0): ...
    @overload
    def __new__(cls, hour=0, minute=0, second=0, microsecond=0,
                nanosecond=0, tzinfo=None, *, fold=0): ...
    def __new__(cls, hour=0, minute=0, second=0, microsecond=0,
                nanosecond=0, tzinfo=None, *, fold=0):
        if (nanosecond is None or isinstance(nanosecond, datetime.tzinfo)) and tzinfo is None:
            # Make the constructor backwards compatible: a tzinfo passed
            # as the fifth positional argument is shifted over.
            nanosecond, tzinfo = 0, nanosecond
        nanosecond = index(nanosecond)
        if not 0 <= nanosecond <= 999:
            raise ValueError('nanosecond must be in 0..999', nanosecond)
        self = super().__new__(cls, hour, minute, second, microsecond, tzinfo, fold=fold)
        self._nanosecond = nanosecond
        return self

    @property
    def nanosecond(self):
        """nanosecond (0-999)"""
        return self._nanosecond

Note: this currently doesn't raise an error:

t = Time(nanosecond=None)

Is it worth doing the parsing manually, or is a type-checker warning good enough? |
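(Usage of the sketch above with hypothetical values, exercising both the new argument and the backward-compatible positional tzinfo:)

from datetime import timezone

t = Time(14, 35, 3, 53, 271)                  # nanosecond passed positionally
print(t.nanosecond)                            # 271

legacy = Time(14, 35, 3, 53, timezone.utc)     # old 5th-positional tzinfo still works
print(legacy.tzinfo, legacy.nanosecond)        # UTC 0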
Sadly, passing the extra nanoseconds argument to class timedelta positionally would mean changing the signature over time:

class timedelta:
    if sys.version_info >= (3, 16):
        def __new__(cls, days=0, seconds=0, microseconds=0, nanoseconds=0, *, weeks=0, hours=0, minutes=0, milliseconds=0): ...
    elif sys.version_info >= (3, 14):
        def __new__(cls, days=0, seconds=0, microseconds=0, milliseconds=0, minutes=0, hours=0, weeks=0, *, nanoseconds=0): ...
    else:
        def __new__(cls, days=0, seconds=0, microseconds=0, milliseconds=0, minutes=0, hours=0, weeks=0): ... |
@vstinner, do you think this approach is reasonable? |
Changing the signature in Python 3.16 to put nanoseconds instead of milliseconds is a bad idea. I don't think that positional arguments can ever change in the datetime API. |
Yeah, that would break >1.1k files without raising an error. Let's only do it for the keyword-only form, then. |
It seems like it's not going to be solved soon. Check out the [...] |