time.clock_gettime_ns() is too slow #111482
Comments
"too slow" for what?
@pochmann Too slow compared to time.time() = gettimeofday(). For example, CLOCK_REALTIME_COARSE should by definition be faster (but coarser). In practice, it is not faster in Python. I glanced at the implementation. For reasons I don't understand, there are many useless wrappers around simple things. Why not do this:

    static PyObject *
    time_clock_gettime_ns(PyObject *self, PyObject *args)
    {
        int clk_id;
        /* Parse the single integer argument (the clock id). */
        if (!PyArg_ParseTuple(args, "i:clock_gettime_ns", &clk_id)) {
            return NULL;
        }
        struct timespec ts;
        /* Call the C library directly; on failure, raise OSError from errno. */
        if (clock_gettime((clockid_t)clk_id, &ts)) {
            PyErr_SetFromErrno(PyExc_OSError);
            return NULL;
        }
        /* Combine seconds and nanoseconds using integer arithmetic only. */
        return PyLong_FromLongLong(ts.tv_sec * 1000000000ll + ts.tv_nsec);
    }

I bet it will be faster than the current implementation, although I have not checked yet.
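The integer arithmetic in the proposed return statement can be illustrated in Python. This is only a sketch: the timestamp is a hypothetical value chosen for illustration, and `to_ns` and `NS_PER_SEC` are names invented here, not part of CPython. It shows that combining seconds and nanoseconds as integers is exact, while a float holding the same instant in seconds has a much coarser step.

```python
import math

NS_PER_SEC = 1_000_000_000


def to_ns(tv_sec, tv_nsec):
    """Combine seconds and nanoseconds exactly, using integer arithmetic only."""
    return tv_sec * NS_PER_SEC + tv_nsec


# Hypothetical timestamp, chosen only for illustration.
sec, nsec = 1_700_000_000, 123_456_789

# Integer result: every nanosecond digit is preserved.
exact = to_ns(sec, nsec)

# The same instant as a float number of seconds, as time.time() would return it.
as_float = sec + nsec / NS_PER_SEC

# Spacing between adjacent floats at this magnitude, expressed in nanoseconds.
# math.ulp() requires Python 3.9 or later.
float_step_ns = math.ulp(as_float) * NS_PER_SEC

if __name__ == "__main__":
    print("exact nanoseconds:", exact)
    print("float step near this timestamp: %.1f ns" % float_step_ns)
```

For current epoch values the float step is a few hundred nanoseconds, which is why the `_ns` functions return integers. It says nothing about call overhead, which is what this issue is about.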
I'm not sure what this issue is about. If you want to propose a change, go ahead.
I have checked: my optimizations gain nothing, so I am closing the issue :(
Use Argument Clinic for time.clock_gettime() and time.clock_gettime_ns() functions.

Benchmark on time.clock_gettime_ns():

    import time
    import pyperf
    runner = pyperf.Runner()
    runner.timeit(
        'clock_gettime_ns(CLOCK_MONOTONIC_COARSE)',
        setup='import time; clock_gettime_ns=time.clock_gettime_ns; CLOCK_MONOTONIC_COARSE=6',
        stmt='clock_gettime_ns(CLOCK_MONOTONIC_COARSE)')

Result on Linux with CPU isolation:

    Mean +- std dev: [ref] 134 ns +- 1 ns -> [change] 55.7 ns +- 1.4 ns: 2.41x faster
I am reopening the issue: I wrote PR #111641, which makes clock_gettime() and clock_gettime_ns() up to 2x faster by using a faster calling convention (METH_VARARGS => METH_O).
#111641 was merged. Does it need backporting? If not, let's close this issue.
I dislike backporting optimizations. Better performance is not worth the risk of regression. I am closing the issue. Thanks @socketpair for the bug report.
clockid_t is defined as long long on AIX.
Feature or enhancement
Proposal:
Actually, this is a performance bug, not a feature/enhancement.
clock_gettime_ns() does not use floating-point operations and should be faster than the old gettimeofday(). Unfortunately, in Python it is not. Checked on Linux x86_64, kernel 6.5.7, CPython 3.11.6.
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response
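The comparison described in the proposal could be reproduced with a sketch like the one below. It is not the measurement used by the reporter or the maintainers (the thread uses pyperf); it relies only on the standard library `timeit` module. The helper names `per_call_ns` and `compare` are invented for illustration, and the value 5 for CLOCK_REALTIME_COARSE is the Linux constant, which the `time` module does not expose, so that clock is only tried on Linux.

```python
import sys
import time
import timeit

# Linux-specific clock id, not exposed by the time module.
# The value 5 comes from Linux's <time.h>; it is an assumption elsewhere.
CLOCK_REALTIME_COARSE = 5


def per_call_ns(func, *args, number=200_000, repeat=5):
    """Return the best observed cost of one func(*args) call, in nanoseconds.

    The lambda wrapper adds the same small overhead to every measurement,
    so the results are comparable with each other but not absolute.
    """
    timer = timeit.Timer(lambda: func(*args))
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9


def compare(number=200_000, repeat=5):
    """Time time.time() against the clock_gettime_ns() variants available here."""
    results = {"time.time()": per_call_ns(time.time, number=number, repeat=repeat)}
    if hasattr(time, "clock_gettime_ns"):
        results["clock_gettime_ns(CLOCK_REALTIME)"] = per_call_ns(
            time.clock_gettime_ns, time.CLOCK_REALTIME,
            number=number, repeat=repeat)
        if sys.platform.startswith("linux"):
            try:
                time.clock_gettime_ns(CLOCK_REALTIME_COARSE)
            except OSError:
                pass  # this kernel does not support the coarse clock
            else:
                results["clock_gettime_ns(CLOCK_REALTIME_COARSE)"] = per_call_ns(
                    time.clock_gettime_ns, CLOCK_REALTIME_COARSE,
                    number=number, repeat=repeat)
    return results


if __name__ == "__main__":
    for name, cost in compare().items():
        print(f"{name:45s} {cost:8.1f} ns per call")
```

Numbers from a quick run like this are noisy. The pyperf benchmark quoted in the thread, run with CPU isolation, is the more reliable way to measure differences of a few tens of nanoseconds.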