-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support 3.12 #246
Support 3.12 #246
Conversation
@Theelx Having a bit of trouble with this if you have some time to take a look. In 3.12 I'm getting an error when compiling:
Which looks like it is due to the types in the function: @cython.boundscheck(False)
@cython.wraparound(False)
cdef int python_trace_callback(object self_, PyFrameObject *py_frame, int what,
PyObject *arg): If I add @cython.boundscheck(False)
@cython.wraparound(False)
cdef int python_trace_callback(object self_, PyFrameObject *py_frame, int what,
PyObject *arg) noexcept: On 3.11 if I use @cython.boundscheck(False)
@cython.wraparound(False)
@cython.exceptval(check=False)
cdef int python_trace_callback(object self_, PyFrameObject *py_frame, int what,
PyObject *arg): But this seems to fail again on 3.12. From what I understand, we need to define a function with the same signature as Py_tracefunc. I'm not fluent in C symantics, so I don't know if a naked function like that is implicity "no except" or "possible except", but I assume it's a noexcept because it's integer return code is indicating the error state. I'm not sure why this worked in previous versions, perhaps cython got more restrictive about function signatures matching exception types? I'm also wondering if there is a condition that could cause an exception in that function and then require us to return |
on my phone atm but for the sake of a quick response, this is the same issue we got when going from cython 3.0a12 to b1 |
1 similar comment
on my phone atm but for the sake of a quick response, this is the same issue we got when going from cython 3.0a12 to b1 |
So I'm looking into why at the moment, but #203 doesn't have these issues with 3.12. Hopefully I can isolate what's causing the errors with 3.12 from that fork. |
@Erotemic I believe I just found the issue. Add |
Keeping the directives at the top of the pyx file lets you use the standalone cythonize tool instead of needing to invoke the entire setup.py, which I think is nice for developing. It probably also makes sense to indent to remove the legacy behavior and explicitly mark noexcept functions as such, so I' like to keep the directive as well unless it causes issues. The fix does allow me to compile on 3.12, but I'm still getting test failures I'll need to look into. For reference the failing tests are:
|
@Erotemic For getting rid of the Cython deprecation warnings which happen with the legacy implicit noexcept option, you can try putting @cython.boundscheck(False)
@cython.wraparound(False)
cdef extern int python_trace_callback(object self_, PyFrameObject *py_frame, int what,
PyObject *arg):
EDIT: This does not seem to fix the py.test tests, but I'm unsure why. It works fine for some personal scripts that I commonly use it to test for potential perf improvements. |
Using extern and no-decorator seem to compile just as well. I think the test failures must have to do with something in 3.12 itself, and not any Cython changes. Most other tests run fine. It's two tests for the new explicit profiler, and 1 test for the IPython magic that are the problem. Running
And on 3.12 it fails:
For the IPython test running tests/test_ipython.py::TestIPython::test_init Timer unit: 1e-09 s
Total time: 3.51e-07 s
File: <ipython-input-1-ae24564ca1b8>
Function: func at line 1
Line # Hits Time Per Hit % Time Line Contents
==============================================================
1 def func():
2 1 351.0 351.0 100.0 return 2**20
func_data=('<ipython-input-1-ae24564ca1b8>', 1, 'func')
lines_data=[(2, 1, 351)] In 3.12:
|
That's true, but the extern method is not deprecated, whereas the legacy compiler directive is, and iirc the decorator fails in pre-3.0 versions of Cython. Regardless, I agree that the test failures have something to do with 3.12,so I'll look into that. |
@Erotemic I wonder if maybe the testing harness is the issue? I just ran with_code_enable in its own file and it works fine. |
While pytest has caused issues for me in the past, I don't think it is the culprit here: Running outside of pytest via:
still results in the same errors. I also started up an IPython shell and ran the tests directly, and they also failed. |
@Theelx is there anything in this minimal working example that looks odd to you: def test_simple_explicit_nonglobal_usage():
from line_profiler import LineProfiler
import ubelt as ub
profiler = LineProfiler()
def func(a):
return a + 1
profiled_func = profiler(func)
# Run Zero Times
lstats = profiler.get_stats()
print('lstats.timings = {}'.format(ub.urepr(lstats.timings, nl=1)))
print('lstats.unit = {}'.format(ub.urepr(lstats.unit, nl=1)))
print(f'profiler.code_hash_map={profiler.code_hash_map}')
profiler.print_stats()
# Run Once
profiled_func(1)
lstats = profiler.get_stats()
print('lstats.timings = {}'.format(ub.urepr(lstats.timings, nl=1)))
print('lstats.unit = {}'.format(ub.urepr(lstats.unit, nl=1)))
print(f'profiler.code_hash_map={profiler.code_hash_map}')
profiler.print_stats()
# Run Multiple Times
for _ in range(10):
profiled_func(1)
lstats = profiler.get_stats()
print('lstats.timings = {}'.format(ub.urepr(lstats.timings, nl=1)))
print('lstats.unit = {}'.format(ub.urepr(lstats.unit, nl=1)))
print(f'profiler.code_hash_map={profiler.code_hash_map}')
profiler.print_stats()
test_simple_explicit_nonglobal_usage() In 3.11 it reports time correctly, but in 3.12 it is like the profiled function was never called. I'm not sure how anything can work if this simple case is failing. On 3.12:
On 3.11:
|
One thing that looks odd is that 3.11 shows the function as taking ~260us per hit, where it should take ~1us. I assume this is ubelt overhead, so my question is can this be replicated without using ubelt? |
Yes, ubelt is not necessary. It's also not the test harness. Here is a shell script that reproduces the example explicitly: echo "
def test_simple_explicit_nonglobal_usage():
from line_profiler import LineProfiler
profiler = LineProfiler()
def func(a):
return a + 1
profiled_func = profiler(func)
# Run Zero Times
lstats = profiler.get_stats()
print(f'lstats.timings={lstats.timings}')
print(f'lstats.unit={lstats.unit}')
print(f'profiler.code_hash_map={profiler.code_hash_map}')
profiler.print_stats()
# Run Once
profiled_func(1)
lstats = profiler.get_stats()
print(f'lstats.timings={lstats.timings}')
print(f'lstats.unit={lstats.unit}')
print(f'profiler.code_hash_map={profiler.code_hash_map}')
profiler.print_stats()
# Run Multiple Times
for _ in range(10):
profiled_func(1)
lstats = profiler.get_stats()
print(f'lstats.timings={lstats.timings}')
print(f'lstats.unit={lstats.unit}')
print(f'profiler.code_hash_map={profiler.code_hash_map}')
profiler.print_stats()
test_simple_explicit_nonglobal_usage()
" > mwe.py
python mwe.py In 3.12 the last print block is:
In 3.11 the last print block is:
The time seems to be about the same (although if I run for many more iterations, it goes to about 105), but isn't the unit in nanoseconds not microseconds?. I find myself easilly confused when dealing with units, so I typically like to write it out explicitly in pint, and then check that all conversions make sense. import pint
import ubelt as ub
ureg = pint.UnitRegistry()
s = ureg.seconds
ns = ureg.parse_expression('1 nanosecond')
μs = ureg.parse_expression('1 microsecond')
quantities = {}
quantities['prog_total'] = 2.3e-6 * ureg.seconds
quantities['line_total'] = 2328 * ns
quantities['perhit'] = 211 * ns
nhits = 11
quantities['line_total_derived'] = quantities['perhit'] * nhits
quantities = {k: v.to(μs) for k, v in quantities.items()}
print('quantities = {}'.format(ub.urepr(quantities, nl=1, align=':'))) This prints
which seems consistent to me when the numbers are interpreted as nanoseconds (which is 1e-9 as the timer unit reports). |
Maybe related: on 3.12.0 I see edit: only in async functions, so doesn't explain the mwe which I'm unable to reproduce. The example creating -1: import asyncio
@profile
async def function_1(depth):
if (0 < depth):
([_ async for _ in function_2(depth - 1)])
async def function_2(depth):
if (0 < depth):
yield ()
if (False):
yield
async def async_main():
async with asyncio.TaskGroup() as task_group:
task_group.create_task(function_1(2))
asyncio.run(async_main()) |
Ah yeah that's true. I forgot that Linux/BSD platforms use units of 1e-8 seconds instead of 1e-6 which used to be the default for all platforms. I am genuinely confused as to how this is happening, as it works fine for me when I install it and run it locally in the terminal on Ubuntu 22.04, with Python 3.12 behind pyenv. |
Just tried again and I'm able to reproduce now. Replacing diff --git a/line_profiler/_line_profiler.pyx b/line_profiler/_line_profiler.pyx
index f11b6b5..bfd5fbb 100644
--- a/line_profiler/_line_profiler.pyx
+++ b/line_profiler/_line_profiler.pyx
@@ -89,6 +89,8 @@ cdef extern from "Python.h":
cdef int PyTrace_C_EXCEPTION
cdef int PyTrace_C_RETURN
+ cdef int PyFrame_GetLineNumber(PyFrameObject *frame)
+
cdef extern from "timers.c":
PY_LONG_LONG hpTimer()
double hpTimerUnit()
@@ -425,7 +427,7 @@ cdef extern int python_trace_callback(object self_, PyFrameObject *py_frame,
if what == PyTrace_LINE or what == PyTrace_RETURN:
# Normally we'd need to DECREF the return from get_frame_code, but Cython does that for us
block_hash = hash(get_frame_code(py_frame))
- code_hash = compute_line_hash(block_hash, py_frame.f_lineno)
+ code_hash = compute_line_hash(block_hash, PyFrame_GetLineNumber(py_frame))
if self._c_code_map.count(code_hash):
time = hpTimer()
ident = threading.get_ident()
@@ -440,7 +442,7 @@ cdef extern int python_trace_callback(object self_, PyFrameObject *py_frame,
if what == PyTrace_LINE:
# Get the time again. This way, we don't record much time wasted
# in this function.
- self._c_last_time[ident][block_hash] = LastTime(py_frame.f_lineno, hpTimer())
+ self._c_last_time[ident][block_hash] = LastTime(PyFrame_GetLineNumber(py_frame), hpTimer())
elif self._c_last_time[ident].count(block_hash):
# We are returning from a function, not executing a line. Delete
# the last_time record. It may have already been deleted if we |
33aea57
to
dd29898
Compare
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #246 +/- ##
=======================================
Coverage 53.36% 53.36%
=======================================
Files 11 11
Lines 832 832
Branches 168 168
=======================================
Hits 444 444
Misses 329 329
Partials 59 59
... and 1 file with indirect coverage changes Continue to review full report in Codecov by Sentry.
|
@notEvil looks like that fixed it! I was wondering why this is the case and I found this: https://bugs.python.org/issue42823#msg384502 where Mark Shannon states:
So using the official C-API to get the line number is probably the correct thing to be doing. |
Updating CI to test Python 3.12