-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[stdlib] Mojo-Python slice interoperability #3549
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @jjvraw this is awesome, thank you for tackling it!
assert_equal("[]", str(a[5:])) | ||
assert_equal("[5, 4, 3, 2]", str(a[:-5:-1])) | ||
|
||
var b = Python.evaluate("[i for i in range(1000)]") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had no idea we could already do this 🤯
_ = d[1:3] | ||
|
||
var custom = Python.evaluate( | ||
"type('CustomSliceable', (), {'__getitem__': lambda self, key: key})()" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done and seen some shady metaclass manipulation in Python but I've never seen this manual use of type
🤣
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shady indeed: SO post
docs/changelog.md
Outdated
|
||
```mojo | ||
a = PythonObject([1, 2, ,3, 4]) | ||
a_reversed = a[::-1] | ||
``` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit (changelog file is very long already):
```mojo | |
a = PythonObject([1, 2, ,3, 4]) | |
a_reversed = a[::-1] | |
``` | |
```mojo | |
print(PythonObject([1, 2, ,3, 4])[::-1]) # [4, 3, 2, 1] |
Going to mark as draft, and wait for |
Made some additional changes. Since the compiler doesn't implicitly handle Mojo-Python conversion for slice, and the syntax Can always revert these, however. |
with_2d = Python.evaluate( | ||
"""type('With2D', (), { | ||
'__init__': lambda self: setattr(self, 'data', [[1, 2, 3], [4, 5, 6], [7, 8, 9]]), | ||
'__getitem__': lambda self, key: ( | ||
[row[key[1]] for row in self.data[key[0]]] if isinstance(key, tuple) and all(isinstance(k, slice) for k in key) | ||
else (self.data[key[0]][key[1]] if isinstance(key, tuple) | ||
else self.data[key]) | ||
) | ||
})()""" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, would integrating something like stdlib/test/python/my_module.py
be preferred over this..? @JoeLoser
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think refactoring these sort of things to a separate test file (class with methods like you have here) may improve readability over raw strings. Feel free to defer to a follow-up though :)
…r handling (#48447) [External] [stdlib] Integrate `PyObject_SetItem` for speedup and error handling Following the trend of: - #3549 - #3583 Again, noticing ~1.4x speedup on basic benchmarks: ```mojo for _ in range(iterations): var idx_a = random_si64(0, 999) var idx_b = random_si64(0, 9999) var idx_c = random_si64(0, 99999) a[idx_a] = idx_a b[idx_b] = idx_b c[idx_c] = idx_c ``` and improved error handling, preventing crashes. Co-authored-by: Joshua James Venter <venter.joshua@gmail.com> Closes #3595 MODULAR_ORIG_COMMIT_REV_ID: 107c0b478171f86869b7cb94e57f5ea59d83e994
self.Py_DecRef(py_step) | ||
|
||
return py_slice | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion This file (_cpython.mojo
) is mostly intended to be low level wrappers around functions in the CPython CAPI. Since PySlice_FromSlice
isn't a CPython function, I think the logic from this method might fit better manually inlined in the PythonObject
initializer:
struct PythonObject(..):
# ...
fn __init__(inout self, slice: Slice):
...
Additionally, instead of calling low-level methods like PyLong_FromLong()
or Py_None()
directly, the code might be improved by using the existing higher level safe wrappers for those functions (provided by PythonObject
as well). So the final code might look something like:
fn __init__(inout self, slice: Slice) -> PyObjectPtr:
# Convert Mojo Slice to Python slice parameters
# Note: Deliberately avoid using `span.indices()` here and instead pass
# the Slice parameters directly to Python. Python's C implementation
# already handles such conditions, allowing Python to apply its own slice
# handling.
var py_start = PythonObject(None)
var py_stop = PythonObject(None)
var py_step = PythonObject(slice.step)
if slice.start:
py_start = PythonObject(slice.start.value())
if slice.end:
py_stop = PythonObject(slice.end.value())
var py_slice = cpython.PySlice_New(
py_start.unsafe_as_py_object_ptr(),
py_stop.unsafe_as_py_object_ptr(),
py_step.unsafe_as_py_object_ptr()
)
return PythonObject(py_slice)
This has the advantage of allowing us to omit the manual ref count management logic as well 🙂
Let me know what you think, or if there's a subtlety I missed about how you're using that means this might not be feasible for your use-case.
For context, very roughly speaking, there are currently three files in the stdlib pertaining to interop, roughly in order from low-level to high-level:
-
_cpython.mojo
— low-level, "raw" bindings to the CPython API. In some future world, these might even become auto-generated bindings. They should have minimal logic. -
python_object.mojo
— definesPythonObject
, which interfaces between Mojo types and valid Python object values. -
python.mojo
— high-level, safe wrappers around functionality from the CPython API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, thank you for the time and feedback! :)
Let me know what you think, or if there's a subtlety I missed about how you're using that means this might not be feasible for your use-case.
I originally wanted to avoid PythonObject
within __getitem__
, as i noticed marginal gains. This was the intent of PySlice_FromSlice
, to construct a PyObjectPtr
. However, I overlooked the fact that the other From*
functions are actually part of the C API.
I agree with your point about maintaining the minimalism of _cpython.mojo
. What do you think of refactoring PySlice_FromSlice
not inline of __init__
but rather as a helper within python_object.mojo
/python.mojo
? This way we can maintain PyObjectPtr
within __getitem__
.
Additionally, instead of calling low-level methods like PyLong_FromLong() or Py_None() directly, the code might be improved by using the existing higher level safe wrappers for those functions (provided by PythonObject as well). So the final code might look something like:
On this note, I noticed what seems to be lifetime issues. To apply this suggestion, I would have to do something like the following:
...
cpython = _get_global_python_itf().cpython()
var py_slice = cpython.PySlice_New(
py_start.unsafe_as_py_object_ptr(),
py_stop.unsafe_as_py_object_ptr(),
py_step.unsafe_as_py_object_ptr()
)
_ = py_start^
_ = py_stop^
_ = py_step^
return py_slice
This might be worth exploring, following 400866c
Hey @jjvraw thanks for tackling this, and all your related work in this area recently 🙂 I did a quick pass and made a couple of suggestions for some possible improvements. Let me know what you think, or if there's anything I might have missed. Thanks again for the 🔥 contributions, this is great polish work that will make interop that much more pleasant and intuitive to use 🙂 |
Hey Connor. No problem :) Please see my comment, and changes of the last commit 39a9e23. Note that I don't have a strong opinion on these, and happy to apply changes. |
…r handling (#48447) [External] [stdlib] Integrate `PyObject_SetItem` for speedup and error handling Following the trend of: - modularml#3549 - modularml#3583 Again, noticing ~1.4x speedup on basic benchmarks: ```mojo for _ in range(iterations): var idx_a = random_si64(0, 999) var idx_b = random_si64(0, 9999) var idx_c = random_si64(0, 99999) a[idx_a] = idx_a b[idx_b] = idx_b c[idx_c] = idx_c ``` and improved error handling, preventing crashes. Co-authored-by: Joshua James Venter <venter.joshua@gmail.com> Closes modularml#3595 MODULAR_ORIG_COMMIT_REV_ID: 107c0b478171f86869b7cb94e57f5ea59d83e994 Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>
!sync |
!sync |
Hmm, well both of the mentioned PRs broke the prior implementation--not sure if these were imported first. |
with_2d = Python.evaluate( | ||
"""type('With2D', (), { | ||
'__init__': lambda self: setattr(self, 'data', [[1, 2, 3], [4, 5, 6], [7, 8, 9]]), | ||
'__getitem__': lambda self, key: ( | ||
[row[key[1]] for row in self.data[key[0]]] if isinstance(key, tuple) and all(isinstance(k, slice) for k in key) | ||
else (self.data[key[0]][key[1]] if isinstance(key, tuple) | ||
else self.data[key]) | ||
) | ||
})()""" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think refactoring these sort of things to a separate test file (class with methods like you have here) may improve readability over raw strings. Feel free to defer to a follow-up though :)
var d = Python.dict() | ||
d["a"] = 1 | ||
d["b"] = 2 | ||
with assert_raises(contains="slice(1, 3, None)"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI this fails internally in CI since it says unhashable type: 'slice'
. I haven't looked closely as to why it fails internally but passes in OSS CI (perhaps different CPython versions as we're using 3.11 internally it looks like primarily). I disabled the test so I can merge this, but we should look into it as a follow-up.
✅🟣 This contribution has been merged 🟣✅ Your pull request has been merged to the internal upstream Mojo sources. It will be reflected here in the Mojo repository on the nightly branch during the next Mojo nightly release, typically within the next 24-48 hours. We use Copybara to merge external contributions, click here to learn more. |
Landed in a41934b! Thank you for your contribution 🎉 |
[External] [stdlib] Mojo-Python slice interoperability Leveraging Mojo's `Slice` for more natural interop between `PythonObject`s, enabling something like below: ```mojo a = PythonObject([1, 2, 3, 4]) a_reversed = a[::-1] ``` - We leave C Python to handle slice parameters. - Negative cases such as `b[1.3:10]` or `b["1":10]` are handled by parser which would normally throw a `TypeError` in Python. Co-authored-by: Joshua James Venter <venter.joshua@gmail.com> Closes #3549 MODULAR_ORIG_COMMIT_REV_ID: d88cc6a73af06bfb317d6d993611f013c706bce7
Leveraging Mojo's
Slice
for more natural interop betweenPythonObject
s, enabling something like below:b[1.3:10]
orb["1":10]
are handled by parser which would normally throw aTypeError
in Python.