Add pythonizations for collection subscript (#570)
* added pythonizer base class
* added collection subscript pythonization
* Update python/podio/pythonizations/__init__.py
  Co-authored-by: Juan Miguel Carceller <22276694+jmcarcell@users.noreply.github.com>
* collection `__getitem__` uses `at` wrapped to throw python exception
* fix exception stacktrace readability
* Update python/podio/pythonizations/collection_subscript.py
  Co-authored-by: Thomas Madlener <thomas.madlener@desy.de>
* split pythonization callback to predicate and modification
* added documentation
* Applied suggestions in docs
  Co-authored-by: Thomas Madlener <thomas.madlener@desy.de>

---------

Co-authored-by: Juan Miguel Carceller <22276694+jmcarcell@users.noreply.github.com>
Co-authored-by: Thomas Madlener <thomas.madlener@desy.de>
1 parent 0222a07, commit a18229b
Showing 8 changed files with 174 additions and 0 deletions.
@@ -0,0 +1,56 @@
# Python interface for data models

Podio supports a Python interface for the generated data models. The [design choice](design.md) to create a Python interface resembling the C++ interface is achieved by generating Python bindings from the C++ interface using
[cppyy](https://cppyy.readthedocs.io/en/latest/index.html). To make pyROOT aware of the bindings, the cppyy functionality bundled with ROOT can be used.

It's important to note that cppyy loads the bindings and presents them lazily at runtime to the Python interpreter, rather than writing Python interface files. Consequently, the Python bindings have runtime dependencies on ROOT, cppyy and the data model's C++ interface.

To load the Python bindings from a generated C++ model dictionary, first make sure the model's library and headers can be found in `LD_LIBRARY_PATH` and `ROOT_INCLUDE_PATH` respectively, then:

```python
import ROOT

res = ROOT.gSystem.Load('libGeneratedModelDict.so')
if res < 0:
    raise RuntimeError('Failed to load libGeneratedModelDict.so')
```

For reference usage, see the [Python module of EDM4hep](https://github.com/key4hep/EDM4hep/blob/main/python/edm4hep/__init__.py).

## Pythonizations

Python uses different constructions and conventions than C++, so perfectly fine C++ code translated one to one to Python can be clunky by Python's standards. cppyy offers a mechanism called [pythonizations](https://cppyy.readthedocs.io/en/latest/pythonizations.html) to make the resulting bindings more pythonic. Some basic pythonizations are included automatically (for instance `operator[]` is translated to `__getitem__`), but others can be specified by a user.

Podio comes with its own set of pythonizations useful for the data models generated with it. To apply all the provided pythonizations to a `model_namespace` namespace:

```python
from podio.pythonizations import load_pythonizations

load_pythonizations("model_namespace")
```

If only specific pythonizations should be applied:

```python
from podio.pythonizations import collection_subscript # specific pythonization

collection_subscript.CollectionSubscriptPythonizer.register("model_namespace")
```

### Developing new pythonizations

To be discovered by `load_pythonizations`, any new pythonization should be placed in `podio.pythonizations` and derive from the abstract class `podio.pythonizations.utils.pythonizer.Pythonizer`.

A pythonization class should implement the following three class methods:

- `priority`: Returns an integer. The `load_pythonizations` function applies the pythonizations in increasing order of their `priority`.
- `filter`: A predicate that selects the classes to which the given pythonization should be applied. See the [cppyy documentation](https://cppyy.readthedocs.io/en/latest/pythonizations.html#python-callbacks).
- `modify`: Applies the modifications to the selected classes.

### Considerations

The cppyy pythonizations come with some considerations:

- The general cppyy idea of lazily loading only what is needed applies only partially to the pythonizations. For instance, a pythonization modifying `collection[]` will be applied the first time a class of `collection` is used, regardless of whether `collection[]` is actually used.
- Each pythonization is applied to all the entities in a namespace and relies on a conditional mechanism (the `filter` method) inside the pythonization to select the entities it modifies. With a large number of pythonizations, the overheads add up and slow down the first use of any class from a pythonized namespace.
- The cppyy bindings that hook into the C++ routines perform well compared to ordinary Python code. The pythonizations themselves are written in Python and run at ordinary Python speed.
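The three class methods described under "Developing new pythonizations" can be tried out without cppyy or podio installed. The following is a minimal sketch under stated assumptions: `Pythonizer` here is a simplified stand-in for `podio.pythonizations.utils.pythonizer.Pythonizer`, and `FakeCollection`, `FakeHit`, `CollectionTruthinessPythonizer` and the `apply` helper are hypothetical names invented for illustration. None of them are part of podio.

```python
from abc import ABCMeta, abstractmethod


class Pythonizer(metaclass=ABCMeta):
    """Simplified stand-in for podio's Pythonizer base class (assumption)."""

    @classmethod
    @abstractmethod
    def priority(cls):
        """Return an int; lower values are applied first."""

    @classmethod
    @abstractmethod
    def filter(cls, class_, name):
        """Return True if class_ should be pythonized."""

    @classmethod
    @abstractmethod
    def modify(cls, class_, name):
        """Patch class_ in place."""

    @classmethod
    def apply(cls, class_, name):
        """Hypothetical helper: run filter, then modify, for one class."""
        if cls.filter(class_, name):
            cls.modify(class_, name)


class FakeCollection:
    """Hypothetical stand-in for a generated collection class."""

    def __init__(self, items):
        self._items = list(items)

    def empty(self):
        return not self._items


class FakeHit:
    """Hypothetical non-collection class that must stay untouched."""


class CollectionTruthinessPythonizer(Pythonizer):
    """Hypothetical pythonization: `if collection:` reflects emptiness."""

    @classmethod
    def priority(cls):
        return 60

    @classmethod
    def filter(cls, class_, name):
        # Select only collection-like classes
        return issubclass(class_, FakeCollection)

    @classmethod
    def modify(cls, class_, name):
        class_.__bool__ = lambda self: not self.empty()


# Apply the pythonization to every class, as a namespace-wide callback would
for klass in (FakeCollection, FakeHit):
    CollectionTruthinessPythonizer.apply(klass, klass.__name__)
```

In this sketch only `FakeCollection` gains a `__bool__` method; `FakeHit` is rejected by `filter` and left unchanged. The `filter` call still runs for both classes, which is the per-class overhead mentioned in the considerations.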
@@ -0,0 +1,15 @@
"""cppyy pythonizations for podio"""

from importlib import import_module
from pkgutil import walk_packages
from .utils.pythonizer import Pythonizer


def load_pythonizations(namespace):
    """Register all available pythonizations for a given namespace"""
    module_names = [name for _, name, _ in walk_packages(__path__) if not name.startswith("test_")]
    for module_name in module_names:
        import_module(__name__ + "." + module_name)
    pythonizers = sorted(Pythonizer.__subclasses__(), key=lambda x: x.priority())
    for i in pythonizers:
        i.register(namespace)
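The ordering step in `load_pythonizations` can be illustrated in isolation. This is a hedged sketch, not the podio code: `Base`, `Late`, `Early` and `Nested` are hypothetical classes, and `register` only records the call instead of talking to cppyy.

```python
class Base:
    """Hypothetical stand-in for the Pythonizer base class."""

    applied = []  # records the order of register() calls

    @classmethod
    def register(cls, namespace):
        Base.applied.append((cls.__name__, namespace))


class Late(Base):
    @classmethod
    def priority(cls):
        return 90


class Early(Base):
    @classmethod
    def priority(cls):
        return 10


class Nested(Early):
    """Subclass of a subclass: not returned by Base.__subclasses__()."""

    @classmethod
    def priority(cls):
        return 0


# Same pattern as in load_pythonizations: sort direct subclasses by priority
for pythonizer in sorted(Base.__subclasses__(), key=lambda x: x.priority()):
    pythonizer.register("model_namespace")
```

`Early` is registered before `Late` even though it was defined later, because the order follows `priority`. Since `__subclasses__()` returns direct subclasses only, `Nested` is never registered here; a pythonizer that derives from another concrete pythonizer would presumably be skipped by this pattern as well.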
@@ -0,0 +1,26 @@
"""Pythonize subscript operation for collections"""

import cppyy
from .utils.pythonizer import Pythonizer


class CollectionSubscriptPythonizer(Pythonizer):
    """Bound-check __getitem__ for classes derived from podio::CollectionBase"""

    @classmethod
    def priority(cls):
        return 50

    @classmethod
    def filter(cls, class_, name):
        return issubclass(class_, cppyy.gbl.podio.CollectionBase)

    @classmethod
    def modify(cls, class_, name):
        def get_item(self, i):
            try:
                return self.at(i)
            except cppyy.gbl.std.out_of_range:
                raise IndexError("collection index out of range") from None

        class_.__getitem__ = get_item
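The exception translation done by `get_item` can be reproduced without cppyy. In the sketch below, `OutOfRange` is a hypothetical stand-in for `cppyy.gbl.std.out_of_range` and `FakeCollection` is a hypothetical stand-in for a class derived from `podio::CollectionBase`; only the wrapping pattern is taken from the commit.

```python
class OutOfRange(Exception):
    """Hypothetical stand-in for cppyy.gbl.std.out_of_range."""


class FakeCollection:
    """Hypothetical stand-in for a class derived from podio::CollectionBase."""

    def __init__(self, items):
        self._items = list(items)

    def at(self, i):
        # Mimics a bounds-checked C++ accessor
        if not 0 <= i < len(self._items):
            raise OutOfRange(f"index {i} out of range")
        return self._items[i]


def get_item(self, i):
    try:
        return self.at(i)
    except OutOfRange:
        # `from None` keeps the stand-in C++ exception out of the traceback
        raise IndexError("collection index out of range") from None


FakeCollection.__getitem__ = get_item

hits = FakeCollection(["a", "b"])
first = hits[0]
try:
    hits[5]
except IndexError as err:
    caught = err
```

Callers see a plain `IndexError`, and because of `from None` the original exception is not chained into the traceback, which matches the "fix exception stacktrace readability" item in the commit message.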
Empty file.
@@ -0,0 +1,57 @@
"""cppyy pythonizations for podio"""

from abc import ABCMeta, abstractmethod
import cppyy


class Pythonizer(metaclass=ABCMeta):
    """
    Base class to define cppyy pythonization for podio
    """

    @classmethod
    @abstractmethod
    def priority(cls):
        """Order in which the pythonizations are applied
        Returns:
            int: Priority
        """

    @classmethod
    @abstractmethod
    def filter(cls, class_, name):
        """
        Abstract classmethod to filter classes to which the pythonizations should be applied
        Args:
            class_ (type): Class object.
            name (str): Name of the class.
        Returns:
            bool: True if class should be pythonized.
        """

    @classmethod
    @abstractmethod
    def modify(cls, class_, name):
        """Abstract classmethod modifying classes to be pythonized
        Args:
            class_ (type): Class object.
            name (str): Name of the class.
        """

    @classmethod
    def register(cls, namespace):
        """Helper method to apply the pythonization to the given namespace
        Args:
            namespace (str): Namespace to be pythonized
        """

        def pythonization_callback(class_, name):
            if cls.filter(class_, name):
                cls.modify(class_, name)

        cppyy.py.add_pythonization(pythonization_callback, namespace)