Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potentially remove all those pesky .pyi files #1412

Open
BuildStream-Migration-Bot opened this issue Feb 4, 2021 · 6 comments
Open

Potentially remove all those pesky .pyi files #1412

BuildStream-Migration-Bot opened this issue Feb 4, 2021 · 6 comments
Labels
refactoring Reduce the cost of ownership of our codebase

Comments

@BuildStream-Migration-Bot

See original issue on GitLab
In GitLab by [Gitlab user @tristanvb] on Dec 2, 2020, 10:07

Apparently, it is possible to encode pep484 type information directly into cython modules

It could be nice to remove all the redundant forward declarations...

Needs investigation, possibly we need to use a mypyc extension for validation in CI.

@BuildStream-Migration-Bot BuildStream-Migration-Bot added the refactoring Reduce the cost of ownership of our codebase label Feb 4, 2021
@BuildStream-Migration-Bot
Copy link
Author

In GitLab by [Gitlab user @tristanvb] on Dec 2, 2020, 10:11

changed the description

@abderrahim
Copy link
Contributor

@gtristan I tried to look into this: #1547 ports buildstream._types and buildstream._utils to use the pure-python cython syntax (The former is already in python, the latter needed some porting).

Regarding mypyc, it seems to be an alternative to cython based on mypy rather than an extension for mypy to support checking cython code. We could probably try to use it instead of cython at some point but the drawback is that it can't do things you can't do in pure python (In particular, it seems buildstream._utils is trying to call into cpython internals to raise an exception in a thread).

@gtristan
Copy link
Contributor

@gtristan I tried to look into this: #1547 ports buildstream._types and buildstream._utils to use the pure-python cython syntax (The former is already in python, the latter needed some porting).

Regarding mypyc, it seems to be an alternative to cython based on mypy rather than an extension for mypy to support checking cython code. We could probably try to use it instead of cython at some point but the drawback is that it can't do things you can't do in pure python (In particular, it seems buildstream._utils is trying to call into cpython internals to raise an exception in a thread).

Thanks for taking a look and clarifying. This issue was mostly a hopeful thought as I came across the referenced URLs and wanted to note these down.

@abderrahim
Copy link
Contributor

I've dug a little bit deeper on this trying to port a bigger module (_node.pyx). Newer Cython looks promising, but there are still some blockers like cython/cython#3957 and cython/cython#4252

@gtristan
Copy link
Contributor

Sorry for lack of feedback here I haven't had time to keep up.

Now that I look closely it looks like there is something called pure python mode in cython. Reading that page, I doubt we want to get rid of the .pyx files in some of the hotter code paths (e.g. _node.pyx and _variables.pyx), as these should get better performance by leveraging cython constructs.

@abderrahim
Copy link
Contributor

There are actually three "pure python" modes in cython:

  1. good old "pure python" code
  2. good old "pure python" + hot code paths rewritten in cython syntax in a separate .pxd file
  3. pure python with special cython decorators

All three can be executed as pure python (3. needs to have cython installed), and can be compiled for performance gain (1. offers modest gains, while 2. and 3. can leverage the full potential of cython). 2. has the obvious downside of having to keep two implementations in sync for the hot paths.

What I was investigating is 3. which should allow us to get rid of the cython syntax without losing the other advantages of cython. Currentyl the only missing feature I noticed is cdef enum (cython/cython#4252). Then there is the huge bug that shows that nobody is currently using this in production (cython/cython#3957).

What I'm currently investigating is 4. mypyc. mypyc is an alternative to cython. It is used by two tools we depend on (mypy and black) so it's used in production (even though it is still considered alpha software). mypyc compiles (a growing subset of) real pure python with type annotations. It is supposed to get a modest speedup when compiling code without type annotations, and a good speedup when adding type annotations.

The upside is that we'd end up with real pure python that can be executed without a compilation step during e.g. debug-edit cycle, and potentially compile the whole tool (with heavy type annotations on the hot paths) on release.

Currently, trying to port _node.pyx, I'm struggling mainly with the fact that mypy takes None-ability very seriously. So if something can be None, it should be checked before being used.

I think there is a buildstream benchmark somewhere. It would be nice to use it to test this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactoring Reduce the cost of ownership of our codebase
Projects
None yet
Development

No branches or pull requests

3 participants