Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Play around with variable-length instructions (DRAFT!) #101160

Closed
wants to merge 17 commits into from

Conversation

gvanrossum
Copy link
Member

This picks a silly example intentionally: replace the sequence LOAD_CONST, MAKE_FUNCTION with a variable-length instruction MAKE_FUNCTION_FROM_CODE. The replacement is done as part of quickening (so effectively at the end of compilation). I plan to implement faster-cpython/ideas#540 and faster-cpython/ideas#539 (though the current code for EXTENDED_ARG_3 is different -- it only extends oparg3, so that I can patch it on top of EXTENDED_ARG when quickening).

Things I've learned so far:

  • It was useful to swap the opargs compared to the original instruction pair -- this helped me realize that locale.py has lots of constants so there was an interesting bug (now fixed) when we had EXTENDED_ARG 5, LOAD_CONST n, MAKE_FUNCTION 0.
  • The lldebug machinery is subtly broken: the output is sometimes messed up, and printing the stack sometimes makes it enter infinite recursion. But it did help me identify the previous problem.
  • After adding a new opcode, don't forget to run make regen-opcode regen-opcode-targets.

(This version is a one-word instruction that only extends oparg3,
so we can replace EXTENDED_ARG with it.)
@gvanrossum
Copy link
Member Author

BTW I didn't bother to fix test_dis.py, it's a chore.

@gvanrossum
Copy link
Member Author

Some next steps:

  • Get rid of the code in specialize.c that replaces LOAD_CONST + MAKE_FUNCTION with MAKE_FUNCTION_FROM_CODE; it's now done in the compiler.
  • Redesign EXTENDED_ARG_3 to actually take 3 args. The interim version only took one arg so I could patch an existing EXTENDED_ARG in specialize.c, which is no longer needed.
  • Teach the compiler all about EXTENDEDED_ARG_3, and other parts of the code that need to know.
  • Everywhere I check for MAKE_FUNCTION_FROM_CODE, instead use a check for opcode size (this probably requires generating a table mapping opcodes to their opcode size).

A big open question: What will become of the code that "searches backwards" for the full oparg (in get_arg() in frameobject.c)? I've always thought that was dicey, but it may become untenable (even for unquickened code).

It's now done in the compiler.
Don't delete it, since this is only an experiment.
But assure it's not used anywhere.
@gvanrossum
Copy link
Member Author

gvanrossum commented Jan 23, 2023

A big open question: What will become of the code that "searches backwards" for the full oparg (in get_arg() in frameobject.c)? I've always thought that was dicey, but it may become untenable (even for unquickened code).

Oh, that should just be done by properly decoding the instructions as we go in mark_stacks(). Basically 3-arg instructions invalidate all bytecode scanning code that assumes every word is an instruction. I should have a look at what Irit did in her register machine branch. I might just punt this for now and let mark_stacks() or frame_setlineno() always raise an exception.

(UPDATE) There's another bytecode scanner in _PyFrame_OpAlreadyRan(). Sigh.

@arhadthedev
Copy link
Member

As an alternative: can we keep fixed-length instructions but compress the whole pyc file with zlib instead?

The procedure would be: serialize the whole pyc-file into a memory buffer and deflate in onto a disc; to load it back inflate it into the memory in full, close the file and do the rest as usual.

It should work because benefits of memory mapping for execution is just an idea for now. Moreover, it would be irrelevant with faster-cpython/ideas#538 allowing to drop the whole inflated buffer immediately after quickening all its content.

@gvanrossum
Copy link
Member Author

As an alternative: can we keep fixed-length instructions but compress the whole pyc file with zlib instead?

But what length should instructions have? 4 bytes still isn't enough for operators like BINARY_OP_R that need four arguments (and yes, we debated endlessly if we could do it with three -- the answer is, not easily). There's also cache sizes.

Compressing with zlib is a sepatate topic, we could do that at any time but we've never felt that disk space for PYC files is much of a problem, so why would we spend the extra CPU and memory on (de)compression.

Also, you probably meant to follow up in one of the "ideas" issues, not in this PR.

@gvanrossum
Copy link
Member Author

According to the test output, test_dis is the only failing test. I'm still not doing anything about it.

@arhadthedev
Copy link
Member

Also, you probably meant to follow up in one of the "ideas" issues, not in this PR.

My bad; I posted an answer on instruction length in faster-cpython/ideas#540 (comment).

@gvanrossum
Copy link
Member Author

As this PR is in limbo while I figure out how to complete this work, I might just save some less complicated pieces of it in a new PR.

@AlexWaygood AlexWaygood changed the title Play around with variable-lengtth instructions (DRAFT!) Play around with variable-length instructions (DRAFT!) Jan 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants