Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-113464: Speed up JIT builds #122839

Merged
merged 2 commits into from
Aug 14, 2024
Merged

Conversation

brandtbucher
Copy link
Member

@brandtbucher brandtbucher commented Aug 8, 2024

The current JIT build includes the code for every instruction when compiling each stencil, even though only one of the cases is used.

This extracts the desired cases and compiles them each in isolation, which makes JIT builds almost twice as fast (except on Windows, where I suspect we're still bound by subprocess creation time).

@brandtbucher brandtbucher added build The build process and cross-build topic-JIT labels Aug 8, 2024
@brandtbucher brandtbucher self-assigned this Aug 8, 2024
@bedevere-app bedevere-app bot mentioned this pull request Aug 8, 2024
Copy link
Member

@savannahostrowski savannahostrowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only question is how, if at all, memory usage is impacted by this change, given the independent compilation.

Otherwise, this is just a tiny comment about adding a comment.

for opname in opnames:
coro = self._compile(opname, TOOLS_JIT_TEMPLATE_C, work)
template = TOOLS_JIT_TEMPLATE_C.read_text()
for case, opname in cases_and_opnames:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be good to add a comment here about why each of these is compiled independently (e.g., because of the performance benefit)? I'm not sure that would be abundantly clear if I just stumbled upon this code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I'll add a comment.

@brandtbucher
Copy link
Member Author

My only question is how, if at all, memory usage is impacted by this change, given the independent compilation.

The good news is that memory usage should go down, though temporary disk usage will go up.

We were always compiling each opcode independently, but all of the other C code for all of the opcodes not being compiled was part of the same file. Now we are writing out files to disk containing just the C code for each opcode.

As an example, before, we would compile this three times and extract the machine code of each opcode afterwards:

_Py_CODEUNIT *
_JIT_ENTRY(...)
{
    // ...
    switch (FOO) {
        case FOO: {
            // Code for FOO...
        }
        case BAR: {
            // Code for BAR (unused here)...
        }
        case BAZ: {
            // Code for BAZ (unused here)...
        }
    }
    // ...
}
_Py_CODEUNIT *
_JIT_ENTRY(...)
{
    // ...
    switch (BAR) {
        case FOO: {
            // Code for FOO (unused here)...
        }
        case BAR: {
            // Code for BAR...
        }
        case BAZ: {
            // Code for BAZ (unused here)...
        }
    }
    // ...
}
_Py_CODEUNIT *
_JIT_ENTRY(...)
{
    // ...
    switch (BAZ) {
        case FOO: {
            // Code for FOO (unused here)...
        }
        case BAR: {
            // Code for BAR (unused here)...
        }
        case BAZ: {
            // Code for BAZ...
        }
    }
    // ...
}

Now, we only include the necessary case each time, so the C compiler doesn't waste time parsing a bunch of dead C code for the other cases

_Py_CODEUNIT *
_JIT_ENTRY(...)
{
    // ...
    switch (FOO) {
        case FOO: {
            // Code for FOO...
        }
    }
    // ...
}
_Py_CODEUNIT *
_JIT_ENTRY(...)
{
    // ...
    switch (BAR) {
        case BAR: {
            // Code for BAR...
        }
    }
    // ...
}
_Py_CODEUNIT *
_JIT_ENTRY(...)
{
    // ...
    switch (BAZ) {
        case BAZ: {
            // Code for BAZ...
        }
    }
    // ...
}

Hopefully that helps explain what's going on here? It's a bit weird.

@brandtbucher brandtbucher merged commit 5118592 into python:main Aug 14, 2024
63 checks passed
blhsing pushed a commit to blhsing/cpython that referenced this pull request Aug 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build The build process and cross-build skip news topic-JIT
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants