@github-actions github-actions bot commented Dec 2, 2025

Removing .pyc files after compilation saves very little space.
Uncompressed sizes of the regular airflow image are:

Before (.pyc files removed)  7.63GB
After (.pyc files kept)      7.66GB

So the images get bigger by < 0.5%
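
As a quick check of that figure, the relative growth can be worked out from the two sizes quoted above. This sketch assumes "Before" is the image with .pyc files removed and "After" is the image that keeps them:

```python
# Uncompressed image sizes quoted above, in GB.
size_pyc_removed_gb = 7.63  # "Before"
size_pyc_kept_gb = 7.66     # "After"

# Relative growth of the image when the .pyc files are kept.
growth_percent = (size_pyc_kept_gb - size_pyc_removed_gb) / size_pyc_removed_gb * 100
print(f"Image grows by {growth_percent:.2f}%")  # Image grows by 0.39%
```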

And it seems that long-running containers without those files can
suffer from continuous attempts to recreate the .pyc files. Those
attempts fail due to lack of permissions and cause negative
dentries to be continuously created:

https://lwn.net/Articles/814535/

Those negative dentries are created by the kernel - it caches the
fact that a file was not found - which speeds up lookups but also
takes a bit of memory. It seems that when the .pyc files are
removed after compilation, Python tries to recreate them with
timestamped entries every time a new interpreter is started.
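
One way to watch this happening is to sample the kernel's dentry counters. The sketch below is only an illustration: it assumes a Linux host where `/proc/sys/fs/dentry-state` is readable and where the fifth field is the negative dentry count (`nr_negative`, which, as far as we know, is reported by kernel 5.0 and later and is unused on older kernels). The helper names are made up for this example.

```python
from pathlib import Path
from typing import NamedTuple


class DentryState(NamedTuple):
    """First five fields of /proc/sys/fs/dentry-state."""

    nr_dentry: int    # total allocated dentries
    nr_unused: int    # dentries not actively used
    age_limit: int    # age in seconds before reclaim is considered
    want_pages: int   # pages requested by the system
    nr_negative: int  # unused dentries that are negative (assumed: kernel 5.0+)


def parse_dentry_state(text: str) -> DentryState:
    """Parse the whitespace-separated integers of dentry-state."""
    fields = [int(value) for value in text.split()]
    return DentryState(*fields[:5])


def read_dentry_state(path: str = "/proc/sys/fs/dentry-state") -> DentryState:
    """Read and parse the current counters (Linux only)."""
    return parse_dentry_state(Path(path).read_text())
```

Calling `read_dentry_state().nr_negative` before and after a batch of short-lived interpreters should show whether the count keeps growing.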

While this is not a problem for long-running processes - because
those interpreters are started exactly once per container - it is
a problem if you use `exec` in containers to run health checks.

Every health check starts a new interpreter, and every time one
is started, new negative dentries are created that take up kernel
memory.
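
To see whether a given image is exposed to this, one can list the modules that have no cached bytecode, since those are the ones every new interpreter would try to compile again. A minimal sketch using the standard library; the function name `missing_bytecode` is hypothetical:

```python
import importlib.util
import os
from pathlib import Path


def missing_bytecode(root: str) -> list[str]:
    """Return the .py files under root that have no cached .pyc file."""
    missing = []
    for source in sorted(Path(root).rglob("*.py")):
        # Where the running interpreter expects the bytecode, e.g.
        # __pycache__/<name>.<cache_tag>.pyc next to the source file.
        cached = importlib.util.cache_from_source(str(source))
        if not os.path.exists(cached):
            missing.append(str(source))
    return missing
```

Run against a site-packages directory inside the container, an empty result suggests that health-check interpreters have nothing left to recompile.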

By not removing the .pyc files we slightly increase the size of
the image, but we slightly improve the startup time (no need to
compile Python internal .py files) and get rid of the negative
dentries problem.
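
For illustration, precompiling and keeping the bytecode can be sketched with the standard library `compileall` module. This is not necessarily how the image build does it; the `precompile` helper is a hypothetical name used only here:

```python
import compileall


def precompile(directory: str) -> bool:
    """Byte-compile every .py file under directory and keep the .pyc files.

    Returns True if all files compiled successfully.
    """
    # quiet=1 prints only errors; the .pyc files land in __pycache__
    # directories next to the sources and are simply left in place.
    return bool(compileall.compile_dir(directory, quiet=1))
```

The important part is what does not happen afterwards: no cleanup step deletes the generated `__pycache__` directories.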

This PR likely:
(cherry picked from commit bcda508)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
Fixes: #58509
Fixes: #42195

…58944)

@boring-cyborg boring-cyborg bot added area:dev-tools, area:production-image, backport-to-v3-1-test, kind:documentation labels Dec 2, 2025
@ephraimbuddy ephraimbuddy added type:misc/internal and removed backport-to-v3-1-test labels Dec 2, 2025
@ephraimbuddy ephraimbuddy marked this pull request as ready for review December 2, 2025 15:06
@ephraimbuddy ephraimbuddy merged commit ba9a6a9 into v3-1-test Dec 2, 2025
87 checks passed
@ephraimbuddy ephraimbuddy deleted the backport-bcda508-v3-1-test branch December 2, 2025 17:23
ephraimbuddy pushed a commit that referenced this pull request Dec 3, 2025
…58944) (#58947)

Labels

area:dev-tools, area:production-image, kind:documentation, type:misc/internal

4 participants