Fix reproducibility of prepared provider packages (fix flit frontend) #43683

Merged


potiuk
Member

@potiuk potiuk commented Nov 5, 2024

After some checks, it turned out that the reproducibility of the produced packages depends not only on the build backend configured for the project but also on the build front-end used. The front-end is the one that modifies metadata in the prepared packages, including the build tool used, its version, and the metadata version supported by the front-end.
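
To see which tool information ended up in a given artifact, the recorded fields can be read back out of it. The following is only a minimal sketch, not part of this PR: it assumes a wheel laid out according to the wheel specification, where `*.dist-info/WHEEL` carries a `Generator` field and `*.dist-info/METADATA` carries `Metadata-Version`. The function name `read_build_fingerprint` is hypothetical.

```python
import zipfile
from email.parser import Parser


def read_build_fingerprint(wheel_path):
    """Return the tool-related fields recorded in a wheel's .dist-info files.

    Hypothetical helper for illustration; accepts a path or a file-like object.
    """
    fingerprint = {}
    with zipfile.ZipFile(wheel_path) as wheel:
        for name in wheel.namelist():
            if name.endswith(".dist-info/WHEEL"):
                # WHEEL uses RFC 822 style "Key: value" headers.
                headers = Parser().parsestr(wheel.read(name).decode("utf-8"))
                fingerprint["Generator"] = headers.get("Generator")
            elif name.endswith(".dist-info/METADATA"):
                headers = Parser().parsestr(wheel.read(name).decode("utf-8"))
                fingerprint["Metadata-Version"] = headers.get("Metadata-Version")
    return fingerprint
```

If two builds of the same package version report different values here, their archives cannot be byte-for-byte identical.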

That's why, in order to maintain reproducibility for anyone who builds the packages, we have to pin not only the build backend in pyproject.toml (flit-core) but also the build front-end used (flit).

Since package preparation is done with breeze, we can do this by pinning flit (and, just in case, also flit-core) so that anyone who builds a specific version of the package will use exactly the same flit as the person who built the original packages.
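
As an illustration of what such a pin guards against, here is a rough sketch of a pre-build check that compares installed versions with the pinned ones. It is not how breeze implements the pin. The function name `find_pin_mismatches` and any version numbers passed to it are assumptions made for the example; only `importlib.metadata.version` is a real standard-library call.

```python
from importlib.metadata import PackageNotFoundError, version


def find_pin_mismatches(pins, get_version=version):
    """Return {package: (expected, installed)} for every package that is off its pin.

    Hypothetical helper: `pins` maps distribution names to the exact version
    expected; `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, expected in pins.items():
        try:
            installed = get_version(package)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[package] = (expected, installed)
    return mismatches
```

A release script could call this with the pinned flit and flit-core versions and refuse to build when the returned dictionary is not empty.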

This way we will avoid the reproducibility problems experienced with the 1.5.0 release of FAB.
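
For anyone who wants to check reproducibility independently, one simple approach is to rebuild the packages and compare checksums with the originally published files. The sketch below assumes the two sets of artifacts sit in two local directories with matching file names. The names `sha256_of` and `find_non_reproducible` are hypothetical, and this is not the verification procedure used by the project.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_non_reproducible(original_dir, rebuilt_dir):
    """List file names from original_dir that are missing or differ in rebuilt_dir."""
    original_dir, rebuilt_dir = Path(original_dir), Path(rebuilt_dir)
    mismatches = []
    for original in sorted(original_dir.iterdir()):
        if not original.is_file():
            continue
        rebuilt = rebuilt_dir / original.name
        if not rebuilt.is_file() or sha256_of(original) != sha256_of(rebuilt):
            mismatches.append(original.name)
    return mismatches
```

An empty list means every original artifact was reproduced byte for byte.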



@potiuk potiuk force-pushed the fix-reproducibility-of-provider-packages branch from fd2b06c to 0dcf4ce Compare November 5, 2024 10:06
Contributor

@eladkal eladkal left a comment


oh wow

@potiuk potiuk merged commit 18ea01c into apache:main Nov 5, 2024
77 checks passed
@potiuk potiuk deleted the fix-reproducibility-of-provider-packages branch November 5, 2024 11:01
@@ -43,17 +43,3 @@ def get_python_version_list(python_versions: str) -> list[str]:
)
sys.exit(1)
return python_version_list

Member Author


BTW, this is removed because we do not need it any more: all our packages build with 3.12. They do not necessarily WORK with 3.12 (apache.beam), and we still have the exclusion there, but at least the packages can now also be built with Python 3.12.

Member Author


cc: @ashb I verified that the exclusion is indeed not needed, so I removed it while fixing package reproducibility:

[image]

@potiuk potiuk added this to the Airflow 2.10.4 milestone Nov 5, 2024
potiuk added a commit to potiuk/airflow that referenced this pull request Nov 5, 2024
…apache#43683)

(cherry picked from commit 18ea01c)
potiuk added a commit to potiuk/airflow that referenced this pull request Nov 5, 2024
…apache#43683)

(cherry picked from commit 18ea01c)
potiuk added a commit that referenced this pull request Nov 5, 2024
…#43683) (#43687)

(cherry picked from commit 18ea01c)
@gopidesupavan
Member

Woohooo nice :)

@potiuk
Member Author

potiuk commented Nov 5, 2024

Yeah. Build reproducibility is cool :) and surprisingly difficult.

ellisms pushed a commit to ellisms/airflow that referenced this pull request Nov 13, 2024
…apache#43683)

utkarsharma2 pushed a commit that referenced this pull request Dec 4, 2024
…#43683) (#43687)

(cherry picked from commit 18ea01c)