
EPIC: Reproducible Builds #2522

Open

smlambert opened this issue Mar 9, 2021 · 12 comments
Labels

  • compatibility: Issues that relate to how our code works with other third party code bases
  • documentation: Issues that request updates to our documentation
  • enhancement: Issues that enhance the code or documentation of the repo in any way
  • epic: Issues that are large and likely multi-layered features or refactors
  • testing: Issues that enhance or fix our test suites

Comments

@smlambert
Contributor

smlambert commented Mar 9, 2021

Related to AdoptOpenJDK/TSC#158, this issue is to lay out some of the build activities needed in order to achieve reproducible builds (as defined by https://reproducible-builds.org/). Creating this here, as it really should be independent of the CI system: a reproducible build system may eventually be fed to Jenkins/Azure DevOps or used on the command line, but ideally it should not be restricted to just one approach.

  • Assess what information we currently gather and have access to (including the release file, console output, and log files)
  • Start to populate a .buildinfo file (a format for this Bill of Materials is defined at https://reproducible-builds.org/docs/jvm/); a rough sketch of what this could look like follows this list
  • Identify which pieces of information we are missing in order to populate a complete .buildinfo file
  • Extend the build scripts to take a .buildinfo file as input for rerunning and reproducing the build (the first iterations of this work do not need to achieve exactly the same builds, but should at least be able to identify the pieces that are not the same)
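As a rough illustration only (not the project's tooling, and with field names that are assumptions loosely shaped after the reproducible-builds.org JVM .buildinfo spec), a minimal sketch of gathering a few build facts and output checksums into a key=value Bill of Materials could look like this:

```python
#!/usr/bin/env python3
"""Sketch: emit a minimal .buildinfo-style Bill of Materials for a build.

Illustrative only; the field names and inputs are assumptions, not a
format this project has settled on.
"""
import hashlib
import platform
import subprocess
import sys
from pathlib import Path


def sha256(path: Path) -> str:
    """Return the SHA-256 of a file, streamed in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def git_sha(repo: Path) -> str:
    """Current commit of a checked-out repository."""
    return subprocess.check_output(
        ["git", "-C", str(repo), "rev-parse", "HEAD"], text=True
    ).strip()


def write_buildinfo(out: Path, build_repo: Path, artifacts: list[Path]) -> None:
    """Write a key=value BOM covering the build host, tooling SHA and outputs."""
    lines = [
        "buildinfo.version=1.0",  # assumed schema marker
        f"build.os.name={platform.system()}",
        f"build.os.arch={platform.machine()}",
        f"source.scm.openjdk-build.sha={git_sha(build_repo)}",
    ]
    for i, artifact in enumerate(artifacts):
        lines.append(f"outputs.{i}.filename={artifact.name}")
        lines.append(f"outputs.{i}.length={artifact.stat().st_size}")
        lines.append(f"outputs.{i}.checksums.sha256={sha256(artifact)}")
    out.write_text("\n".join(lines) + "\n")


if __name__ == "__main__":
    # e.g. buildinfo.py path/to/openjdk-build out.buildinfo build.tar.gz ...
    repo, out, *files = sys.argv[1:]
    write_buildinfo(Path(out), Path(repo), [Path(f) for f in files])
```

Comparing two such files (original build vs. rerun) is then a simple line-by-line diff, which covers the "identify the pieces that are not the same" step in the last bullet.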

Since infrastructure and test also form part of this story, we will identify how to reproduce those aspects of a full release build as well.

Related EPICs and issues:

@smlambert smlambert added the enhancement Issues that enhance the code or documentation of the repo in any way label Mar 9, 2021
@M-Davies M-Davies added compatibility Issues that relate to how our code works with other third party code bases documentation Issues that request updates to our documentation labels Mar 9, 2021
@M-Davies M-Davies added the epic Issues that are large and likely multi-layered features or refactors label Mar 9, 2021
@andrew-m-leonard
Contributor

Current build info doc: #2526

@andrew-m-leonard
Contributor

An interesting conference talk on Reproducible Debian builds: https://debconf17.debconf.org/talks/91/

@andrew-m-leonard
Contributor

andrew-m-leonard commented Mar 16, 2021

An interesting thought comes to mind from reading material around reproducible builds, and it conflicts a bit with how AdoptOpenJDK currently builds and tests:

A) AdoptOpenJDK build and test machines are set up (mainly) via Ansible with the various required dependencies, but not necessarily pinned to the exact same versions; e.g. some yum/deb dependencies are simply the "latest" available at install time. Builds then run on whichever of these nodes match the required "labels".

B) The more rigid "reproducible build" approach, whereby a "raw" node is selected by platform/OS and then, using the .buildinfo, ALL of the required dependencies are installed or downgraded to the exact required levels.

(B) is more strict and precise. To get to (B) we would have to change our "approach"...?

@sxa @smlambert some food for thought?
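To make (B) a bit more concrete, here is a minimal sketch, assuming a hypothetical `build.dependency.<name>=<version>` convention in the .buildinfo (not a field anyone has defined for us yet): a raw node would read the recorded versions and install exactly those, instead of whatever Ansible happened to install last.

```python
#!/usr/bin/env python3
"""Sketch for approach (B): pin build dependencies from a .buildinfo file.

The 'build.dependency.<name>=<version>' keys are an assumption for
illustration; the real field names are still to be decided.
"""
import sys
from pathlib import Path


def read_pinned_deps(buildinfo: Path) -> dict[str, str]:
    """Collect the dependency name/version pairs recorded in the .buildinfo."""
    deps = {}
    for line in buildinfo.read_text().splitlines():
        if line.startswith("build.dependency."):
            key, _, version = line.partition("=")
            deps[key.removeprefix("build.dependency.")] = version
    return deps


if __name__ == "__main__":
    # e.g. pin_deps.py previous.buildinfo
    for name, version in read_pinned_deps(Path(sys.argv[1])).items():
        # On a Debian-based raw node this might become an apt command;
        # other platforms would need their own mapping.
        print(f"apt-get install -y {name}={version}")
```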

@sxa
Member

sxa commented Mar 16, 2021

Not really food for thought... As I've said before, there's a limit to how strict we can easily be on this. We either take the hit of trying to replicate the precise build environment, with all the headaches and side effects that potentially incurs cross-platform, or we restrict the scope and put processes in place to ensure we can rebuild just the last GA level of each release we have in support. Either option takes extra work, and requires people available to prioritise it over other things before we can change anything in this area, and we don't currently have any of those.

@andrew-m-leonard
Contributor

@tellison your thoughts? #2522 (comment)

@aahlenst
Contributor

Having stateful workers is a problem we have regardless of whether we implement reproducible builds or not. It causes a lot of maintenance work and is a security problem. Addressing this alone would improve the health of the project by a lot (and the resource utilization).

Creating predefined images with Ansible and Packer is a solved problem and works well across operating systems (we do it at work for Linux, Windows, and macOS). The big problem we have is that no off-the-shelf offering can spin up immutable instances on demand on all the platforms we support. I want to investigate the options with a group of students (to be found). HashiCorp's Nomad could be an option: it's a good cross-platform workload scheduler that is interoperable with Jenkins, but it would need some extensions (like being able to talk to Parallels Desktop on the Mac). Or we just go down the K8s route, unfortunately with even more unknowns.

As always, solving this needs time or money. The topic is so exotic that I doubt it can be sold to a research funding agency.

@andrew-m-leonard
Contributor

@aahlenst just to make sure I understand what you mean by "stateful workers": that is basically the fact that our build nodes are at a defined configuration/dependency level (i.e. a given "state"), as opposed to "stateless", which would be a raw node with no pre-defined dependency level...?

@aahlenst
Contributor

@andrew-m-leonard Stateful workers means that any change performed by a job remains on the machine forever and is persistent across job runs. A stateless worker has the same state at the beginning of every job; GitHub Actions and AppVeyor work like that.

We have an AppVeyor Enterprise cluster at work that I maintain (I spend 1-2 days per month on it). Every month, I create new images. Packer installs Ubuntu, Windows Server etc. from ISO into a VM using d-i and autounattend.xml. For macOS, we use a preconfigured macOS VM because it's impossible to automate the macOS installer. Packer then configures the VMs using Ansible. The playbooks install a ton of dependencies like runtimes (.NET, AdoptOpenJDK, Python, Ruby, ...), emulators (Android, iPhone), and tooling (Xcode, packer, gh, ...). We know exactly what software is installed on those images and in what versions. You can even download a one-year-old image from Artifactory and examine it.

The images are then uploaded onto the 7 build servers we have. Every time AppVeyor Enterprise receives a job via a GitHub webhook, it provisions a new VM from the requested image and runs the build within it; the build can use the preinstalled tooling or install additional software (it does not make sense to preinstall everything). Afterwards, the VM is thrown away. When the next job arrives, it gets a fresh VM. It's rock solid and almost zero maintenance. Unfortunately, there's no libvirt driver, so we cannot use it for all of AdoptOpenJDK.

@andrew-m-leonard
Contributor

@sxa I think I'm with you there, let's just target our first goal of "reproducing the last supported GA release"; we can get there and then decide where next...? I'm not keen on boiling the ocean!

@andrew-m-leonard
Contributor

@smlambert the end goal of this varies rather a lot depending on who you talk to, but I think that's perhaps not a helpful thing to dwell on. I'm wondering if the best thing to do is to prototype an initial idea, to set us off in the right direction (a rough sketch of the inputs follows after the list). Maybe:

  • Create a "Reproduce-openjdk-pipeline" which takes as parameters the basic SHAs of what to build with:
    • Tooling:
      • openjdk-build SHA
      • ci-jenkins-pipelines SHA
      • openjdk-tests SHA
      • TKG SHA
    • Source:
      • openjdk repo (Adopt hotspot mirror or openj9 ext repo...)
      • openjdk repo SHA

This will no doubt surface issues, which we can raise... things like this come to mind:

  • openj9 trying to pull/use a version of openssl that is no longer on the Windows build node...
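As a starting point, here is a hedged sketch of those inputs (Python rather than the eventual Jenkins pipeline; the repository URLs and parameter names are assumptions) and of the "check everything out at the recorded SHA" step such a pipeline would begin with:

```python
#!/usr/bin/env python3
"""Sketch of the inputs a 'Reproduce-openjdk-pipeline' might take.

Repository URLs and parameter names are illustrative assumptions only.
"""
import subprocess
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ReproduceParams:
    openjdk_build_sha: str
    ci_jenkins_pipelines_sha: str
    openjdk_tests_sha: str
    tkg_sha: str
    openjdk_repo: str      # e.g. the Adopt hotspot mirror or the openj9 ext repo
    openjdk_repo_sha: str


def checkout(url: str, sha: str, dest: Path) -> None:
    """Clone a repository and pin it to the exact commit recorded for the build."""
    subprocess.run(["git", "clone", url, str(dest)], check=True)
    subprocess.run(["git", "-C", str(dest), "checkout", sha], check=True)


def prepare_workspace(params: ReproduceParams, workspace: Path) -> None:
    """Check out tooling and source at the SHAs supplied as pipeline parameters."""
    repos = {
        "openjdk-build": ("https://github.com/AdoptOpenJDK/openjdk-build.git",
                          params.openjdk_build_sha),
        "ci-jenkins-pipelines": ("https://github.com/AdoptOpenJDK/ci-jenkins-pipelines.git",
                                 params.ci_jenkins_pipelines_sha),
        "openjdk-tests": ("https://github.com/AdoptOpenJDK/openjdk-tests.git",
                          params.openjdk_tests_sha),
        "TKG": ("https://github.com/AdoptOpenJDK/TKG.git", params.tkg_sha),
        "openjdk": (params.openjdk_repo, params.openjdk_repo_sha),
    }
    for name, (url, sha) in repos.items():
        checkout(url, sha, workspace / name)
```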

@smlambert
Contributor Author

Agree, prototyping is always a good way to start, then we have something to assess, discuss and improve.

I think you can restrict the initial prototype even further, to just the build piece.

We can prototype the test pipeline piece separately and connect the two later.

@karianna
Contributor

karianna commented Aug 10, 2021

One extra thought: we should make sure we have docs and tools that let users audit this easily, so that a user can run the rebuild and check it themselves.
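A hedged sketch of what that user-facing check could look like (the file names and the whole-archive comparison are assumptions; a real tool would more likely compare contents file by file, e.g. with diffoscope):

```python
#!/usr/bin/env python3
"""Sketch: let a user check whether their local rebuild matches a release.

File names and the whole-archive comparison are illustrative assumptions.
"""
import hashlib
import sys
from pathlib import Path


def sha256(path: Path) -> str:
    """SHA-256 of a file, streamed in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


if __name__ == "__main__":
    # e.g. verify.py released-jdk.tar.gz my-rebuild.tar.gz
    released, rebuilt = Path(sys.argv[1]), Path(sys.argv[2])
    a, b = sha256(released), sha256(rebuilt)
    print(f"released : {a}")
    print(f"rebuilt  : {b}")
    sys.exit(0 if a == b else 1)
```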

@github-actions github-actions bot added the testing Issues that enhance or fix our test suites label Nov 13, 2024