docker: switch to a bespoke test container #5683

Open
wants to merge 3 commits into
base: master

gdams
Member

@gdams gdams commented Oct 10, 2024

Also moved LIB_DIR and SYSTEM_LIB_DIR inside the current workspace for docker builds, because of permission errors.

Currently using ghcr.io/adoptium/test-containers:ubuntu2204, a lightweight image built loosely off the dockerStatic base images. A PR is incoming to the infra repo to build this regularly, and we can add more base images where appropriate.

Grinder to show this working: https://ci.adoptium.net/view/Test_grinder/job/Grinder/11160/

@smlambert
Contributor

smlambert commented Oct 10, 2024

Removing ADDITIONAL_LABEL values from Grinder run and filling in CLOUD_PROVIDER=azure, since that is how the tests are expected to be launched in general pipeline code:

Ran some additional Grinders, varying test group and JDK version:

sanity.system, JDK21:

special.functional, JDK11:

17:05:11  Running on test-linux-x64-b97844 in /home/adoptopenjdk/workspace/Grinder_testList_3
[Pipeline] {
...
17:06:14  Running tests...
[Pipeline] echo
17:06:14  ITERATION: 1/1
[Pipeline] wrap
17:06:14  $ Xvfb -displayfd 2 -screen 0 1024x768x24 -fbdir /home/adoptopenjdk/workspace/Grinder_testList_3/.xvfb-10-..fbdir4319951883012337092
...
17:06:18  Exception: java.io.IOException: Cannot run program "Xvfb": error=2, No such file or directory
...
12:42:52       [exec] The test in the build_image() function is jacoco
12:42:52       [exec] #####################################################
12:42:52       [exec] INFO:  docker build  --no-cache -t adoptopenjdk-jacoco-test:17-jdk-ubuntu-hotspot-full -f /home/adoptopenjdk/workspace/Grinder/jvmtest/external/jacoco/dockerfile/17/jdk/ubuntu/Dockerfile.hotspot.full /home/adoptopenjdk/workspace/Grinder/jvmtest/external/
12:42:52       [exec] #####################################################
12:42:52       [exec] /home/adoptopenjdk/workspace/Grinder/aqa-tests/TKG/../../jvmtest/external/build_image.sh: line 75: docker: command not found
12:42:52  
12:42:52  BUILD FAILED

@gdams
Member Author

gdams commented Oct 11, 2024

> Exception: java.io.IOException: Cannot run program "Xvfb": error=2, No such file or directory

@ShelleyLambert looking at that job it didn't run in a docker container? I think it's because we need to add the ci.agent.dynamic label to that job?

@gdams
Member Author

gdams commented Oct 11, 2024

Rebuilding https://ci.adoptium.net/view/Test_grinder/job/Grinder/11171/ (PARALLEL=Dynamic, NUM_MACHINES=5)

@smlambert
Contributor

smlambert commented Oct 11, 2024

> @ShelleyLambert looking at that job it didn't run in a docker container? I think it's because we need to add the ci.agent.dynamic label to that job?

Yes, I am well aware, and we should take that machine offline and raise an infra issue.

Looking more closely, it found an idle static machine but still ran on a dynamic machine (test-linux-x64-b97844). Console output from https://ci.adoptium.net/view/Test_grinder/job/Grinder_testList_3/10/console:

17:05:10  Found a total of 30 nodes with the 'ci.role.test&&hw.arch.x86&&sw.os.linux' label
[Pipeline] echo
17:05:10  Found an idle node: test-docker-debian12-x64-4. The program will not start dynamic vm.
[Pipeline] }
[Pipeline] // node
[Pipeline] node
17:05:11  Running on [test-linux-x64-b97844](https://ci.adoptium.net/computer/test%2Dlinux%2Dx64%2Db97844/) in /home/adoptopenjdk/workspace/Grinder_testList_3
[Pipeline] {
[Pipeline] retry
[Pipeline] {
[Pipeline] timeout
17:05:11  Timeout set to expire in 1 hr 0 min

In 'real' pipeline runs, the test pipeline code will first try to send to idle static machines and spin up dynamic ones as needed after that.

As per the current design, we will not be passing in ci.agent.dynamic explicitly when we trigger test pipelines (only updating to set CLOUD_PROVIDER=azure), so I wanted to check that it works as designed.
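A minimal sketch of that selection order, assuming a simplified model of the agent pool. It is not the actual pipeline code: the `Node` type, `pick_agent`, and the `"static"` / `"dynamic"` / `"queue"` return values are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a Jenkins agent; not a real Jenkins API type."""
    name: str
    labels: FrozenSet[str]
    idle: bool


def pick_agent(
    nodes: Iterable[Node],
    required_labels: Iterable[str],
    cloud_provider: Optional[str] = None,
) -> Tuple[str, Optional[str]]:
    """Prefer an idle static node; fall back to a dynamic agent if a cloud is set."""
    required = frozenset(required_labels)
    for node in nodes:
        # A static node qualifies only if it is idle and carries every required label.
        if node.idle and required <= node.labels:
            return ("static", node.name)
    if cloud_provider:
        # No idle static node: ask the configured cloud for a dynamic agent.
        return ("dynamic", cloud_provider)
    # No cloud configured: the job would wait for a static node to free up.
    return ("queue", None)


labels = ["ci.role.test", "hw.arch.x86", "sw.os.linux"]
nodes = [Node("test-docker-debian12-x64-4", frozenset(labels), idle=True)]
print(pick_agent(nodes, labels, cloud_provider="azure"))
# prints ('static', 'test-docker-debian12-x64-4')
```

Under this model, no caller has to pass ci.agent.dynamic explicitly: setting CLOUD_PROVIDER=azure is enough for the fallback branch to be reachable.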

@gdams
Member Author

gdams commented Oct 11, 2024

Right, okay. If you don't expect the dynamic label to be used, I'll have to tweak the existing code slightly. I'll have a play.

@gdams
Member Author

gdams commented Oct 11, 2024

@smlambert I've updated the code so it no longer requires someone to pass the dynamic label to the job explicitly. As long as the cloud is set to Azure, it will default to using a container image.

@smlambert
Contributor

@gdams - you do not need to make the change on L461. I was explaining how the current logic already works, not asking for you to change your PR, which looks fine as it is, I am just running many additional tests to verify that each test group works on these agents (please see what we do on L351).

@gdams
Member Author

gdams commented Oct 12, 2024

> @gdams - you do not need to make the change on L461. I was explaining how the current logic already works, not asking for you to change your PR, which looks fine as it is, I am just running many additional tests to verify that each test group works on these agents (please see what we do on L351).

Reverted, PTAL.

@smlambert
Contributor

A couple of extra Grinder runs to verify:
without CLOUD_PROVIDER: https://ci.adoptium.net/view/Test_grinder/job/Grinder/11320/
with CLOUD_PROVIDER=azure: https://ci.adoptium.net/view/Test_grinder/job/Grinder/11321/

@smlambert
Contributor

smlambert commented Nov 12, 2024

https://ci.adoptium.net/view/Test_grinder/job/Grinder/11321/ fails with:

13:03:02  + ./get.sh -s /home/jenkins/workspace/Grinder/aqa-tests/.. -p x86-64_linux -r nightly -j 21 -i hotspot --clone_openj9 false --tkg_repo https://github.com/AdoptOpenJDK/TKG.git --tkg_branch master
13:03:02  TESTDIR: /home/jenkins/workspace/Grinder/aqa-tests
13:03:02  get jdk binary...
13:03:02  _ENCODE_FILE_NEW=UNTAGGED curl -OLJSks  https://api.adoptium.net/v3/binary/latest/21/ea/linux/x64/jdk/hotspot/normal/adoptium?project=jdk
13:03:05  _ENCODE_FILE_NEW=UNTAGGED curl -OLJSks  https://api.adoptium.net/v3/binary/latest/21/ea/linux/x64/sbom/hotspot/normal/adoptium?project=jdk
13:03:06  _ENCODE_FILE_NEW=UNTAGGED curl -OLJSks  https://api.adoptium.net/v3/binary/latest/21/ea/linux/x64/testimage/hotspot/normal/adoptium?project=jdk
13:03:09  Uncompressing file: OpenJDK21U-jdk_x64_linux_hotspot_21.0.6_2-ea.tar.gz ...
13:03:13  Uncompressing file: OpenJDK21U-testimage_x64_linux_hotspot_21.0.6_2-ea.tar.gz ...
13:03:16  Run /home/jenkins/workspace/Grinder/jdkbinary/j2sdk-image/bin/java -version
13:03:16  =JAVA VERSION OUTPUT BEGIN=
13:03:16  /home/jenkins/workspace/Grinder/jdkbinary/j2sdk-image/bin/java: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /home/jenkins/workspace/Grinder/jdkbinary/j2sdk-image/bin/../lib/libjli.so)
[Pipeline] }

This failure is unrelated to this PR (related to: #5754)
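For triage, the loader error above says that the libc on the agent does not export the GLIBC_2.14 symbol version that libjli.so was linked against. Below is a rough sketch for pulling the required version out of such a log line and comparing it with a host version. The helper names are hypothetical, and the host version in the usage lines is an assumed value for illustration, not a measurement from the failing agent.

```python
import re
from typing import Optional, Tuple

# Matches the dynamic loader message seen in the console output above.
_GLIBC_ERROR = re.compile(r"version `GLIBC_(\d+(?:\.\d+)*)' not found")


def required_glibc(log_line: str) -> Optional[Tuple[int, ...]]:
    """Return the glibc version a binary asked for, or None if the line is unrelated."""
    match = _GLIBC_ERROR.search(log_line)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def host_satisfies(host_version: Tuple[int, ...], required: Tuple[int, ...]) -> bool:
    """Compare numerically, so that 2.14 is newer than 2.9."""
    return host_version >= required


line = "/lib64/libc.so.6: version `GLIBC_2.14' not found (required by libjli.so)"
needed = required_glibc(line)
print(needed)                            # prints (2, 14)
print(host_satisfies((2, 12), needed))   # assumed host version; prints False
```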

@smlambert
Contributor

smlambert commented Nov 13, 2024

I will launch a couple of Grinders to 'exhaust' our list of static nodes, and see the use of dynamic agents shortly.

docker.image('adoptopenjdk/centos7_build_image').pull()
docker.image('adoptopenjdk/centos7_build_image').inside {
// Set dockerimage for azure agent. Fyre has stencil to setup the right environment
docker.image('ghcr.io/adoptium/test-containers:ubuntu2204').pull()
Contributor

Should we extract that URL or the Ubuntu version?

Contributor

@smlambert smlambert Nov 15, 2024

Referencing a set of containers from https://github.com/orgs/adoptium/packages/container/package/test-containers is good from my point of view (they have the test prereqs), and we can build out several to give us a variety of coverage.
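One way the extraction asked about above could look, sketched in Python rather than in the pipeline's own Groovy. The constant and function names are hypothetical. Only the ghcr.io/adoptium/test-containers repository and the ubuntu2204 tag come from this PR, and any other tag is an assumption until that image is actually published.

```python
# Repository and default tag taken from the PR; the names themselves are hypothetical.
TEST_CONTAINER_REPO = "ghcr.io/adoptium/test-containers"
DEFAULT_TEST_CONTAINER_TAG = "ubuntu2204"


def container_image_ref(
    tag: str = DEFAULT_TEST_CONTAINER_TAG,
    repo: str = TEST_CONTAINER_REPO,
) -> str:
    """Build a `repo:tag` image reference from separately configurable parts."""
    # Reject empty tags and tags that would change the meaning of the reference.
    if not tag or any(ch in tag for ch in ":/ "):
        raise ValueError(f"invalid image tag: {tag!r}")
    return f"{repo}:{tag}"


print(container_image_ref())
# prints ghcr.io/adoptium/test-containers:ubuntu2204
```

Keeping the repository and the tag separate would let a job parameter select among several test containers without touching the pull logic.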
