Description
What happened?
We have recently seen flakiness in a few workflows (e.g. https://github.com/apache/beam/actions/runs/20245962422/job/58125988587, #30525 (comment), https://github.com/apache/beam/actions/runs/19710817239).
When one of these workflows fails, the log shows the following error:
Current runner version: '2.318.0'
...
Error: System.ArgumentOutOfRangeException: Specified argument was out of the range of valid values. (Parameter ''using: node24' is not supported, use 'docker', 'node12', 'node16' or 'node20' instead.')
at GitHub.Runner.Worker.ActionManifestManager.ConvertRuns(IExecutionContext executionContext, TemplateContext templateContext, TemplateToken inputsToken, String fileRelativePath, MappingToken outputs)
at GitHub.Runner.Worker.ActionManifestManager.Load(IExecutionContext executionContext, String manifestFile)
Error: Failed to load gradle/actions/4d9f0ba0025fe599b4ebab900eb7f3a1d93ef4c2/setup-gradle/action.yml
The cause is that support for node24 was added in runner version 2.327.1 (https://github.com/actions/runner/releases/tag/v2.327.1), so an action that declares 'using: node24' fails to load when an older GitHub runner is used.
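As a rough way to triage affected runs, the check below pulls the runner version out of a job log and compares it with 2.327.1. It is only a sketch: the function names are made up for illustration, and the log format is assumed to match the "Current runner version" line quoted above.

```python
import re

# Minimum runner version with node24 support, per the release linked above.
MIN_NODE24_RUNNER = (2, 327, 1)


def parse_version(text):
    """Turn a dotted version string such as '2.318.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def runner_version_from_log(log_text):
    """Return the version from a "Current runner version: 'X.Y.Z'" line, or None."""
    match = re.search(r"Current runner version: '(\d+(?:\.\d+)*)'", log_text)
    return match.group(1) if match else None


def supports_node24(runner_version):
    """True if the given runner version is at least MIN_NODE24_RUNNER."""
    return parse_version(runner_version) >= MIN_NODE24_RUNNER


log = "Current runner version: '2.318.0'\n..."
version = runner_version_from_log(log)
print(version, supports_node24(version))  # 2.318.0 False
```

Tuples of integers are compared instead of raw strings so that, for example, 2.1000.0 would sort after 2.327.1.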
Looking at the log, I see that there is actually an update step when the runner starts. If it succeeds, we see:
However, when the GitHub workflow runs before the runner has finished updating, we see:
After some digging, I found that the ARC images for apache-beam-testing are built by a GitHub workflow (https://github.com/apache/beam/actions/workflows/build_runner_image.yml) and stored at https://pantheon.corp.google.com/artifacts/docker/apache-beam-testing/us-central1/beam-github-actions/beam-arc-runner.
However, we are still pinning an old image version (3063b55757509dad1c14751c9f2aa5905826d9a0), which was created on Aug 14, 2024. I verified that the runner version inside this Docker image is 2.318.0.
runner_image = "us-central1-docker.pkg.dev/apache-beam-testing/beam-github-actions/beam-arc-runner:3063b55757509dad1c14751c9f2aa5905826d9a0"
I think we should update our config to use a newer version of the image. Thoughts?
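To make the pin easier to audit, the image reference can be split into its repository and tag so the tag can be compared against newer builds. This is a minimal sketch; split_image_ref is a hypothetical helper and not part of the Beam infrastructure code.

```python
def split_image_ref(ref):
    """Split 'registry/path/name:tag' into (repository, tag).

    Returns (ref, None) when the reference carries no tag.
    """
    repository, sep, tag = ref.rpartition(":")
    if not sep or "/" in tag:
        # No colon at all, or the last colon belongs to a registry port.
        return ref, None
    return repository, tag


# The value currently pinned in the config quoted above.
PINNED = (
    "us-central1-docker.pkg.dev/apache-beam-testing/"
    "beam-github-actions/beam-arc-runner:"
    "3063b55757509dad1c14751c9f2aa5905826d9a0"
)

repository, tag = split_image_ref(PINNED)
print(repository)
print(tag)
```

Bumping the pin would then amount to replacing the tag with that of a newer image built by the workflow linked above.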
Issue Failure
Failure: Test is flaky
Issue Priority
Priority: 1 (unhealthy code / failing or flaky postcommit so we cannot be sure the product is healthy)
Issue Components
- Component: Python SDK
- Component: Java SDK
- Component: Go SDK
- Component: Typescript SDK
- Component: IO connector
- Component: Beam YAML
- Component: Beam examples
- Component: Beam playground
- Component: Beam katas
- Component: Website
- Component: Infrastructure
- Component: Spark Runner
- Component: Flink Runner
- Component: Samza Runner
- Component: Twister2 Runner
- Component: Hazelcast Jet Runner
- Component: Google Cloud Dataflow Runner