Member

@dongjoon-hyun dongjoon-hyun commented Nov 13, 2024

What changes were proposed in this pull request?

This PR aims to download the Maven checksum file from the Apache mirror host (`www.apache.org/dyn/closer.lua`) instead of `archive.apache.org`.

Why are the changes needed?

Currently, Apache Spark CI is flaky because the checksum download from archive.apache.org fails, as in the following log. The repeated connection attempts took over 9 minutes in total before the job eventually failed.

exec: curl --retry 3 --silent --show-error -L https://www.apache.org/dyn/closer.lua/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz?action=download
exec: curl --retry 3 --silent --show-error -L https://archive.apache.org/dist/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz.sha512
curl: (28) Failed to connect to archive.apache.org port 443 after 135199 ms: Connection timed out
curl: (28) Failed to connect to archive.apache.org port 443 after 134166 ms: Connection timed out
curl: (28) Failed to connect to archive.apache.org port 443 after 135213 ms: Connection timed out
curl: (28) Failed to connect to archive.apache.org port 443 after 135260 ms: Connection timed out
Verifying checksum from /home/runner/work/spark/spark/build/apache-maven-3.9.9-bin.tar.gz.sha512
shasum: /home/runner/work/spark/spark/build/apache-maven-3.9.9-bin.tar.gz.sha512: no properly formatted SHA checksum lines found
Bad checksum from https://archive.apache.org/dist/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz.sha512
Error: Process completed with exit code 2.
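The `shasum` message in the log above is most likely a side effect of the failed download: after every `curl` attempt timed out, the `.sha512` file had no usable content. A minimal sketch of a guard that detects this case before verification follows. The helper name `looks_like_sha512` is hypothetical and not part of `build/mvn`, and the sketch assumes the checksum file starts with a bare 128-hex-digit digest.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from build/mvn): succeeds only if the file is
# non-empty and starts with a 128-hex-digit SHA-512 digest.
looks_like_sha512() {
  local file="$1"
  [ -s "${file}" ] && grep -Eq '^[0-9a-fA-F]{128}([[:space:]]|$)' "${file}"
}

# Simulate a download that wrote nothing, as in the timed-out CI run.
tmp="$(mktemp)"
: > "${tmp}"
if ! looks_like_sha512 "${tmp}"; then
  echo "checksum file is empty or malformed; the download likely failed"
fi
rm -f "${tmp}"
```

A check like this would only make the failure message clearer. It does not remove the dependency on a reachable checksum host, which is what this PR addresses.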

BEFORE

$ build/mvn clean
exec: curl --retry 3 --silent --show-error -L https://www.apache.org/dyn/closer.lua/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz?action=download
exec: curl --retry 3 --silent --show-error -L https://archive.apache.org/dist/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz.sha512

AFTER

$ build/mvn clean
exec: curl --retry 3 --silent --show-error -L https://www.apache.org/dyn/closer.lua/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz?action=download
exec: curl --retry 3 --silent --show-error -L https://www.apache.org/dyn/closer.lua/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz.sha512?action=download
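The AFTER output shows both the tarball and its checksum being fetched through the same `closer.lua` endpoint with `?action=download` appended. As a rough sketch of how such URLs could be derived from a single version string: the function names `mirror_url` and `maven_urls` and the `APACHE_MIRROR` override are assumptions made for illustration, not necessarily what `build/mvn` uses.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only.

# Build a closer.lua download URL for a path relative to the mirror root.
# APACHE_MIRROR is an assumed override; the default matches the URLs above.
mirror_url() {
  local base="${APACHE_MIRROR:-https://www.apache.org/dyn/closer.lua}"
  printf '%s/%s?action=download\n' "${base}" "$1"
}

# Print the tarball URL and the checksum URL for a given Maven version.
maven_urls() {
  local version="$1"
  local tarball="apache-maven-${version}-bin.tar.gz"
  local dir="maven/maven-3/${version}/binaries"
  mirror_url "${dir}/${tarball}"
  mirror_url "${dir}/${tarball}.sha512"
}

maven_urls "3.9.9"
```

With `APACHE_MIRROR` unset, the two printed lines should match the URLs in the AFTER output. Deriving both from one helper keeps the artifact and its checksum on the same host, so neither request depends on archive.apache.org.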

Does this PR introduce any user-facing change?

No, this is a dev-only change.

How was this patch tested?

Pass the CIs.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the BUILD label Nov 13, 2024
@dongjoon-hyun
Member Author

Could you review this PR, @huaxingao?

Contributor

@huaxingao huaxingao left a comment

LGTM. Thanks @dongjoon-hyun

@dongjoon-hyun
Member Author

Thank you, @huaxingao. Merged to master/3.5.

dongjoon-hyun added a commit that referenced this pull request Nov 13, 2024
### What changes were proposed in this pull request?

This PR aims to use `mirror host` instead of `archive.apache.org`.

### Why are the changes needed?

Currently, Apache Spark CI is flaky due to the checksum download failure like the following. It took over 9 minutes and failed eventually.

- https://github.com/apache/spark/actions/runs/11818847971/job/32927380452
- https://github.com/apache/spark/actions/runs/11818847971/job/32927382179
```
exec: curl --retry 3 --silent --show-error -L https://www.apache.org/dyn/closer.lua/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz?action=download
exec: curl --retry 3 --silent --show-error -L https://archive.apache.org/dist/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz.sha512
curl: (28) Failed to connect to archive.apache.org port 443 after 135199 ms: Connection timed out
curl: (28) Failed to connect to archive.apache.org port 443 after 134166 ms: Connection timed out
curl: (28) Failed to connect to archive.apache.org port 443 after 135213 ms: Connection timed out
curl: (28) Failed to connect to archive.apache.org port 443 after 135260 ms: Connection timed out
Verifying checksum from /home/runner/work/spark/spark/build/apache-maven-3.9.9-bin.tar.gz.sha512
shasum: /home/runner/work/spark/spark/build/apache-maven-3.9.9-bin.tar.gz.sha512: no properly formatted SHA checksum lines found
Bad checksum from https://archive.apache.org/dist/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz.sha512
Error: Process completed with exit code 2.
```

**BEFORE**
```
$ build/mvn clean
exec: curl --retry 3 --silent --show-error -L https://www.apache.org/dyn/closer.lua/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz?action=download
exec: curl --retry 3 --silent --show-error -L https://archive.apache.org/dist/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz.sha512
```

**AFTER**
```
$ build/mvn clean
exec: curl --retry 3 --silent --show-error -L https://www.apache.org/dyn/closer.lua/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz?action=download
exec: curl --retry 3 --silent --show-error -L https://www.apache.org/dyn/closer.lua/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.tar.gz.sha512?action=download
```

### Does this PR introduce _any_ user-facing change?

No, this is a dev-only change.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #48836 from dongjoon-hyun/SPARK-50300.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 5cc60f4)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun dongjoon-hyun deleted the SPARK-50300 branch November 13, 2024 22:59