[CI][C++] Recurrent failures downloading ORC from download.apache.org #44985

Closed
pitrou opened this issue Dec 10, 2024 · 3 comments

Comments

@pitrou
Member

pitrou commented Dec 10, 2024

Describe the bug, including details regarding any error messages, version, and platform.

We have recently started to see very frequent download errors for the ORC library. Example:
https://github.com/apache/arrow/actions/runs/12245621761/job/34159883827

-- Downloading...
   dst='/Users/runner/work/arrow/arrow/build/cpp/_deps/orc-build/orc-format_ep-prefix/src/orc-format-1.0.0.tar.gz'
   timeout='none'
   inactivity timeout='none'
-- Using src='https://archive.apache.org/dist/orc/orc-format-1.0.0/orc-format-1.0.0.tar.gz'
-- Retrying...
-- Using src='https://archive.apache.org/dist/orc/orc-format-1.0.0/orc-format-1.0.0.tar.gz'
-- Retry after 5 seconds (attempt #2) ...
-- Using src='https://archive.apache.org/dist/orc/orc-format-1.0.0/orc-format-1.0.0.tar.gz'
-- Retry after 5 seconds (attempt #3) ...
-- Using src='https://archive.apache.org/dist/orc/orc-format-1.0.0/orc-format-1.0.0.tar.gz'
-- Retry after 15 seconds (attempt #4) ...
-- Using src='https://archive.apache.org/dist/orc/orc-format-1.0.0/orc-format-1.0.0.tar.gz'
-- Retry after 60 seconds (attempt #5) ...
-- Using src='https://archive.apache.org/dist/orc/orc-format-1.0.0/orc-format-1.0.0.tar.gz'
CMake Error at orc-format_ep-stamp/download-orc-format_ep.cmake:163 (message):
  Each download failed!

    error: downloading 'https://archive.apache.org/dist/orc/orc-format-1.0.0/orc-format-1.0.0.tar.gz' failed
          status_code: 28
          status_string: "Timeout was reached"
          log:
          --- LOG BEGIN ---
          Host archive.apache.org:443 was resolved.

https://archive.apache.org/ indexes all past releases but discourages heavy use. We're probably hitting a rate limiter and being temporarily banned because of that.

Component(s)

C++, Continuous Integration

pitrou added a commit to pitrou/arrow that referenced this issue Dec 10, 2024
pitrou added a commit that referenced this issue Dec 10, 2024
…4977)

### Rationale for this change

https://archive.apache.org/ is not suitable as the primary download location for CI builds: it enforces strict rate limits, and downloads can time out once the limit is reached.

### What changes are included in this PR?

Use the [recommended download addresses](https://infra.apache.org/release-download-pages.html#download-page) and fall back on the ASF CDN.
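As an illustration only (this is not the code from the PR, which changes the CMake build), the "preferred address first, CDN as fallback" idea can be sketched in Python. The two URL templates below (`closer.lua` with `action=download`, and `dlcdn.apache.org`) are assumptions about what the recommended addresses and the ASF CDN look like, and the helper names `candidate_urls`, `http_fetch`, and `download_with_fallback` are hypothetical.

```python
import urllib.request
from typing import Callable, Iterable, List, Tuple


def candidate_urls(path: str) -> List[str]:
    """Return download URLs for an ASF artifact, most preferred first.

    Both URL templates are assumptions made for this sketch.
    """
    path = path.lstrip("/")
    return [
        # Assumed form of the mirror-redirector address.
        f"https://www.apache.org/dyn/closer.lua/{path}?action=download",
        # Assumed form of the ASF CDN address, used as the fallback.
        f"https://dlcdn.apache.org/{path}",
    ]


def http_fetch(url: str, timeout: float = 30.0) -> bytes:
    """Download a URL with an explicit timeout and return its body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def download_with_fallback(
    urls: Iterable[str],
    fetch: Callable[[str], bytes] = http_fetch,
) -> Tuple[str, bytes]:
    """Try each URL in order; return (url, data) for the first success."""
    errors = []
    for url in urls:
        try:
            return url, fetch(url)
        except OSError as exc:
            # urllib's URLError and socket timeouts are OSError subclasses.
            errors.append(f"{url}: {exc}")
    raise RuntimeError("each download failed: " + "; ".join(errors))
```

Passing `fetch` as a parameter keeps the fallback logic testable without network access. A real CI setup would more likely express the same ordering as a list of URLs in the build configuration.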

### Are these changes tested?

Yes, by CI.

### Are there any user-facing changes?

No.

* GitHub Issue: #44985

Authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
@pitrou pitrou added this to the 19.0.0 milestone Dec 10, 2024
@pitrou
Member Author

pitrou commented Dec 10, 2024

Issue resolved by pull request 44977
#44977

@pitrou pitrou closed this as completed Dec 10, 2024
@NickCrews
Contributor

NickCrews commented Dec 14, 2024

@pitrou
Member Author

pitrou commented Dec 14, 2024

@NickCrews Yes, this will be fully solved with the next ORC release (see apache/orc#1830)
