[CI] Speed up wheel commit validation check by 100x #33434

Merged
jjyao merged 4 commits into ray-project:master on Mar 20, 2023

can-anyscale
Collaborator

@can-anyscale can-anyscale commented Mar 18, 2023

Why are these changes needed?

Speed up the wheel commit validation check by 100x. This will hopefully also alleviate, if not eliminate, the 'Observed wheel commit () is not expected' issue (#32156) that has been creeping into many CI/CD builds in our pipeline.

The existing code pipes the full contents of a rather large file (>50MB) into grep. A pipe, however, has a buffer limit, which by default is measured in kilobytes (https://man7.org/linux/man-pages/man7/pipe.7.html), so what we look for might not be there. We can fix this by telling unzip the exact file we are looking for. That file is pretty small, so we should not hit the buffer limit.

You might notice that other surprises can still happen with this fix (e.g. many files that match ^__commit__). This sanity check was added two years ago by our veteran Kai (234b015) to catch stale artifacts from previous builds or race conditions between builds. Further investigation into how Buildkite agent multi-tenancy is set up might or might not simplify this logic further.
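
The multiple-match caveat can be illustrated without a wheel. The sketch below feeds the same grep/awk stage two matching lines, as would happen if more than one extracted file defined `__commit__`. The `parse_commit` helper name and the sample commit values are made up for illustration and are not part of ci/ci.sh; the count check is one possible guard, not something this PR adds.

```shell
# Hypothetical helper mirroring the grep/awk stage of the check.
parse_commit() {
  grep '^__commit__' | awk -F'"' '{print $2}'
}

# Simulate two files that both define __commit__ (made-up values).
two_matches=$(printf '__commit__ = "aaa1111"\n__commit__ = "bbb2222"\n' | parse_commit)

# Count the non-empty lines that came back.
match_count=$(printf '%s\n' "$two_matches" | grep -c .)

echo "matches: ${match_count}"
if [ "${match_count}" -ne 1 ]; then
  # A two-line value can never equal a single expected commit string.
  echo "ambiguous commit string; refusing to compare"
fi
```

With more than one match, the captured value spans several lines, so a plain string comparison against the expected commit fails even when every line is correct.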

Signed-off-by: Cuong Nguyen <can@anyscale.com>

Related issue number

#32156

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • [] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • CI Tests

@aslonnie
Collaborator

A Linux pipe is a streaming setup, right? Why would the buffer size matter?
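
A quick way to check the streaming point, as a minimal sketch: push about 1.4 MB of filler text through a pipe (well above the default 64 KiB pipe capacity on Linux) before the line of interest, and see whether grep still finds it. The `emit_stream` helper, the filler text, and the commit value `abc1234` are made up. This only shows that a pipe's capacity does not cap how much data can flow through it; it says nothing about what actually went wrong in CI.

```shell
# Hypothetical generator: ~1.4 MB of filler lines, then one __commit__ line.
emit_stream() {
  awk 'BEGIN { for (i = 0; i < 200000; i++) print "filler" }'
  printf '__commit__ = "%s"\n' "$1"
}

# Same grep/awk stage as the CI check, reading everything from the pipe.
found=$(emit_stream abc1234 | grep '^__commit__' | awk -F'"' '{print $2}')
echo "found: ${found}"
```

The writer simply blocks whenever the pipe is full and resumes once the reader drains it, so the final line arrives regardless of how much data precedes it.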

@aslonnie
Collaborator

Are you able to reproduce the issue and find the root cause?

@can-anyscale
Collaborator Author

@aslonnie good point, and you're absolutely correct: stdin is a stream. This is purely a theory, but my best guess is that streaming such a large stdin makes the buffering and grep unpredictable, and hopefully there is no harm in reducing the amount of data a bit.

I can find and download the problematic wheel (e.g. see the artifacts and download cp39 in https://buildkite.com/ray-project/oss-ci-build-branch/builds/2791#0186ee52-dbab-4169-839c-f8eab80791be), but running the same grep command locally returns a proper commit. So the wheel is OK, but the empty commit result is not reproducible.

unzip -p ray-3.0.0.dev0-cp39-cp39-manylinux2014_aarch64.whl | grep "^__commit__" | awk '-F"' '{print $2}'
03a9d21

unzip -p ray-3.0.0.dev0-cp39-cp39-manylinux2014_aarch64.whl "*/ray/__init__.py" | grep "^__commit__" | awk '-F"' '{print $2}'
03a9d21

The second command is also much (100x) faster
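
For anyone who wants to repeat the comparison, here is a minimal sketch that wraps both variants in helpers. The function names and the `WHEEL` variable are hypothetical; the pipelines are the two approaches from this PR, using the `*ray/__init__.py` member pattern from the ci/ci.sh change. It assumes unzip, grep, and awk are on the PATH.

```shell
# Hypothetical helpers; only the pipelines themselves come from this PR.

# Old approach: stream every member of the wheel through grep.
commit_from_all_members() {
  unzip -p "$1" | grep '^__commit__' | awk -F'"' '{print $2}'
}

# New approach: ask unzip for only the member that defines __commit__.
commit_from_init_py() {
  unzip -p "$1" '*ray/__init__.py' | grep '^__commit__' | awk -F'"' '{print $2}'
}

# WHEEL is an assumed variable pointing at a downloaded wheel file.
if [ -n "${WHEEL:-}" ] && [ -f "${WHEEL}" ]; then
  echo "all members: $(commit_from_all_members "${WHEEL}")"
  echo "init only:   $(commit_from_init_py "${WHEEL}")"
fi
```

In bash, prefixing each helper call with the `time` keyword gives a rough wall-clock comparison; the actual speedup will depend on the wheel size and the machine.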

Member

@cadedaniel cadedaniel left a comment

I'm sus of the hypothesis that large files would cause grep to fail. But the speedup is great 😀 Maybe change the PR title to something like "speed up commit extraction from wheel"?

ci/ci.sh Outdated
@@ -415,7 +415,7 @@ validate_wheels_commit_str() {
       continue
     fi

-    WHL_COMMIT=$(unzip -p "$whl" | grep "^__commit__" | awk -F'"' '{print $2}')
+    WHL_COMMIT=$(unzip -p "$whl" "*ray/__init__.py" | grep "^__commit__" | awk -F'"' '{print $2}')

     if [ "${WHL_COMMIT}" != "${EXPECTED_COMMIT}" ]; then
       echo "Error: Observed wheel commit (${WHL_COMMIT}) is not expected commit (${EXPECTED_COMMIT}). Aborting."
Member

Can we add a log to print which wheel file it came from?

Collaborator Author

Absolutely!
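
One possible shape for such a log line, as a sketch only: the helper name `check_wheel_commit` and the message wording are assumptions, not the code that was merged. The wheel path in the example call is the cp39 wheel name mentioned earlier in this thread, under an assumed `.whl/` directory.

```shell
# Hypothetical helper: compare commits and name the wheel in the message.
# Returns 0 on match, 1 on mismatch.
check_wheel_commit() {
  whl=$1
  observed=$2
  expected=$3
  if [ "${observed}" != "${expected}" ]; then
    # ${whl##*/} strips the directory part, leaving the wheel file name.
    echo "Error: wheel ${whl##*/}: observed commit (${observed}) is not expected commit (${expected}). Aborting."
    return 1
  fi
  echo "OK: wheel ${whl##*/} has commit ${observed}."
}

# Example call with an empty observed commit, the failure mode from #32156.
check_wheel_commit ".whl/ray-3.0.0.dev0-cp39-cp39-manylinux2014_aarch64.whl" "" "03a9d21" \
  || echo "validation failed"
```

Including the file name makes it possible to tell from the CI log alone which wheel produced the empty or stale commit.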

@can-anyscale can-anyscale changed the title Fix 'Observed wheel commit () is not expected' issue Speed up wheel commit validation check by 100x Mar 20, 2023
@cadedaniel
Member

cc @jjyao

@cadedaniel cadedaniel changed the title Speed up wheel commit validation check by 100x [CI] Speed up wheel commit validation check by 100x Mar 20, 2023
@jjyao jjyao merged commit 52f0ee8 into ray-project:master Mar 20, 2023
@can-anyscale
Collaborator Author

w00h00 thank you both!

edoakes pushed a commit to edoakes/ray that referenced this pull request Mar 22, 2023
clarng pushed a commit to clarng/ray that referenced this pull request Mar 23, 2023
elliottower pushed a commit to elliottower/ray that referenced this pull request Apr 22, 2023
ProjectsByJackHe pushed a commit to ProjectsByJackHe/ray that referenced this pull request May 4, 2023