@dependabot dependabot bot commented on behalf of github Oct 28, 2022

Bumps raydp-nightly from 2022.6.30.dev1 to 2022.10.28.dev1.

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [raydp-nightly](https://github.com/oap-project/raydp) from 2022.6.30.dev1 to 2022.10.28.dev1.
- [Release notes](https://github.com/oap-project/raydp/releases)
- [Commits](https://github.com/oap-project/raydp/commits)

---
updated-dependencies:
- dependency-name: raydp-nightly
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

dependabot bot commented on behalf of github Oct 28, 2022

Dependabot tried to add @scv119 and @clarkzinzow as reviewers to this PR, but received the following error from GitHub:

POST https://api.github.com/repos/bveeramani/ray/pulls/2/requested_reviewers: 422 - Reviews may only be requested from collaborators. One or more of the users or teams you specified is not a collaborator of the bveeramani/ray repository. // See: https://docs.github.com/rest/reference/pulls#request-reviewers-for-a-pull-request

@dependabot dependabot bot added the dependencies Pull requests that update a dependency file label Oct 28, 2022

dependabot bot commented on behalf of github Nov 5, 2022

Superseded by #10.

@dependabot dependabot bot closed this Nov 5, 2022
@dependabot dependabot bot deleted the dependabot/pip/python/requirements/data_processing/raydp-nightly-2022.10.28.dev1 branch November 5, 2022 07:02
bveeramani pushed a commit that referenced this pull request Apr 27, 2023
Why are these changes needed?

The current theory is as follows.

The pubsub io service is created and run inside the GcsServer. That means that if the pubsub io service is accessed after the GcsServer has been destroyed, it will segfault.
Currently, upon teardown, calling rpc::DrainAndResetExecutor recreates the Executor thread pool.
Upon teardown, the following sequence segfaults: DrainAndResetExecutor -> GcsServer's internal pubsub posts a new SendReply to the newly created thread pool -> GcsServer.reset -> pubsub io service is destroyed -> SendReply is invoked from the newly created thread pool.
NOTE: if you see the failure, the segfault comes from the pubsub service:

#2 0x7f92034d9129 in ray::rpc::ServerCallImpl<ray::rpc::InternalPubSubGcsServiceHandler, ray::rpc::GcsSubscriberPollRequest, ray::rpc::GcsSubscriberPollReply>::HandleRequestImpl()::'lambda'(ray::Status, std::__1::function<void ()>, std::__1::function<void ()>)::operator()(ray::Status, std::__1::function<void ()>, std::__1::function<void ()>) const::'lambda'()::operator()() const /proc/self/cwd/bazel-out/k8-opt/bin/_virtual_includes/grpc_common_lib/ray/rpc/server_call.h:212:48
As a fix, I only drain the thread pool, and then reset it after all operations are fully cleaned up (only from tests). I think there's no need to reset for regular process termination such as raylet, gcs, and core workers.
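
To illustrate the ordering described above, here is a minimal, self-contained sketch. It is not Ray's code: SketchExecutor, SketchTeardown, and TeardownResult are hypothetical names, and the assumption that a drained executor rejects new work is mine, not something the commit states. It only shows the idea of draining first, destroying the owning state second, and recreating the pool last (if at all).

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical single-threaded executor; NOT Ray's actual executor.
class SketchExecutor {
 public:
  SketchExecutor() : worker_([this] { Run(); }) {}
  ~SketchExecutor() { Drain(); }

  // Returns false once the executor has been drained (an assumption of this
  // sketch), so late work is dropped instead of running against freed state.
  bool Post(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      if (stopped_) return false;
      tasks_.push(std::move(task));
    }
    cv_.notify_one();
    return true;
  }

  // Runs everything already queued, then stops the worker thread.
  // Safe to call more than once from the same thread.
  void Drain() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      stopped_ = true;
    }
    cv_.notify_one();
    if (worker_.joinable()) worker_.join();
  }

 private:
  void Run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return stopped_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // stopped and nothing left to run
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();
    }
  }

  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  bool stopped_ = false;
  std::thread worker_;  // declared last so the members above exist before it starts
};

struct TeardownResult {
  int replies_run;          // replies that completed before the drain returned
  bool late_post_accepted;  // whether a post after the drain was accepted
};

// Drain first, destroy the owner second, recreate the pool last (tests only).
TeardownResult SketchTeardown() {
  auto executor = std::make_unique<SketchExecutor>();
  auto server_state = std::make_unique<int>(0);  // stands in for server-owned state
  int *state = server_state.get();

  for (int i = 0; i < 3; ++i) {
    executor->Post([state] { ++*state; });  // stands in for a SendReply callback
  }
  executor->Drain();  // queued replies finish while server_state is still alive

  // A reply posted after the drain is rejected, so it can never run later.
  bool late = executor->Post([state] { ++*state; });

  int ran = *server_state;
  server_state.reset();  // safe: no pending task can touch it anymore
  executor.reset();      // a test could construct a fresh executor after this point
  return {ran, late};
}
```

Calling SketchTeardown() in this sketch runs the three queued callbacks before the state is released and drops the late one. The design point it mirrors is that no fresh thread pool exists between the drain and the destruction of the state the callbacks depend on.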

Related issue number

Closes ray-project#34344

Signed-off-by: SangBin Cho <rkooo567@gmail.com>