@israbbani israbbani commented Nov 13, 2025

If you make concurrent ray.get requests from the same worker for the same object, you will hit a critical failure. The issue was reported in #58394.

#57911 fixed the bug where multiple ray.get requests from the same worker for different objects would lead to some workers hanging.

The unit test I added fails consistently without the fix:

[ RUN      ] LeaseDependencyManagerTest.TestCancelingMultipleGetRequestsForSameObjectForWorker
[2025-11-14 00:26:02,651 C 3875444 3875444] lease_dependency_manager.cc:160:  An unexpected system state has occurred. You have likely discovered a bug in Ray. Please report this issue at https://github.com/ray-project/ray/issues and we'll work with you to fix it. Check failed: obj_iter != required_objects_.end() 
*** StackTrace Information ***
/home/ubuntu/.cache/bazel/_bazel_ubuntu/022b22f65a4e747315307f4ccee1785a/execroot/io_ray/bazel-out/k8-opt/bin/src/ray/raylet/tests/../../../../_solib_k8/libsrc_Sray_Sutil_Sliblogging.so(_ZN3raylsERSoRKNS_10StackTraceE+0x38) [0x7267bdd08e38] ray::operator<<()
/home/ubuntu/.cache/bazel/_bazel_ubuntu/022b22f65a4e747315307f4ccee1785a/execroot/io_ray/bazel-out/k8-opt/bin/src/ray/raylet/tests/../../../../_solib_k8/libsrc_Sray_Sutil_Sliblogging.so(_ZN3ray6RayLogD1Ev+0x67) [0x7267bdd0ca17] ray::RayLog::~RayLog()
/home/ubuntu/.cache/bazel/_bazel_ubuntu/022b22f65a4e747315307f4ccee1785a/execroot/io_ray/bazel-out/k8-opt/bin/src/ray/raylet/tests/../../../../_solib_k8/libsrc_Sray_Sraylet_Sliblease_Udependency_Umanager.so(_ZN3ray6raylet22LeaseDependencyManager16CancelGetRequestERKNS_8WorkerIDERKl+0x1cf) [0x7267c11b096f] ray::raylet::LeaseDependencyManager::CancelGetRequest()
/home/ubuntu/.cache/bazel/_bazel_ubuntu/022b22f65a4e747315307f4ccee1785a/execroot/io_ray/bazel-out/k8-opt/bin/src/ray/raylet/tests/lease_dependency_manager_test.runfiles/io_ray/src/ray/raylet/tests/lease_dependency_manager_test(+0x200bc) [0x5b4343a100bc] ray::raylet::LeaseDependencyManagerTest_TestCancelingMultipleGetRequestsForSameObjectForWorker_Test::TestBody()
/home/ubuntu/.cache/bazel/_bazel_ubuntu/022b22f65a4e747315307f4ccee1785a/execroot/io_ray/bazel-out/k8-opt/bin/src/ray/raylet/tests/../../../../_solib_k8/libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so(_ZN7testing8internal35HandleExceptionsInMethodIfSupportedINS_4TestEvEET0_PT_MS4_FS3_vEPKc+0x54) [0x7267bdab81d4] testing::internal::HandleExceptionsInMethodIfSupported<>()
/home/ubuntu/.cache/bazel/_bazel_ubuntu/022b22f65a4e747315307f4ccee1785a/execroot/io_ray/bazel-out/k8-opt/bin/src/ray/raylet/tests/../../../../_solib_k8/libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so(_ZN7testing4Test3RunEv+0x1f1) [0x7267bdab8111] testing::Test::Run()
/home/ubuntu/.cache/bazel/_bazel_ubuntu/022b22f65a4e747315307f4ccee1785a/execroot/io_ray/bazel-out/k8-opt/bin/src/ray/raylet/tests/../../../../_solib_k8/libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so(_ZN7testing8TestInfo3RunEv+0x23f) [0x7267bdab938f] testing::TestInfo::Run()
/home/ubuntu/.cache/bazel/_bazel_ubuntu/022b22f65a4e747315307f4ccee1785a/execroot/io_ray/bazel-out/k8-opt/bin/src/ray/raylet/tests/../../../../_solib_k8/libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so(_ZN7testing9TestSuite3RunEv+0x307) [0x7267bdaba207] testing::TestSuite::Run()
/home/ubuntu/.cache/bazel/_bazel_ubuntu/022b22f65a4e747315307f4ccee1785a/execroot/io_ray/bazel-out/k8-opt/bin/src/ray/raylet/tests/../../../../_solib_k8/libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so(_ZN7testing8internal12UnitTestImpl11RunAllTestsEv+0x577) [0x7267bdacb5a7] testing::internal::UnitTestImpl::RunAllTests()
/home/ubuntu/.cache/bazel/_bazel_ubuntu/022b22f65a4e747315307f4ccee1785a/execroot/io_ray/bazel-out/k8-opt/bin/src/ray/raylet/tests/../../../../_solib_k8/libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so(_ZN7testing8internal35HandleExceptionsInMethodIfSupportedINS0_12UnitTestImplEbEET0_PT_MS4_FS3_vEPKc+0x54) [0x7267bdacae64] testing::internal::HandleExceptionsInMethodIfSupported<>()
/home/ubuntu/.cache/bazel/_bazel_ubuntu/022b22f65a4e747315307f4ccee1785a/execroot/io_ray/bazel-out/k8-opt/bin/src/ray/raylet/tests/../../../../_solib_k8/libexternal_Scom_Ugoogle_Ugoogletest_Slibgtest.so(_ZN7testing8UnitTest3RunEv+0x6b) [0x7267bdacacfb] testing::UnitTest::Run()
/home/ubuntu/.cache/bazel/_bazel_ubuntu/022b22f65a4e747315307f4ccee1785a/execroot/io_ray/bazel-out/k8-opt/bin/src/ray/raylet/tests/lease_dependency_manager_test.runfiles/io_ray/src/ray/raylet/tests/lease_dependency_manager_test(main+0x21) [0x5b4343a13c81] main
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7267bd229d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7267bd229e40] __libc_start_main
/home/ubuntu/.cache/bazel/_bazel_ubuntu/022b22f65a4e747315307f4ccee1785a/execroot/io_ray/bazel-out/k8-opt/bin/src/ray/raylet/tests/lease_dependency_manager_test.runfiles/io_ray/src/ray/raylet/tests/lease_dependency_manager_test(+0x1af55) [0x5b4343a0af55] _start

It passes consistently after the fix.
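
For anyone skimming the thread, here is a minimal sketch of why per-worker counting matters when the same worker issues two get requests for the same object. It is not the actual `LeaseDependencyManager` code: `WorkerID` and `ObjectID` are string stand-ins, and `SetTracker` / `CountTracker` are hypothetical names used only for illustration. The failure mode shown for the set-based version is presumably what trips the `obj_iter != required_objects_.end()` check above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-ins for Ray's real ID types.
using WorkerID = std::string;
using ObjectID = std::string;

// Set-based bookkeeping: a worker is either waiting on an object or not,
// so two get requests from the same worker collapse into one entry.
class SetTracker {
 public:
  void AddGet(const ObjectID &obj, const WorkerID &worker) {
    required_[obj].insert(worker);
  }

  // Returns false when the object entry is already gone. In this sketch
  // that stands in for the fatal check failure.
  bool CancelGet(const ObjectID &obj, const WorkerID &worker) {
    auto obj_iter = required_.find(obj);
    if (obj_iter == required_.end()) return false;
    obj_iter->second.erase(worker);
    if (obj_iter->second.empty()) required_.erase(obj_iter);
    return true;
  }

 private:
  std::unordered_map<ObjectID, std::unordered_set<WorkerID>> required_;
};

// Count-based bookkeeping: track how many get requests each worker still
// has outstanding for the object, and drop the entry only at zero.
class CountTracker {
 public:
  void AddGet(const ObjectID &obj, const WorkerID &worker) {
    ++required_[obj][worker];
  }

  bool CancelGet(const ObjectID &obj, const WorkerID &worker) {
    auto obj_iter = required_.find(obj);
    if (obj_iter == required_.end()) return false;
    auto worker_iter = obj_iter->second.find(worker);
    if (worker_iter == obj_iter->second.end()) return false;
    if (--worker_iter->second == 0) obj_iter->second.erase(worker_iter);
    if (obj_iter->second.empty()) required_.erase(obj_iter);
    return true;
  }

 private:
  std::unordered_map<ObjectID, std::unordered_map<WorkerID, int64_t>> required_;
};
```

With `SetTracker`, two `AddGet` calls followed by two `CancelGet` calls for the same worker and object make the second cancel find nothing. With `CountTracker`, both cancels succeed and the entry disappears only after the last one.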

object from the same worker thread-safe.

Signed-off-by: irabbani <irabbani@anyscale.com>
@israbbani israbbani added the go add ONLY when ready to merge, run all tests label Nov 13, 2025
Signed-off-by: irabbani <irabbani@anyscale.com>
- /// object.
- std::unordered_set<WorkerID> dependent_get_requests;
+ /// object and the count of outstanding get_requests per worker.
+ std::unordered_map<WorkerID, int64_t> dependent_get_requests;
Contributor Author

I think this map is redundant with the other bookkeeping inside LeaseDependencyManager. I'm debating refactoring the entire class.

Collaborator

true. do it!

@israbbani israbbani marked this pull request as ready for review November 14, 2025 00:40
@israbbani israbbani requested a review from a team as a code owner November 14, 2025 00:40
@ray-gardener ray-gardener bot added the core Issues that should be addressed in Ray Core label Nov 14, 2025
@edoakes edoakes merged commit 33d6316 into master Nov 14, 2025
6 checks passed
@edoakes edoakes deleted the irabbani/ray-get-3 branch November 14, 2025 17:08
ArturNiederfahrenhorst pushed a commit to ArturNiederfahrenhorst/ray that referenced this pull request Nov 16, 2025
Aydin-ab pushed a commit to Aydin-ab/ray-aydin that referenced this pull request Nov 19, 2025
Signed-off-by: Aydin Abiar <aydin@anyscale.com>
ykdojo pushed a commit to ykdojo/ray that referenced this pull request Nov 27, 2025
Signed-off-by: YK <1811651+ykdojo@users.noreply.github.com>
SheldonTsen pushed a commit to SheldonTsen/ray that referenced this pull request Dec 1, 2025