@israbbani
Contributor

For more details about the resource isolation project see #54703.

When starting the head node, move the dashboard API server's subprocesses into the system cgroup. I also updated the integration test and added a helpful error message, because the test will break in the future when a new dashboard module is added.
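
For readers unfamiliar with the mechanism, here is a minimal sketch of what "moving a process into a cgroup" means under cgroup v2: each PID is written to the target cgroup's `cgroup.procs` file. This is not the code from this PR. The function names (`move_pids_to_cgroup`, `pids_in_cgroup`, `check_expected_pids`) and the wording of the error message are hypothetical and only illustrate the idea of a membership check that fails with an actionable message.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def move_pids_to_cgroup(cgroup_dir: Path, pids: Iterable[int]) -> None:
    """Move each PID into the cgroup rooted at cgroup_dir (cgroup v2 layout)."""
    procs_file = cgroup_dir / "cgroup.procs"
    for pid in pids:
        # One PID per write: open, write, and close the file for every PID.
        # Append mode is an assumption that also lets this run on a plain file.
        with open(procs_file, "a") as f:
            f.write(f"{pid}\n")


def pids_in_cgroup(cgroup_dir: Path) -> List[int]:
    """Return the PIDs currently listed in the cgroup's cgroup.procs file."""
    text = (cgroup_dir / "cgroup.procs").read_text()
    return [int(token) for token in text.split()]


def check_expected_pids(cgroup_dir: Path, expected: Iterable[int]) -> None:
    """Raise an AssertionError naming every expected PID that is missing."""
    actual = set(pids_in_cgroup(cgroup_dir))
    missing = sorted(set(expected) - actual)
    if missing:
        # Hypothetical message; the real test's wording is not shown here.
        raise AssertionError(
            f"PIDs {missing} were not found in {cgroup_dir}. If a new "
            "subprocess was added, it may also need to be moved into this "
            "cgroup at startup."
        )


if __name__ == "__main__":
    # Demo against a temporary directory standing in for a real cgroup,
    # so no root privileges or cgroupfs mount are needed.
    fake_cgroup = Path(tempfile.mkdtemp())
    (fake_cgroup / "cgroup.procs").touch()
    move_pids_to_cgroup(fake_cgroup, [101, 202])
    check_expected_pids(fake_cgroup, [101, 202])
    print(pids_in_cgroup(fake_cgroup))
```

On a real host the directory would be a cgroup under the cgroup v2 mount and the write would need sufficient permissions; the temporary directory above only imitates the file layout.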

I ran the integration tests 25 times locally.

(ray2) ubuntu@devbox:~/code/ray2$ python -m pytest -s python/ray/tests/resource_isolation/test_resource_isolation_integration.py --count 25 -x
...
collecting ...
python/ray/tests/resource_isolation/test_resource_isolation_integration.py ✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓ 25% ██▌ 2025-10-17 23:13:51,897 INFO worker.py:1833 -- Connecting to existing Ray cluster at address: 172.31.12.251:6379...
2025-10-17 23:13:51,905 INFO worker.py:2004 -- Connected to Ray cluster. View the dashboard at http://127.0.0.1:8265
python/ray/tests/resource_isolation/test_resource_isolation_integration.py ✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓ 26% ██▋ 2025-10-17 23:13:57,592 INFO worker.py:1833 -- Connecting to existing Ray cluster at address: 172.31.12.251:6379...
2025-10-17 23:13:57,598 INFO worker.py:2004 -- Connected to Ray cluster. View the dashboard at http://127.0.0.1:8265
python/ray/tests/resource_isolation/test_resource_isolation_integration.py ✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓ 98% █████████▊2025-10-17 23:19:45,417 INFO worker.py:2004 -- Started a local Ray instance. View the dashboard at http://127.0.0.1:8265
python/ray/tests/resource_isolation/test_resource_isolation_integration.py ✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓ 99% █████████▉2025-10-17 23:19:50,194 INFO worker.py:2004 -- Started a local Ray instance. View the dashboard at http://127.0.0.1:8265
python/ray/tests/resource_isolation/test_resource_isolation_integration.py ✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓
100% ██████████
Results (366.41s):
100 passed

into the system cgroup

Signed-off-by: irabbani <israbbani@gmail.com>
@israbbani added the core (Issues that should be addressed in Ray Core) and go (add ONLY when ready to merge, run all tests) labels Oct 17, 2025
israbbani and others added 3 commits October 17, 2025 17:16
Signed-off-by: irabbani <israbbani@gmail.com>
… irabbani/cgroups-20

Signed-off-by: irabbani <israbbani@gmail.com>
@israbbani marked this pull request as ready for review October 18, 2025 03:49
@israbbani requested a review from a team as a code owner October 18, 2025 03:49
@edoakes merged commit b467908 into master Oct 20, 2025
6 checks passed
@edoakes deleted the irabbani/cgroups-20 branch October 20, 2025 13:10
xinyuangui2 pushed a commit to xinyuangui2/ray that referenced this pull request Oct 22, 2025
…ystem cgroup (ray-project#57864)

Signed-off-by: irabbani <israbbani@gmail.com>
Signed-off-by: xgui <xgui@anyscale.com>
elliot-barn pushed a commit that referenced this pull request Oct 23, 2025
…ystem cgroup (#57864)

Signed-off-by: irabbani <israbbani@gmail.com>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
…ystem cgroup (ray-project#57864)

Signed-off-by: irabbani <israbbani@gmail.com>
Aydin-ab pushed a commit to Aydin-ab/ray-aydin that referenced this pull request Nov 19, 2025
…ystem cgroup (ray-project#57864)

Signed-off-by: irabbani <israbbani@gmail.com>
Signed-off-by: Aydin Abiar <aydin@anyscale.com>