
[Object Spilling] Plasma store uses more shared memory than object_store_memory sometimes when spilling #14182

Closed
2 tasks
rkooo567 opened this issue Feb 18, 2021 · 6 comments
Assignees
Labels
bug Something that is supposed to be working; but isn't P1 Issue that should be fixed within a few weeks
Milestone

Comments

@rkooo567
Contributor

rkooo567 commented Feb 18, 2021

What is the problem?

It looks like sometimes (when there is huge memory pressure) the plasma store uses more shared memory than object_store_memory, which causes a SIGBUS. For example, I ran a stressful spilling workload and received a SIGBUS while my /dev/shm usage was 120GB and the object store memory limit was 80GB. When I killed the raylet, all of that memory was freed.

Reproduction (REQUIRED)

Please provide a short code snippet (less than 50 lines if possible) that can be copy-pasted to reproduce the issue. The snippet should have no external library dependencies (i.e., use fake or mock data / environments):

No reproduction now, but it should be relatively easy to create one.
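Not a verified repro, but a rough sketch of the kind of spilling stress workload described above; the object sizes and the object_store_memory limit here are illustrative placeholders, not values from the original run:

```python
# Hypothetical sketch of a spilling stress workload (sizes are illustrative).
import numpy as np
import ray

# Small object store so spilling starts quickly; the real runs used ~80GB.
ray.init(object_store_memory=1 * 1024**3)

refs = []
for _ in range(20):
    # ~512 MiB per object; keeping the refs alive forces objects to spill
    # once the plasma store is full.
    refs.append(ray.put(np.zeros(512 * 1024**2, dtype=np.uint8)))

# Reading them back restores spilled objects and keeps memory pressure high.
total = sum(ray.get(r).nbytes for r in refs)
print("total object bytes:", total)
```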

  • I have verified my script runs in a clean environment and reproduces the issue.
  • I have verified the issue also occurs with the latest wheels.
@rkooo567 rkooo567 added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Feb 18, 2021
@rkooo567 rkooo567 added this to the IO Bugs milestone Feb 18, 2021
@rkooo567 rkooo567 added P2 Important issue, but not time-critical and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Feb 18, 2021
@rkooo567 rkooo567 changed the title [Object Spilling] Plasma store uses more shared memory than object_store_memory [Object Spilling] Plasma store uses more shared memory than object_store_memory sometimes when spilling Feb 18, 2021
@ericl ericl added P1 Issue that should be fixed within a few weeks and removed P2 Important issue, but not time-critical labels May 3, 2021
@ericl ericl self-assigned this May 3, 2021
@ericl ericl modified the milestones: IO Bugs, Core Bugs May 3, 2021
@ericl
Contributor

ericl commented May 4, 2021

I can reproduce this pretty easily by writing big files to /dev/shm (cat /dev/zero > /dev/shm/file) and then trying to put more objects in Ray. I think the underlying issue is that we are not catching a failed allocation correctly, triggering a SIGBUS somewhere in plasma's memory allocator (cc @suquark).
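A sketch of that reproduction path, assuming a single-node Linux setup where /dev/shm backs the plasma store; the filler and object sizes are placeholders to adapt to the machine:

```python
# Sketch: exhaust /dev/shm outside of Ray, then keep putting objects.
# (Same idea as `cat /dev/zero > /dev/shm/file` followed by ray.put calls.)
import numpy as np
import ray

# Fill most of /dev/shm with an unrelated file, written in chunks.
chunk = b"\0" * (64 * 1024**2)
with open("/dev/shm/filler", "wb") as f:
    for _ in range(64):  # ~4 GiB total; adjust to the size of /dev/shm
        f.write(chunk)

ray.init()
# Keep allocating objects; on the affected versions this eventually hit
# SIGBUS inside the plasma allocator rather than a clean allocation failure.
refs = [ray.put(np.zeros(256 * 1024**2, dtype=np.uint8)) for _ in range(32)]
```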

@ericl
Contributor

ericl commented May 6, 2021

It seems the issue here is pretty fundamental to the way we are using mmap. We are creating a file with unallocated pages in /dev/shm (to avoid immediately using memory). However, this means the application can get a SIGBUS at any point if there are no more pages allocatable in /dev/shm.

Possible alternatives include:

  • dynamically growing the file in /dev/shm
  • zeroing out all new allocated pages in plasma to catch SIGBUS
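To make the failure mode concrete, here is a small illustration (plain Python, not Ray's actual allocator code): ftruncate gives the /dev/shm file a size without committing pages, so the first write through an mmap of it is where a SIGBUS can land once tmpfs is out of pages; zero-filling up front, as in the second bullet, moves that failure to allocation time.

```python
# Illustration only: sparse /dev/shm file vs. pages committed up front.
import mmap
import os

size = 64 * 1024**2  # 64 MiB, illustrative
fd = os.open("/dev/shm/plasma_demo", os.O_CREAT | os.O_RDWR, 0o600)
os.ftruncate(fd, size)     # file has a length, but no pages are backed yet

buf = mmap.mmap(fd, size)  # mapping succeeds regardless of free tmpfs pages
buf[0] = 1                 # first touch commits a page; if /dev/shm is full,
                           # this is the kind of write that raises SIGBUS

# The "zero out new pages" alternative: touch every page at allocation time
# so the failure surfaces immediately instead of at an arbitrary later write.
buf[:] = bytes(size)

buf.close()
os.close(fd)
os.unlink("/dev/shm/plasma_demo")
```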

@ericl
Contributor

ericl commented May 7, 2021

@rkooo567 can you try using the flag? If it works, I think we can add this to our documentation.

@rkooo567
Contributor Author

rkooo567 commented May 7, 2021

The issue wasn't always reproducible, but I can try a 100GB shuffle just to see how slow it is when enabled. I think @clarkzinzow can also verify it with his Uber workload.

@ericl
Contributor

ericl commented May 10, 2021

FYI, the workaround is to set RAY_PREALLOCATE_PLASMA_MEMORY=1; it is available in nightly builds (introduced in https://github.com/ray-project/ray/pull/15669/files).
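For reference, one way to apply it (assuming a single-node run started from a driver script; on a cluster the variable would instead be exported before `ray start` on each node):

```python
# Set the flag before Ray starts so the raylet it launches inherits it and
# preallocates/commits the whole object store up front.
import os
os.environ["RAY_PREALLOCATE_PLASMA_MEMORY"] = "1"

import ray
ray.init(object_store_memory=2 * 1024**3)  # 2 GiB, example value
```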

@stephanie-wang
Contributor

#15951 fixes this issue, but it can cause ObjectStoreFullErrors instead, since the underlying issue of fragmentation is still there. @ericl is working on a potential fix in #16097, which uses /tmp as a fallback when plasma runs out of shared memory.
