distributed.nanny.environ.MALLOC_TRIM_THRESHOLD_ is ineffective #5971

Closed · crusaderky opened this issue Mar 21, 2022 · 11 comments · Fixed by #6681
Labels: bug (Something is broken), memory

Comments

crusaderky (Collaborator) commented Mar 21, 2022

Ubuntu 21.10 x86/64
distributed 2022.3.0

The MALLOC_TRIM_THRESHOLD_ env variable seems to be effective at making memory deallocation more reactive.
However, the config key that sets it doesn't seem to do anything, which suggests that the variable is being set after the worker process has started, whereas it needs to be set before the process is spawned.

import dask.array
import distributed

client = distributed.Client(n_workers=1, memory_limit="2 GiB")

N = 7_000
S = 160 * 1024

a = dask.array.random.random(N * S // 8, chunks=S // 8)
a = a.persist()
distributed.wait(a)
del a

Result:
Managed: 0
Unmanaged: 1.16 GiB

import os
import dask.array
import dask.config
import distributed

os.environ["MALLOC_TRIM_THRESHOLD_"] = str(dask.config.get("distributed.nanny.environ.MALLOC_TRIM_THRESHOLD_"))
client = distributed.Client(n_workers=1, memory_limit="2 GiB")

N = 7_000
S = 160 * 1024

a = dask.array.random.random(N * S // 8, chunks=S // 8)
a = a.persist()
distributed.wait(a)
del a

Result:
Managed: 0
Unmanaged: 151 MiB

Production Workaround

Set the environment variable in the shell before starting dask-worker:

export MALLOC_TRIM_THRESHOLD_=65536
dask-worker <address>
crusaderky self-assigned this Mar 21, 2022

crusaderky (Collaborator, Author):

I'm uncertain how to solve this. The simple solution, changing os.environ in Nanny.__init__ instead of passing the variables down to Worker, would also mean polluting the nanny's own process environment. That's annoying for unit tests, but I'm not sure whether anybody cares in production?
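
For context, a minimal standalone sketch (not distributed code) of why setting os.environ in the parent before spawning works: a spawned child inherits the parent's environment as it is at spawn time, but the variable then also stays set in the parent unless it is explicitly restored.

import multiprocessing as mp
import os

def report():
    # The child sees whatever was in the parent's environment at spawn time.
    print("child sees:", os.environ.get("MALLOC_TRIM_THRESHOLD_"))

if __name__ == "__main__":
    os.environ["MALLOC_TRIM_THRESHOLD_"] = "65536"  # set before spawning
    ctx = mp.get_context("spawn")
    p = ctx.Process(target=report)
    p.start()
    p.join()
    # ...but the variable remains set in the parent: the "pollution" mentioned above.
    print("parent still has:", os.environ.get("MALLOC_TRIM_THRESHOLD_"))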

The alternative is to have an intermediate process that sets the variables and then invokes Python again, but that is very expensive.

This issue also impacts the other two variables set by the config:

  OMP_NUM_THREADS: 1
  MKL_NUM_THREADS: 1

AFAIK, if for any reason numpy is imported in the worker process before the config is applied, these two variables will not be picked up.
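
A minimal standalone illustration of that ordering sensitivity (a sketch, not distributed's actual startup code):

import os

# These thread-pool variables are only honoured if they are already in the
# environment when numpy (and its BLAS/OpenMP backend) is first imported.
os.environ["OMP_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"

import numpy as np  # imported only after the variables are set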

gjoseph92 (Collaborator):

if for any reason numpy is imported in the worker process before the config is applied

Quite possible: #5729

The simple solution, changing os.environ in Nanny.__init__ instead of passing the variables down to Worker, would also mean polluting the nanny's own process environment

To be fair, all the environment variables we're currently talking about setting (malloc_trim and num_threads) basically don't have an impact unless they're set before the interpreter starts. So for these specifically, setting them in the Nanny shouldn't actually change anything in practice. I still dislike the uncleanliness of setting them in the Nanny process, though.

The alternative is to have an intermediate process that sets the variables and then invokes Python again

Could also have a process-wide lock for Nanny, and set os.environ in the parent process while holding that lock, then reset it before releasing. You'd want to be clever and still allow subprocesses with the same env to be spawned in parallel, so dask-worker --nworkers=100 is still performant. Still not perfectly clean, but maybe an acceptable tradeoff between cleanliness and performance.
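
As a rough standalone sketch of that lock idea (names here are hypothetical, not part of distributed's API); note that this simple version serializes spawns rather than allowing identical environments in parallel:

import os
import threading
from contextlib import contextmanager

_spawn_environ_lock = threading.Lock()  # hypothetical process-wide lock

@contextmanager
def temporarily_set_environ(env):
    # Patch the parent's environment while a child is being spawned,
    # then restore the previous values before releasing the lock.
    with _spawn_environ_lock:
        old = {key: os.environ.get(key) for key in env}
        os.environ.update({key: str(value) for key, value in env.items()})
        try:
            yield
        finally:
            for key, value in old.items():
                if value is None:
                    os.environ.pop(key, None)
                else:
                    os.environ[key] = value

# Hypothetical usage:
# with temporarily_set_environ({"MALLOC_TRIM_THRESHOLD_": 65536}):
#     worker_process.start()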

fjetter (Member) commented Mar 23, 2022

setting them in the Nanny shouldn't actually change anything in practice.

They'll be set before the worker process starts, and the worker process is where it matters.

gjoseph92 (Collaborator):

I meant that having them set in the Nanny process won't really affect things for the Nanny itself. Guido and I don't like the poor hygiene of leaving them set on the Nanny, but I'm just noting that that poor hygiene shouldn't affect anything on the Nanny in practice, because the particular variables we're setting only have an effect at interpreter startup / NumPy import time.

crusaderky (Collaborator, Author):

I was worried about potential user-defined variables, not the three we set. But I'm leaning towards not over-engineering this just to cover purely hypothetical use cases.

dagibbs22 commented Dec 6, 2023

I'm running a Jupyter Lab notebook on Ubuntu 22.04.2 LTS where unmanaged memory isn't being released. Dask is running through Coiled. After about 30 seconds of running my notebook, unmanaged memory appears and stays high for my long-running tasks. I have dask and distributed 2023.11.0 from conda-forge. I'm trying to follow the workaround here but am having trouble with it.

When I include os.environ["MALLOC_TRIM_THRESHOLD_"] = str(dask.config.get("distributed.nanny.environ.MALLOC_TRIM_THRESHOLD_")) in my notebook import cell, I get KeyError: 'MALLOC_TRIM_THRESHOLD_'. How do I "set the environment variable in the shell before starting dask-worker", as mentioned above, with

export MALLOC_TRIM_THRESHOLD_=65536
dask-worker <address>

My imports are currently:

import os
import coiled
import dask
from dask.distributed import Client, LocalCluster
import dask.config
import distributed
dask.config.set({'distributed.nanny.environ.MALLOC_TRIM_THRESHOLD_': 5})

This seems to set the MALLOC_TRIM_THRESHOLD_ variable correctly; after I create my client, I check it with client.run(os.getenv, "MALLOC_TRIM_THRESHOLD_")

And get

{'tls://10.1.32.137:33565': '5',
 'tls://10.1.33.247:40209': '5',
 'tls://10.1.35.138:32803': '5',
 'tls://10.1.41.133:46629': '5',
 'tls://10.1.41.140:41265': '5',
 'tls://10.1.44.62:41999': '5',
 'tls://10.1.45.159:35541': '5',
 'tls://10.1.47.63:38407': '5'}

But unmanaged memory still increases after about 30 seconds and stays high. That eventually causes my model to fail.

How do I trim the unmanaged memory? Thanks very much.

crusaderky (Collaborator, Author):

@dagibbs22 the workarounds described above are very old. This issue was resolved in July 2022.
If you're unhappy with the default that dask sets, you can change it through the dask config:

import dask
import coiled
dask.config.set({"distributed.nanny.pre-spawn-environ.MALLOC_TRIM_THRESHOLD_": your_value_here})
cluster = coiled.Cluster(...)

That said, I'd honestly be surprised if tampering with this setting fixed your issue; if your unmanaged memory does disappear, I would love to see your code.

dagibbs22:

Thanks, @crusaderky. I could tell the issue had been resolved but couldn't tell what it was...

Adding dask.config.set({"distributed.nanny.pre-spawn-environ.MALLOC_TRIM_THRESHOLD_": 1}) didn't reduce my unmanaged memory, but it did keep the model running despite high unmanaged memory. That's not great either, but at least it demonstrates that I need to look somewhere else to trim my unmanaged memory. Why did you think (correctly) that changing MALLOC_TRIM_THRESHOLD_ wouldn't make my unmanaged memory disappear? Do you have other suggestions for how to keep unmanaged memory from accumulating?

dagibbs22 added a commit to wri/carbon-budget-Europe that referenced this issue Dec 8, 2023
… on the full time series. Even on 2012-2021, unmanaged memory increased over time, getting into orange and then red for all workers. Eventually, workers died, but somehow they restarted and finished the time series. The same thing happened with the full time series two times (workers in the red zone for memory and then dying) but I guess it just happened too many times and eventually the model died. So, adding dask.config.set({"distributed.nanny.pre-spawn-environ.MALLOC_TRIM_THRESHOLD_": 1}) based on my conversation at dask/distributed#5971 (comment) didn't actually reduce unmanaged memory but did make the model push through the accumulated unmanaged memory, at least one or two times. Of course, this isn't a viable solution overall. But it is good data; unmanaged memory accumulation isn't due to MALLOC_TRIM_THRESHOLD_.

crusaderky (Collaborator, Author):

@dagibbs22 there are many causes for unmanaged memory, listed here: https://distributed.dask.org/en/stable/worker-memory.html#using-the-dashboard-to-monitor-memory-usage

Is unmanaged memory persisting while there are no tasks running? If it goes away, it's heap memory and you have to reduce the size of your chunks/partitions.
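
For example (a hypothetical illustration, not your code), that would mean building the array with smaller chunks:

import dask.array as da

# Smaller chunks mean a smaller per-task working set, which is what shows
# up as transient "heap" unmanaged memory on each worker.
x = da.random.random((50_000, 50_000), chunks=(2_500, 2_500))  # ~48 MiB per chunk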

dagibbs22:

@crusaderky The old unmanaged memory gets up to about 6 GB in each worker when I run my notebook but drops to 2 GB per worker after the notebook finishes. Does 2 GB/worker count as "memory persisting while there are no tasks running"? Why would memory persist like that? Thanks.

crusaderky (Collaborator, Author):

Some of it will be logs. Dask workers store log information in deques for forensic analysis. You can shorten them through the dask config:

distributed:
    admin:
        log-length: 0
        low-level-log-length: 0
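
The same settings can also be applied from Python before the cluster is created, using the keys shown above:

import dask

# Equivalent to the YAML above; set this before creating the cluster so the
# values are in place when the workers start.
dask.config.set({
    "distributed.admin.log-length": 0,
    "distributed.admin.low-level-log-length": 0,
})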
