Tuning excessive memory retention #351

@hp48gx

Hi,

I just want to share some information about an experiment we ran; the results look a bit suspicious.

We have a binary that starts out idle. Periodically it picks a random thread from a pool of 20, and that thread iterates through a directory of fairly large XML files (kept fixed at, say, 10 files of 50 MB each). Each file is parsed with pugixml (a DOM parser that basically copies the entire file into memory as one big string), and then all the data is dropped (for the purpose of this experiment, we just return true/false depending on whether the file was valid XML).
In particular, everything happens within a single thread: the parser is created, run, and destroyed every time, and nothing is passed to another thread. There are also no leaks; we checked by running the binary under valgrind (without mimalloc).
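
For reference, the shape of the experiment is roughly the following. This is a minimal, standard-library-only sketch and not our actual code: `parse_and_drop` is a hypothetical stand-in for the pugixml parse (it only allocates and frees one file-sized buffer), and `Worker` and `run_experiment` are illustrative names.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <random>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the pugixml parse: copies the whole "file" into
// one big buffer, checks it, and frees everything before returning.
bool parse_and_drop(std::size_t file_bytes) {
    if (file_bytes < 2) return false;
    std::string buffer(file_bytes, 'x');  // whole-file copy, as a DOM parser would do
    buffer.front() = '<';
    buffer.back() = '>';
    return buffer.front() == '<' && buffer.back() == '>';
}

// A long-lived worker thread that runs the jobs posted to it.
class Worker {
public:
    Worker() : thread_([this] { loop(); }) {}
    ~Worker() {
        { std::lock_guard<std::mutex> lock(mutex_); stop_ = true; }
        cv_.notify_one();
        thread_.join();
    }
    std::future<int> post(std::function<int()> job) {
        std::packaged_task<int()> task(std::move(job));
        std::future<int> result = task.get_future();
        { std::lock_guard<std::mutex> lock(mutex_); jobs_.push(std::move(task)); }
        cv_.notify_one();
        return result;
    }
private:
    void loop() {
        for (;;) {
            std::packaged_task<int()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stop_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stop requested and queue drained
                task = std::move(jobs_.front());
                jobs_.pop();
            }
            task();  // all allocation and freeing happens on this thread
        }
    }
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::packaged_task<int()>> jobs_;
    bool stop_ = false;
    std::thread thread_;  // declared last so the members above exist before it starts
};

// Runs `rounds` tasks, each on a randomly chosen worker, one at a time.
// Returns the total number of files reported valid.
int run_experiment(std::size_t pool_size, int rounds,
                   const std::vector<std::size_t>& file_sizes) {
    assert(pool_size > 0);
    std::vector<Worker> pool(pool_size);
    std::mt19937 rng(42);
    std::uniform_int_distribution<std::size_t> pick(0, pool_size - 1);
    int valid = 0;
    for (int r = 0; r < rounds; ++r) {
        std::future<int> done = pool[pick(rng)].post([&file_sizes] {
            int ok = 0;
            for (std::size_t bytes : file_sizes) ok += parse_and_drop(bytes) ? 1 : 0;
            return ok;
        });
        valid += done.get();  // wait for the task before scheduling the next one
    }
    return valid;
}
```

In the real binary the pool has 20 threads and the file sizes correspond to the 10 files of about 50 MB each.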

What we noticed is that whenever a thread in the pool picks up the task for the first time, memory grows significantly (say 150 MB) and never decreases; when the same thread runs again, memory consumption stays roughly flat.
These numbers would be explained if mimalloc did not return the memory to the OS.
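
To make the per-thread growth easy to observe, the RSS can be sampled before and after each task. The following is a minimal sketch that assumes Linux and reads `/proc/self/statm`; the function names are hypothetical and it is independent of mimalloc's own statistics.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>
#include <unistd.h>

// Parses one line of /proc/self/statm ("size resident shared ...") and
// returns the resident page count, or -1 if the line is malformed.
long resident_pages_from_statm(const std::string& line) {
    std::istringstream in(line);
    long size_pages = 0;
    long resident_pages = 0;
    if (!(in >> size_pages >> resident_pages)) return -1;
    return resident_pages;
}

// Returns the current RSS in bytes, or -1 if /proc/self/statm cannot be
// read (for example on a non-Linux system).
long current_rss_bytes() {
    std::ifstream statm("/proc/self/statm");
    std::string line;
    if (!std::getline(statm, line)) return -1;
    const long pages = resident_pages_from_statm(line);
    if (pages < 0) return -1;
    const long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    return pages * page_size;
}
```

Calling `current_rss_bytes()` right before and right after a task gives the growth attributable to that task; a thread that has already run once should then show a difference close to zero.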

Here's what we tried so far:

  1. Calling mi_collect with true or false has no effect (we call it in each thread, just before stats_merge).
  2. Starting with MIMALLOC_PAGE_RESET=1 has no effect on RSS, although the resets show up in the stats (see the reset row):
heap stats:     peak      total      freed       unit      count  
  reserved:   512.0 mb   512.0 mb    12.0 kb       1 b              not all freed!
 committed:   690.0 mb   690.0 mb    12.0 kb       1 b              not all freed!
     reset:    89.2 mb   351.0 mb   461.6 mb       1 b              ok

   process: user: 32.399 s, system: 13.636 s, faults: 5, reclaims: 179973, rss: 737.7 mb
heap stats:     peak      total      freed       unit      count  
  reserved:   512.0 mb   512.0 mb     8.0 kb       1 b              not all freed!
 committed:   687.5 mb   687.5 mb     8.0 kb       1 b              not all freed!
     reset:       0 b        0 b        0 b        1 b              ok

   process: user: 27.284 s, system: 10.617 s, faults: 3, reclaims: 177145, rss: 726.8 mb
  3. We tried using a separate heap for the XML parser (pugixml supports passing two function pointers, for "malloc" and "free"). That was by far the worst option: the heap grows to as much as 4 GB during parsing, and when we destroy it, about 25% is still kept by the process.

  4. We tried destroying the threads after parsing. This had basically no effect: memory usage is slightly higher during parsing, and in the end no memory is released to the OS.

Is there anything we should tune?
Thanks in advance for any hints.
