When running large benchmark sessions with instruction-level tracing, I sometimes see peak RAM usage of ~100GB on a workstation equipped with 128GB. The situation can be described as follows:
- The etiss target can only trace instructions to stdout (while spike can write the trace directly to disk); see the sketch after this list for how the stdout trace could be streamed to disk instead
- These traces can easily grow larger than 5GB
- After the run completes, the trace is stored as an artifact in RAM
- After the RUN stage, all artifacts are exported to disk by MLonMCU
- The raw/plaintext trace data is still kept in memory to avoid reading from disk, even when it is not required anymore.
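For context on the first point: capturing the simulator's stdout in the parent process (e.g. via `subprocess.run(..., capture_output=True)`) buffers the entire multi-GB trace in RAM, whereas redirecting stdout to an open file handle streams it to disk. Below is a minimal sketch of the redirection pattern; the `etiss_sim` command and its flags are placeholders, not the real ETISS/MLonMCU invocation:

```python
import subprocess

# Placeholder command: "etiss_sim" and its flags stand in for the real
# simulator invocation. The relevant part is the redirection: stdout is
# connected to a file on disk, so the multi-GB instruction trace never
# passes through the parent Python process' memory.
with open("instr_trace.log", "wb") as trace_file:
    subprocess.run(
        ["etiss_sim", "--trace-instructions", "firmware.elf"],  # hypothetical
        stdout=trace_file,  # trace is streamed to disk, not buffered in RAM
        check=True,
    )
```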
The first and second plots in this figure visualize this behavior for an example session (`thread_pool_ps` peaks at ~40GB RAM for the approach using `ThreadPoolExecutor` and `runs_per_stage=1`; see #153 for more context):
Here are some thoughts on potential solutions to the problems mentioned above:
- Do not keep artifacts in memory after export (only read them on demand); see the sketch after this list
- Add an `artifact.cache()` method to keep an artifact in memory until the session is closed
- Allow compressing artifacts (with automatic decompression when used) -> might lead to a larger peak disk space footprint
- Add a setting to compress all run directories after the session exits
- Allow cleaning up the temporary run directory (only keep artifacts + remove the platform build dir)
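A minimal sketch of how the first three ideas could fit together: an artifact that lives on disk after export, is read on demand (with transparent gzip decompression), and is only pinned in memory via an explicit `cache()` call. The `LazyArtifact` class and all method names here are assumptions for illustration, not MLonMCU's actual `Artifact` API:

```python
import gzip
from pathlib import Path


class LazyArtifact:
    """Sketch of an artifact that lives on disk and is only read on demand.

    This is NOT MLonMCU's actual Artifact API; names and behavior are
    assumptions to illustrate the ideas from the list above.
    """

    def __init__(self, path, compressed=False):
        self.path = Path(path)
        self.compressed = compressed
        self._cached = None  # populated only if cache() was called

    def export(self, data: bytes):
        """Write the data to disk (optionally gzip-compressed) and drop it from RAM."""
        if self.compressed:
            with gzip.open(self.path, "wb") as f:
                f.write(data)
        else:
            self.path.write_bytes(data)
        # Deliberately keep no reference to `data` after export.

    def read(self) -> bytes:
        """Read on demand, transparently decompressing if needed."""
        if self._cached is not None:
            return self._cached
        if self.compressed:
            with gzip.open(self.path, "rb") as f:
                return f.read()
        return self.path.read_bytes()

    def cache(self):
        """Opt-in: keep the content in memory until the session is closed."""
        self._cached = self.read()

    def uncache(self):
        """Called e.g. when the session closes, to free the memory again."""
        self._cached = None
```

Whether `cache()` pays off depends on how often an artifact is re-read within a session; for artifacts that are only exported and never inspected again, the on-demand default would avoid the multi-GB resident traces entirely.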