Fix excessive memory usage when exporting projects with multiple video tasks #7374
Merged: SpecLad merged 2 commits into cvat-ai:develop from SpecLad:project-export-memory-consumption on Jan 24, 2024
Conversation
f6c4a98 to 124e975 (force-pushed)
/check: ❌ Some checks failed
124e975 to 4b3e8d1 (force-pushed)
/check: ✔️ All checks completed successfully
4b3e8d1 to 5e1698b (force-pushed)
Codecov Report

@@            Coverage Diff             @@
##           develop    #7374      +/-  ##
===========================================
+ Coverage    83.19%   83.28%    +0.08%
===========================================
  Files          373      373
  Lines        39714    39760       +46
  Branches      3720     3720
===========================================
+ Hits         33039    33113       +74
+ Misses        6675     6647       -28
zhiltsov-max approved these changes on Jan 24, 2024
Motivation and context
Currently, the following situation causes the export worker to consume more memory than necessary:

* A user exports a project with images included, and
* The project contains multiple tasks with video chunks.

The main reason for this is that the `image_maker_per_task` dictionary in `CVATProjectDataExtractor.__init__` indirectly references a distinct `FrameProvider` for each task, which in turn indirectly references an `av.containers.Container` object corresponding to the most recent chunk opened in that task.

Initially, none of the chunks are opened, so the memory usage is low. But as Datumaro iterates over each frame, it eventually requests at least one image from each task of the project. This causes the corresponding `FrameProvider` to open a chunk file and keep a handle to it. The only way for a `FrameProvider` to close this chunk file is to open a new chunk when a frame from that chunk is requested. Therefore, each `FrameProvider` keeps at least one chunk open from the moment Datumaro requests the first frame from its task until the end of the export.

This manifests as the export worker's memory usage growing steadily as Datumaro moves from task to task. An open chunk consumes whatever Python objects represent it, but more importantly, any C-level buffers allocated by FFmpeg, which can be quite significant. In my testing, the per-chunk memory was on the order of tens of megabytes. An open chunk also takes up a file descriptor.
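To make the shape of the problem concrete, here is a minimal, self-contained sketch. `StubFrameProvider`, its `open_chunks` counter, and the ten-task loop are hypothetical stand-ins for illustration, not CVAT code:

```python
# A stub standing in for CVAT's FrameProvider: the real class indirectly holds
# an open av.containers.Container plus the FFmpeg buffers that come with it.
class StubFrameProvider:
    open_chunks = 0  # chunks currently held open across all providers

    def __init__(self, task_id):
        self.task_id = task_id
        self._chunk_open = False

    def get_frame(self, frame_id):
        if not self._chunk_open:  # opening a chunk allocates significant memory
            StubFrameProvider.open_chunks += 1
            self._chunk_open = True
        return (self.task_id, frame_id)

    def close(self):  # explicit cleanup, analogous to the cleanup added by this patch
        if self._chunk_open:
            StubFrameProvider.open_chunks -= 1
            self._chunk_open = False


# Old behaviour: one provider per task, all kept alive for the entire export,
# analogous to the image_maker_per_task dictionary described above.
providers = {task_id: StubFrameProvider(task_id) for task_id in range(10)}
for provider in providers.values():
    provider.get_frame(0)
print(StubFrameProvider.open_chunks)  # 10: every task keeps one chunk open
```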
The fix for this is conceptually simple: ensure that only one `FrameProvider` object exists at a time. AFAICS, when a project is exported, all frames from a given task are grouped together, so we shouldn't need multiple tasks' chunks to be open at the same time anyway.
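Continuing the sketch above, the fix amounts to keeping a single provider alive at a time and closing the previous one before moving on to the next task:

```python
# New behaviour: only one provider exists at a time; it is closed before the
# export moves on to the next task's frames.
StubFrameProvider.open_chunks = 0  # reset the counter from the previous snippet

current = None
for task_id in range(10):
    if current is not None:
        current.close()
    current = StubFrameProvider(task_id)
    current.get_frame(0)
print(StubFrameProvider.open_chunks)  # 1: only the current task's chunk is open

if current is not None:
    current.close()
```

In the actual patch the cleanup happens through the new `close`/`unload`/`__exit__` methods described in the list below rather than this bare `close()` call, but the effect on the number of simultaneously open chunks is the same.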
I had to restructure the code to make this work, so I took the opportunity to fix a few other things as well:
* The code currently relies on garbage collection of PyAV's `Container` objects to free the resources used by them. Even though `VideoReader.__iter__` uses a `with` block to close the container, the `with` block can only do so if the container is iterated all the way to the end. This doesn't happen when `FrameProvider` uses it, since `FrameProvider` seeks to the needed frame and then stops iterating. I don't have evidence that this causes any issues at the moment, but Python does not guarantee that objects are GC'd promptly, so this could become another source of excessive memory usage. I added cleanup methods (`close`/`unload`/`__exit__`) at several layers of the code to ensure that each chunk is closed as soon as it is no longer needed.
* I factored out and merged the code used to generate `dm.Image` objects when exporting projects and jobs/tasks. It's likely possible to merge even more code, but I don't want to expand the scope of the patch too much.
* I fixed a seemingly useless optimization in the handling of 3D frames. Specifically, `CVATProjectDataExtractor` performs queries of the form `task.data.images.prefetch_related().get(id=i)`. The prefetch here looks useless, as only a single object is queried; it wouldn't be any less efficient to just let Django fetch the `Image`'s `related_files` when needed. I rewrote this code to prefetch `related_files` for all images in the task at once instead (sketched below).
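As a rough illustration of the query change in the last item, here is a hypothetical Django ORM sketch, not the exact CVAT code; `task` and the frame id `i` are assumed to come from the surrounding export code:

```python
# Before: one query per 3D frame; the argument-less prefetch_related() cannot
# help, because only a single Image object is fetched at a time.
image = task.data.images.prefetch_related().get(id=i)
related_files = list(image.related_files.all())

# After (roughly): fetch all of the task's images once, prefetching their
# related_files in bulk, then look individual frames up in memory.
images_by_id = {
    img.id: img
    for img in task.data.images.prefetch_related("related_files").all()
}
related_files = list(images_by_id[i].related_files.all())  # served from the prefetch cache
```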
How has this been tested?
I used memray in combination with a trivial script:
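The script itself is not included in this page; below is a minimal sketch of what such a script might look like. The settings module, the `export_project` entry point, and its arguments are assumptions for illustration, not the author's actual code:

```python
# profile_export.py: a hypothetical stand-in for the "trivial script" mentioned
# above. Run it under memray and inspect the result, e.g.:
#   python -m memray run -o export-profile.bin profile_export.py
#   python -m memray flamegraph export-profile.bin
import os

import django

# Assumed settings module; adjust to the local CVAT deployment.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "cvat.settings.development")
django.setup()

# Placeholder entry point: the real export code lives in cvat/apps/dataset_manager,
# and its exact function name and signature may differ from this sketch.
from cvat.apps.dataset_manager.project import export_project

export_project(
    project_id=1,                        # a project containing several video tasks
    dst_file="/tmp/project-export.zip",
    format_name="CVAT for images 1.1",   # an export format that includes images
    save_images=True,
)
```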
I also did some manual sanity checks to ensure that export still works as intended.
Checklist
- [x] I submit my changes into the `develop` branch
- [ ] I have updated the documentation accordingly
- [ ] I have added tests to cover my changes
- [ ] I have linked related issues (see GitHub docs)
- [ ] I have increased versions of npm packages if it is necessary (cvat-canvas, cvat-core, cvat-data and cvat-ui)
License
I submit my code changes under the same MIT License that covers the project. Feel free to contact the maintainers if that's a concern.