Fix excessive memory usage when exporting projects with multiple video tasks #7374
Merged: SpecLad merged 2 commits into cvat-ai:develop from SpecLad:project-export-memory-consumption on Jan 24, 2024
Conversation
f6c4a98 to 124e975 (force-pushed)
/check: ❌ Some checks failed
124e975 to 4b3e8d1 (force-pushed)
/check: ✔️ All checks completed successfully
4b3e8d1 to 5e1698b (force-pushed)
Codecov Report

@@            Coverage Diff             @@
##           develop    #7374      +/-  ##
===========================================
+ Coverage    83.19%   83.28%    +0.08%
===========================================
  Files          373      373
  Lines        39714    39760       +46
  Branches      3720     3720
===========================================
+ Hits         33039    33113       +74
+ Misses        6675     6647       -28
zhiltsov-max approved these changes on Jan 24, 2024
Motivation and context
Currently, the following situation causes the export worker to consume more memory than necessary:

* A user exports a project with images included, and
* The project contains multiple tasks with video chunks.

The main reason for this is that the `image_maker_per_task` dictionary in `CVATProjectDataExtractor.__init__` indirectly references a distinct `FrameProvider` for each task, which in turn indirectly references an `av.containers.Container` object corresponding to the most recent chunk opened in that task.

Initially, none of the chunks are opened, so the memory usage is low. But as Datumaro iterates over each frame, it eventually requests at least one image from each task of the project. This causes the corresponding `FrameProvider` to open a chunk file and keep a handle to it. The only way for a `FrameProvider` to close this chunk file is to open a new chunk when a frame from that chunk is requested. Therefore, each `FrameProvider` keeps at least one chunk open from the moment Datumaro requests the first frame from its task until the end of the export.

This manifests as the export worker's memory usage growing steadily as Datumaro moves from task to task. An open chunk consumes whatever Python objects represent it, but more importantly, any C-level buffers allocated by FFmpeg, which can be quite significant. In my testing, the per-chunk memory was on the order of tens of megabytes. An open chunk also takes up a file descriptor.
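To make the shape of the problem concrete, here is a minimal, self-contained sketch. `StubFrameProvider`, its `open_chunks` counter, and the ten-task loop are hypothetical stand-ins for illustration, not CVAT code:

```python
# A stub standing in for CVAT's FrameProvider: the real class indirectly holds
# an open av.containers.Container plus the FFmpeg buffers that come with it.
class StubFrameProvider:
    open_chunks = 0  # chunks currently held open across all providers

    def __init__(self, task_id):
        self.task_id = task_id
        self._chunk_open = False

    def get_frame(self, frame_id):
        if not self._chunk_open:  # opening a chunk allocates significant memory
            StubFrameProvider.open_chunks += 1
            self._chunk_open = True
        return (self.task_id, frame_id)

    def close(self):  # explicit cleanup, analogous to the cleanup added by this patch
        if self._chunk_open:
            StubFrameProvider.open_chunks -= 1
            self._chunk_open = False


# Old behaviour: one provider per task, all kept alive for the entire export,
# analogous to the image_maker_per_task dictionary described above.
providers = {task_id: StubFrameProvider(task_id) for task_id in range(10)}
for provider in providers.values():
    provider.get_frame(0)
print(StubFrameProvider.open_chunks)  # 10: every task keeps one chunk open
```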
The fix for this is conceptually simple: ensure that only one `FrameProvider` object exists at a time. AFAICS, when a project is exported, all frames from a given task are grouped together, so we shouldn't need multiple tasks' chunks to be open at the same time anyway.
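Continuing the sketch above, the fix amounts to keeping a single provider alive at a time and closing the previous one before moving on to the next task:

```python
# New behaviour: only one provider exists at a time; it is closed before the
# export moves on to the next task's frames.
StubFrameProvider.open_chunks = 0  # reset the counter from the previous snippet

current = None
for task_id in range(10):
    if current is not None:
        current.close()
    current = StubFrameProvider(task_id)
    current.get_frame(0)
print(StubFrameProvider.open_chunks)  # 1: only the current task's chunk is open

if current is not None:
    current.close()
```

In the actual patch the cleanup happens through the new `close`/`unload`/`__exit__` methods described in the list below rather than this bare `close()` call, but the effect on the number of simultaneously open chunks is the same.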
I had to restructure the code to make this work, so I took the opportunity to fix a few other things as well:
* The code currently relies on garbage collection of PyAV's `Container` objects to free the resources used by them. Even though `VideoReader.__iter__` uses a `with` block to close the container, the `with` block can only do so if the container is iterated all the way to the end. This doesn't happen when `FrameProvider` uses it, since `FrameProvider` seeks to the needed frame and then stops iterating. I don't have evidence that this causes any issues at the moment, but Python does not guarantee that objects are GC'd promptly, so this could become another source of excessive memory usage. I added cleanup methods (`close`/`unload`/`__exit__`) at several layers of the code to ensure that each chunk is closed as soon as it is no longer needed.
* I factored out and merged the code used to generate `dm.Image` objects when exporting projects and jobs/tasks. It's likely possible to merge even more code, but I don't want to expand the scope of the patch too much.
* I fixed a seemingly useless optimization in the handling of 3D frames. Specifically, `CVATProjectDataExtractor` performs queries of the form `task.data.images.prefetch_related().get(id=i)`. The prefetch here looks useless, as only a single object is queried; it wouldn't be any less efficient to just let Django fetch the `Image`'s `related_files` when needed. I rewrote this code to prefetch `related_files` for all images in the task at once instead (sketched below).
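As a rough illustration of the query change in the last item, here is a hypothetical Django ORM sketch, not the exact CVAT code; `task` and the frame id `i` are assumed to come from the surrounding export code:

```python
# Before: one query per 3D frame; the argument-less prefetch_related() cannot
# help, because only a single Image object is fetched at a time.
image = task.data.images.prefetch_related().get(id=i)
related_files = list(image.related_files.all())

# After (roughly): fetch all of the task's images once, prefetching their
# related_files in bulk, then look individual frames up in memory.
images_by_id = {
    img.id: img
    for img in task.data.images.prefetch_related("related_files").all()
}
related_files = list(images_by_id[i].related_files.all())  # served from the prefetch cache
```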
How has this been tested?
I used memray in combination with a trivial script:
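The script itself is not included in this page; below is a minimal sketch of what such a script might look like. The settings module, the `export_project` entry point, and its arguments are assumptions for illustration, not the author's actual code:

```python
# profile_export.py: a hypothetical stand-in for the "trivial script" mentioned
# above. Run it under memray and inspect the result, e.g.:
#   python -m memray run -o export-profile.bin profile_export.py
#   python -m memray flamegraph export-profile.bin
import os

import django

# Assumed settings module; adjust to the local CVAT deployment.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "cvat.settings.development")
django.setup()

# Placeholder entry point: the real export code lives in cvat/apps/dataset_manager,
# and its exact function name and signature may differ from this sketch.
from cvat.apps.dataset_manager.project import export_project

export_project(
    project_id=1,                        # a project containing several video tasks
    dst_file="/tmp/project-export.zip",
    format_name="CVAT for images 1.1",   # an export format that includes images
    save_images=True,
)
```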
I also did some manual sanity checks to ensure that export still works as intended.
Checklist
- [x] I submit my changes into the `develop` branch
- [ ] I have updated the documentation accordingly
- [ ] I have added tests to cover my changes
- [ ] I have linked related issues (see GitHub docs)
- [ ] I have increased versions of npm packages if it is necessary (cvat-canvas, cvat-core, cvat-data and cvat-ui)
License
I submit my code changes under the same MIT License that covers the project. Feel free to contact the maintainers if that's a concern.