Fix batch size handling during metadata during channel content upload #5729
I spent some time profiling the database access code and noticed that the Tribler application spends most of its time inside the `db_session` wrapped around the `process_payload` method of `MetadataStore`. As it turns out, `MetadataStore` splits objects into very small batches; on my machine the typical batch size was just two metadata objects.
MetadataStore determines the batch size dynamically from the duration of the previous batch execution, but the measured duration also included the sleep time between batches, which skewed the result.
I fixed the formula and also restricted the minimum and maximum batch size.
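To illustrate the idea, here is a minimal sketch, not the actual MetadataStore code. It times only the work inside a batch (the sleep between batches is excluded) and clamps the next batch size to a fixed range. The names `next_batch_size` and `process_in_batches`, and the constants `MIN_BATCH_SIZE`, `MAX_BATCH_SIZE` and `TARGET_BATCH_SECONDS`, are hypothetical and chosen only for this example.

```python
import time

# Hypothetical limits and target duration, chosen only for illustration.
MIN_BATCH_SIZE = 10
MAX_BATCH_SIZE = 1000
TARGET_BATCH_SECONDS = 0.25


def next_batch_size(current_size, batch_seconds,
                    target_seconds=TARGET_BATCH_SECONDS,
                    min_size=MIN_BATCH_SIZE, max_size=MAX_BATCH_SIZE):
    """Scale the batch size so the next batch takes roughly target_seconds."""
    if batch_seconds <= 0:
        # The batch was too fast to measure, so use the largest allowed size.
        return max_size
    scaled = round(current_size * target_seconds / batch_seconds)
    # Clamp so that one slow or fast batch cannot push the size to an extreme.
    return max(min_size, min(max_size, scaled))


def process_in_batches(items, process_batch, pause_seconds=0.0):
    """Process items in dynamically sized batches; return the sizes used."""
    batch_size = MIN_BATCH_SIZE
    sizes = []
    start = 0
    while start < len(items):
        batch = items[start:start + batch_size]
        started = time.perf_counter()
        process_batch(batch)
        # Measure the duration before sleeping, so the pause is not counted.
        elapsed = time.perf_counter() - started
        sizes.append(len(batch))
        start += len(batch)
        batch_size = next_batch_size(batch_size, elapsed)
        time.sleep(pause_seconds)
    return sizes
```

With these assumed defaults, a batch of 100 objects that took 0.5 seconds would be followed by a batch of 50, while a batch that took far longer than the target would fall back to the minimum of 10 instead of shrinking to one or two objects.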
With the changes in this pull request, the metadata object insertion rate on my machine increased by 23%, from 60.63 objects/second to 74.84 objects/second, measured over the first 4000 objects loaded after application start.
The longer the application runs, the slower inserts become. The slowdown does not depend on the size of the database, only on how long the application has been running. Since the drop is quite significant, this should be the next important topic to investigate.