Reduce memory usage of DELETE operations #11470

Merged — 2 commits merged into duckdb:main on Apr 2, 2024

Conversation

Mytherin (Collaborator) commented Apr 2, 2024

When a delete is executed, we push information about that delete into the UndoBuffer in the form of DeleteInfo structs. This information allows us to later commit or roll back the actual delete. When deleting a lot of data - for example when running a DELETE FROM large_tbl command - many of these structs are created. Since the UndoBuffer structure does not support offloading to disk (yet), this could lead to out-of-memory exceptions.

This PR improves the memory efficiency of the DeleteInfo struct in two ways:

  • We switch the row field from row_t to uint16_t. The stored row identifiers are relative to the vector they refer to, so they can never exceed STANDARD_VECTOR_SIZE and always fit in a uint16_t. This reduces the memory usage of the DeleteInfo struct by up to 4x.
  • When the deleted row identifiers are consecutive (i.e. 0, 1, 2, 3, 4, 5, ...) we avoid storing the row identifiers at all. Instead, we store a boolean is_consecutive. If this is set, during actual delete operations, we reconstruct the row identifiers from the count. This improves memory usage even further, particularly when entire row groups or tables are deleted (as is the case in the previous DELETE FROM large_tbl command).
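As a rough illustration, the compacted layout described above could look like the following C++ sketch. The field names and the trailing-array placement are assumptions for illustration, not DuckDB's actual definitions:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sketch of the slimmed-down DeleteInfo layout.
static constexpr uint64_t STANDARD_VECTOR_SIZE = 2048;

struct DeleteInfo {
	uint64_t vector_idx;  // which vector within the row group
	uint16_t count;       // number of deleted rows in this vector
	bool is_consecutive;  // rows are 0, 1, ..., count - 1; no array stored
	// When !is_consecutive, `count` uint16_t row offsets follow the struct;
	// a flexible-array-style placeholder stands in for them here.
	uint16_t rows[1];
};

// Materialize the deleted row offsets, reconstructing them from the count
// in the consecutive case instead of reading a stored array.
std::vector<uint16_t> GetDeletedRows(const DeleteInfo &info) {
	std::vector<uint16_t> result;
	result.reserve(info.count);
	for (uint16_t i = 0; i < info.count; i++) {
		result.push_back(info.is_consecutive ? i : info.rows[i]);
	}
	return result;
}
```

Because row offsets fit in 16 bits only up to STANDARD_VECTOR_SIZE, this representation relies on DeleteInfo always being scoped to a single vector.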

Mytherin added 2 commits April 2, 2024 15:24
…asis, as such they cannot exceed the STANDARD_VECTOR_SIZE and we do not need to use 8-byte identifiers for them but can use a uint16_t instead
@Mytherin Mytherin merged commit 4842b82 into duckdb:main Apr 2, 2024
42 of 45 checks passed
github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Apr 5, 2024
Merge pull request duckdb/duckdb#11470 from Mytherin/deletememoryusage
Merge pull request duckdb/duckdb#11476 from carlopi/fix_lzma
Merge pull request duckdb/duckdb#11432 from guenp/guenp/add-escape-to-filter
Merge pull request duckdb/duckdb#11378 from lnkuiper/read_json_defer_allocation
@Mytherin Mytherin deleted the deletememoryusage branch June 7, 2024 12:53
Mytherin added a commit that referenced this pull request Oct 27, 2024
Out-Of-Core Updates & Deletes

This PR makes it possible to run `UPDATE` and `DELETE` statements where
the changeset introduced by the `UPDATE` or `DELETE` exceeds memory.
This can happen when running an `UPDATE` that updates a very large
table, or by running a `DELETE` that deletes many non-contiguous rows
(as contiguous deletes, such as what happens when running `DELETE FROM
tbl`, use far less memory as per
#11470).

The way this works is that the `UndoBuffer`, which previously used an
`ArenaAllocator`, is now modified to use buffer-managed blocks.

* For regular operations, that is actually rather straightforward. When
creating new entries we append to the `UndoBuffer` - only requiring us
to pin the final block.
* During a commit, rollback or clean-up of a transaction we do a full
scan of the `UndoBuffer`. These scans only require us to pin individual
blocks.
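The append/scan pinning pattern described in the two bullets above can be sketched as follows - a minimal stand-in that uses plain heap blocks in place of DuckDB's buffer manager, with all names illustrative:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <memory>
#include <vector>

// Minimal sketch (not DuckDB's API) of an undo buffer backed by fixed-size
// blocks: appends touch only the final block, scans visit one block at a time.
static constexpr size_t BLOCK_SIZE = 4096;

struct Block {
	uint8_t data[BLOCK_SIZE];
	size_t used = 0;
};

class UndoBufferSketch {
public:
	// Append an entry; only the last block is ever accessed ("pinned").
	uint8_t *Append(const void *entry, size_t size) {
		if (blocks.empty() || blocks.back()->used + size > BLOCK_SIZE) {
			blocks.push_back(std::make_unique<Block>());
		}
		Block &block = *blocks.back(); // pin the final block
		uint8_t *dest = block.data + block.used;
		std::memcpy(dest, entry, size);
		block.used += size;
		return dest;
	}

	// Full scan for commit/rollback/cleanup: one block at a time, so a real
	// buffer manager could evict everything except the block being processed.
	template <class FN>
	void Scan(size_t entry_size, FN callback) {
		for (auto &block : blocks) { // pin, process, unpin each block
			for (size_t pos = 0; pos + entry_size <= block->used; pos += entry_size) {
				callback(block->data + pos);
			}
		}
	}

private:
	std::vector<std::unique_ptr<Block>> blocks;
};
```

The key property is that no operation ever needs more than one block resident at once, which is what allows the changeset to exceed memory.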

The main challenge is in `UPDATE` statements.

#### UpdateInfo rework

The reason updates are challenging is that the `UPDATE` statements
internally use a linked list, which was previously built using pointers.
This PR reworks the `UpdateInfo` struct to instead hold
`UndoBufferPointer` entries - which are essentially a reference to an
undo-buffer allocated block plus an offset. The `UpdateSegment` class is
reworked so that we correctly pin and unpin the `UpdateInfo` entries
when we need to traverse the linked list.
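A minimal sketch of such a (block, offset) reference and the chain traversal it enables - names are illustrative rather than DuckDB's actual definitions, and a plain copy stands in for pin/unpin:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// A version chain threaded through buffer-managed blocks via (block, offset)
// references rather than raw pointers, which would dangle after eviction.
static constexpr uint32_t INVALID_BLOCK = UINT32_MAX;

struct UndoBufferPointer {
	uint32_t block_index = INVALID_BLOCK; // which undo-buffer block
	uint32_t offset = 0;                  // byte offset within that block
	bool IsSet() const {
		return block_index != INVALID_BLOCK;
	}
};

struct UpdateInfoSketch {
	int64_t version_number;
	UndoBufferPointer next; // next (older) entry in the version chain
};

using Blocks = std::vector<std::vector<uint8_t>>;

// "Pin" the referenced block and copy the entry out; a real buffer manager
// would instead hand back a pinned in-place reference and unpin afterwards.
UpdateInfoSketch Resolve(const Blocks &blocks, UndoBufferPointer ptr) {
	UpdateInfoSketch info;
	std::memcpy(&info, blocks[ptr.block_index].data() + ptr.offset, sizeof(info));
	return info;
}

// Walk the chain, pinning one block per hop.
size_t ChainLength(const Blocks &blocks, UndoBufferPointer head) {
	size_t length = 0;
	for (auto ptr = head; ptr.IsSet(); ptr = Resolve(blocks, ptr).next) {
		length++;
	}
	return length;
}
```

The trade-off is an extra indirection per hop (resolving block plus offset) in exchange for the blocks themselves being evictable.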

The `tuple_data` and `tuples` arrays, that specify which values are
updated for which rows, are also no longer pointers (as these pointers
can become invalidated if a buffer-managed block is evicted). Instead,
we enforce that an `UpdateInfo` struct is always allocated in the
following manner:

```
[UpdateInfo][TUPLES (sel_t[max])][DATA (T[max])]
```

We can then access the tuples and the data by looking forward past the
struct to where these arrays reside.
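For illustration, such offset-based access could be implemented along these lines - `UpdateInfoHeader`, `GetTuples`, and `GetData` are hypothetical names, not DuckDB's:

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the single-allocation layout described above:
//   [UpdateInfo][TUPLES (sel_t[max])][DATA (T[max])]
// The arrays are addressed by offsetting past the struct, so no raw
// pointers need to be stored inside it.
using sel_t = uint32_t;

struct UpdateInfoHeader {
	uint32_t max; // capacity of the trailing arrays
	uint32_t N;   // number of updated tuples
};

// The tuple indices live immediately after the struct.
sel_t *GetTuples(UpdateInfoHeader *info) {
	return reinterpret_cast<sel_t *>(reinterpret_cast<uint8_t *>(info) + sizeof(UpdateInfoHeader));
}

// The typed update data follows the tuples array.
template <class T>
T *GetData(UpdateInfoHeader *info) {
	return reinterpret_cast<T *>(reinterpret_cast<uint8_t *>(GetTuples(info)) + info->max * sizeof(sel_t));
}
```

Because the arrays are recomputed from the struct's own address, the whole entry can move with its block (or be written out and read back) without any pointer fixups.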
 
#### Performance

My main consideration with this change was performance - as we no longer
"just" follow pointers but instead need to pin and unpin whenever we use
updates. However, after profiling, this does not seem to be a
bottleneck. This is likely because the `UpdateInfo` already operates on
a per-vector level, so we are only introducing an additional pin/unpin
for every vector. Since these are designed to be lightweight when the
data fits in memory, the performance impact in this scenario is minimal.
In fact, because of our previous change of making the `UpdateInfo`
always reside in contiguous memory, performance seems to have improved
slightly.


```sql
CREATE TABLE integers(i INT);
INSERT INTO integers FROM range(10000000);
UPDATE integers SET i=i+1;
-- old: 0.23s, new: 0.19s

-- read from updated values
BEGIN;
UPDATE integers SET i=i+1;
FROM integers;
-- old: ~0.015s, new: ~0.015s

-- update lineitem
CALL dbgen(sf=1);
UPDATE lineitem SET l_comment=concat(l_comment, l_comment);
-- old: 1.0s, new: 0.97s
```