Conversation

Collaborator

@ijon ijon commented Jul 18, 2024

merged ebee36a from main

Add a commit redo size check for successfully ignited operations as a precautionary measure, to avoid an infinite loop in which schemeshard hits the local tx commit redo size limit, restarts, attempts to propose the persisted operation again, hits the limit again, restarts, and so on.

This can happen with inherently massive operations such as copy-tables, which is used as the first step of a database export/backup.

Copying a large number of tables with a huge number of partitions can produce a TTxOperationPropose local transaction so large that it hits the limit imposed by the tablet executor. A tablet that violates that limit is considered broken and is stopped immediately. See ydb/core/tablet_flat/flat_executor.cpp, NTabletFlatExecutor::TExecutor::ExecuteTransaction().
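
For readers unfamiliar with the failure mode, the idea of the check can be sketched roughly as follows. This is a minimal standalone illustration, not the code from this PR: `LocalTxRedo`, `ProposeStatus`, `FinishPropose` and the `commitRedoLimit` parameter are hypothetical names invented for the example, and rolling back the accumulated changes on rejection is an assumption about how such a check would typically behave.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical outcome of proposing an operation (not a schemeshard type).
enum class ProposeStatus { Accepted, RejectedTooLarge };

// Hypothetical accumulator for the redo produced by one local transaction.
class LocalTxRedo {
public:
    // Record one persisted change and count its bytes.
    void Append(std::string payload) {
        bytes_ += payload.size();
        records_.push_back(std::move(payload));
    }

    // Total redo bytes accumulated so far.
    std::size_t Bytes() const { return bytes_; }

    // Drop everything accumulated in this transaction.
    void Rollback() {
        records_.clear();
        bytes_ = 0;
    }

private:
    std::vector<std::string> records_;
    std::size_t bytes_ = 0;
};

// Called after an operation has been ignited successfully.
// If committing would reach the limit, discard the changes and reject the
// operation, so that nothing oversized is persisted and no restart loop
// can begin. Otherwise let the commit proceed.
ProposeStatus FinishPropose(LocalTxRedo& redo, std::size_t commitRedoLimit) {
    if (redo.Bytes() >= commitRedoLimit) {
        redo.Rollback();
        return ProposeStatus::RejectedTooLarge;
    }
    return ProposeStatus::Accepted;
}
```

The point of checking before commit is that the rejected operation is never persisted, so a restarted tablet has nothing to re-propose.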

KIKIMR-21751

Changelog entry

Fix handling of backups of very large databases (in terms of the number of tables and the number of table partitions).

Changelog category

  • Bugfix

…form#6760)

@ijon ijon requested a review from a team as a code owner July 18, 2024 20:16
@ijon ijon linked an issue Jul 18, 2024 that may be closed by this pull request

github-actions bot commented Jul 18, 2024

2024-07-18 20:19:54 UTC Pre-commit check for c35e13e has started.
2024-07-18 20:22:21 UTC Build linux-x86_64-release-asan is running...
🟢 2024-07-18 20:47:19 UTC Build successful.
2024-07-18 20:47:35 UTC Tests are running...
🔴 2024-07-18 22:47:18 UTC Some tests failed, follow the links below.

Test history | Test log

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
|---|---|---|---|---|---|
| 9819 | 9249 | 0 | 82 | 316 | 172 |

🟢 2024-07-18 22:48:05 UTC ydbd size 5.2 GiB changed* by +62.4 KiB, which is < 100.0 KiB vs stable-24-3: OK

| ydbd size dash | stable-24-3: df15684 | merge: c35e13e | diff | diff % |
|---|---|---|---|---|
| ydbd size | 5 616 613 272 Bytes | 5 616 677 192 Bytes | +62.4 KiB | +0.001% |
| ydbd stripped size | 1 211 263 016 Bytes | 1 211 276 264 Bytes | +12.9 KiB | +0.001% |

*Please be aware that the difference is based on comparing your commit with the last completed post-commit build; check the comparison.

github-actions bot commented Jul 18, 2024

2024-07-18 20:20:07 UTC Pre-commit check for c35e13e has started.
2024-07-18 20:22:35 UTC Build linux-x86_64-release-clang14 is running...
🟢 2024-07-18 20:30:30 UTC Build successful.

github-actions bot commented Jul 18, 2024

2024-07-18 20:20:19 UTC Pre-commit check for c35e13e has started.
2024-07-18 20:22:48 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-07-18 21:02:09 UTC Build successful.
2024-07-18 21:02:26 UTC Tests are running...
🟢 2024-07-18 22:23:26 UTC Tests successful.

Test history | Test log

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
|---|---|---|---|---|---|
| 14091 | 12770 | 0 | 0 | 1309 | 12 |

🟢 2024-07-18 22:24:06 UTC ydbd size 8.1 GiB changed* by +46.6 KiB, which is < 100.0 KiB vs stable-24-3: OK

| ydbd size dash | stable-24-3: df15684 | merge: c35e13e | diff | diff % |
|---|---|---|---|---|
| ydbd size | 8 727 367 992 Bytes | 8 727 415 696 Bytes | +46.6 KiB | +0.001% |
| ydbd stripped size | 477 191 944 Bytes | 477 200 200 Bytes | +8.1 KiB | +0.002% |

*Please be aware that the difference is based on comparing your commit with the last completed post-commit build; check the comparison.

Development

Successfully merging this pull request may close these issues.

schemeshard: reject too massive operations
