@lex007in
Collaborator

Changelog entry

...

Changelog category

  • Not for changelog (changelog entry is not required)

Description for reviewers

...

@github-actions

github-actions bot commented Apr 18, 2025

🟢 2025-04-25 15:46:36 UTC The validation of the Pull Request description is successful.

@lex007in lex007in force-pushed the cleanup_tablet_gc_retry branch from 6ef1ac7 to b8cf79f Compare April 18, 2025 01:36
@github-actions

github-actions bot commented Apr 18, 2025

2025-04-18 02:11:48 UTC Pre-commit check linux-x86_64-relwithdebinfo for d795d6c has started.
2025-04-18 02:11:54 UTC Artifacts will be uploaded here
2025-04-18 02:14:46 UTC ya make is running...
🟡 2025-04-18 03:32:24 UTC Some tests failed, follow the links below. Going to retry failed tests...

Details

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
27432 24745 0 1 2572 114

2025-04-18 03:34:45 UTC ya make is running... (failed tests rerun, try 2)
🟢 2025-04-18 03:46:47 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
176 (only retried tests) 80 0 0 0 96

🟢 2025-04-18 03:46:54 UTC Build successful.
🟢 2025-04-18 03:47:13 UTC ydbd size 2.2 GiB changed* by +5.7 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: 1f62eb9 merge: d795d6c diff diff %
ydbd size 2 363 541 544 Bytes 2 363 547 392 Bytes +5.7 KiB +0.000%
ydbd stripped size 493 864 032 Bytes 493 864 864 Bytes +832 Bytes +0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

@github-actions

github-actions bot commented Apr 18, 2025

2025-04-18 02:12:15 UTC Pre-commit check linux-x86_64-release-asan for d795d6c has started.
2025-04-18 02:12:23 UTC Artifacts will be uploaded here
2025-04-18 02:15:15 UTC ya make is running...
🟡 2025-04-18 03:34:08 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Details

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
12802 12633 0 108 30 31

2025-04-18 03:35:18 UTC ya make is running... (failed tests rerun, try 2)
🟡 2025-04-18 03:48:53 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Details

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
219 (only retried tests) 130 0 50 15 24

2025-04-18 03:49:01 UTC ya make is running... (failed tests rerun, try 3)
🟡 2025-04-18 04:01:24 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet

Test history | Ya make output | Test bloat | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
121 (only retried tests) 60 0 39 2 20

🟢 2025-04-18 04:01:31 UTC Build successful.
🟢 2025-04-18 04:02:01 UTC ydbd size 3.9 GiB changed* by +11.0 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: 1f62eb9 merge: d795d6c diff diff %
ydbd size 4 147 399 600 Bytes 4 147 410 912 Bytes +11.0 KiB +0.000%
ydbd stripped size 1 431 878 232 Bytes 1 431 882 008 Bytes +3.7 KiB +0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

@lex007in lex007in self-assigned this Apr 18, 2025
@lex007in lex007in force-pushed the cleanup_tablet_gc_retry branch from b8cf79f to 0a14d96 Compare April 21, 2025 00:13
@github-actions

github-actions bot commented Apr 21, 2025

2025-04-21 01:10:06 UTC Pre-commit check linux-x86_64-relwithdebinfo for a79225a has started.
2025-04-21 01:10:55 UTC Artifacts will be uploaded here
2025-04-21 01:14:25 UTC ya make is running...
🟡 2025-04-21 02:24:10 UTC Some tests failed, follow the links below. Going to retry failed tests...

Details

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
27520 24830 0 6 2574 110

2025-04-21 02:26:23 UTC ya make is running... (failed tests rerun, try 2)
🟢 2025-04-21 02:38:27 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
178 (only retried tests) 81 0 0 0 97

🟢 2025-04-21 02:38:34 UTC Build successful.
🟢 2025-04-21 02:38:59 UTC ydbd size 2.2 GiB changed* by +12.1 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: c590a01 merge: a79225a diff diff %
ydbd size 2 363 960 776 Bytes 2 363 973 200 Bytes +12.1 KiB +0.001%
ydbd stripped size 493 946 624 Bytes 493 948 352 Bytes +1.7 KiB +0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

@github-actions

github-actions bot commented Apr 21, 2025

2025-04-21 02:11:35 UTC Pre-commit check linux-x86_64-release-asan for a79225a has started.
2025-04-21 02:11:41 UTC Artifacts will be uploaded here
2025-04-21 02:14:31 UTC ya make is running...
🟡 2025-04-21 03:39:43 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Details

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
12883 12719 0 102 36 26

2025-04-21 03:40:48 UTC ya make is running... (failed tests rerun, try 2)
🟡 2025-04-21 03:54:45 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Details

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
230 (only retried tests) 150 0 39 16 25

2025-04-21 03:54:54 UTC ya make is running... (failed tests rerun, try 3)
🟡 2025-04-21 04:07:22 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet

Test history | Ya make output | Test bloat | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
101 (only retried tests) 43 0 33 1 24

🟢 2025-04-21 04:07:31 UTC Build successful.
🟢 2025-04-21 04:08:09 UTC ydbd size 3.9 GiB changed* by +24.4 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: c590a01 merge: a79225a diff diff %
ydbd size 4 148 273 440 Bytes 4 148 298 472 Bytes +24.4 KiB +0.001%
ydbd stripped size 1 432 101 400 Bytes 1 432 110 840 Bytes +9.2 KiB +0.001%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

@lex007in lex007in force-pushed the cleanup_tablet_gc_retry branch from 0a14d96 to 4e33302 Compare April 21, 2025 08:08
@github-actions

github-actions bot commented Apr 21, 2025

2025-04-21 08:10:38 UTC Pre-commit check linux-x86_64-relwithdebinfo for bf50e66 has started.
2025-04-21 08:10:53 UTC Artifacts will be uploaded here
2025-04-21 08:13:39 UTC ya make is running...

@github-actions

github-actions bot commented Apr 21, 2025

2025-04-21 08:10:49 UTC Pre-commit check linux-x86_64-release-asan for bf50e66 has started.
2025-04-21 08:11:05 UTC Artifacts will be uploaded here
2025-04-21 08:13:50 UTC ya make is running...
2025-04-21 09:16:19 UTC Check cancelled

@lex007in lex007in force-pushed the cleanup_tablet_gc_retry branch from 4e33302 to ff2fad2 Compare April 21, 2025 09:15
@github-actions

github-actions bot commented Apr 21, 2025

2025-04-21 09:23:12 UTC Pre-commit check linux-x86_64-relwithdebinfo for e30302a has started.
2025-04-21 09:23:28 UTC Artifacts will be uploaded here
2025-04-21 09:26:14 UTC ya make is running...
🟡 2025-04-21 10:46:08 UTC Some tests failed, follow the links below. Going to retry failed tests...

Details

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
27520 24828 0 3 2575 114

2025-04-21 10:48:27 UTC ya make is running... (failed tests rerun, try 2)
🟢 2025-04-21 11:00:34 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
181 (only retried tests) 82 0 0 0 99

🟢 2025-04-21 11:00:41 UTC Build successful.
🟢 2025-04-21 11:01:01 UTC ydbd size 2.2 GiB changed* by +12.4 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: c590a01 merge: e30302a diff diff %
ydbd size 2 363 960 776 Bytes 2 363 973 440 Bytes +12.4 KiB +0.001%
ydbd stripped size 493 946 624 Bytes 493 948 480 Bytes +1.8 KiB +0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

@github-actions

github-actions bot commented Apr 21, 2025

2025-04-21 09:24:53 UTC Pre-commit check linux-x86_64-release-asan for e30302a has started.
2025-04-21 09:25:09 UTC Artifacts will be uploaded here
2025-04-21 09:28:04 UTC ya make is running...
🟡 2025-04-21 11:06:03 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Details

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
12883 12689 0 136 32 26

2025-04-21 11:07:12 UTC ya make is running... (failed tests rerun, try 2)
🟡 2025-04-21 11:20:42 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Details

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
256 (only retried tests) 186 0 36 10 24

2025-04-21 11:20:51 UTC ya make is running... (failed tests rerun, try 3)
🟡 2025-04-21 11:33:20 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet

Test history | Ya make output | Test bloat | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
93 (only retried tests) 35 0 33 3 22

🟢 2025-04-21 11:33:28 UTC Build successful.
🟢 2025-04-21 11:34:00 UTC ydbd size 3.9 GiB changed* by +24.6 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: c6a5181 merge: e30302a diff diff %
ydbd size 4 148 273 664 Bytes 4 148 298 840 Bytes +24.6 KiB +0.001%
ydbd stripped size 1 432 101 528 Bytes 1 432 111 096 Bytes +9.3 KiB +0.001%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

@lex007in lex007in marked this pull request as ready for review April 21, 2025 10:53
@lex007in lex007in requested a review from snaury April 21, 2025 10:53
GcLogic->OnCollectGarbageResult(ev);
DataCleanupLogic->OnCollectedGarbage(OwnerCtx());

const bool needRetryFailed = DataCleanupLogic->NeedGC();
Member

@snaury snaury Apr 21, 2025

What bothers me:

  1. Retrying is useful essentially always, so why do it only when there is a GC pending in the data cleanup logic?
  2. This retry code runs every time, even when the response contains no errors, and it is not obvious that the actual retry decision is made by the TryScheduleGcRequestRetries call.

It may be worth making the retry decision inside the GcLogic->OnCollectGarbageResult call and returning, as its result, the delay after which the retry should be scheduled, if one is needed. Then all of the logic lives inside the gc logic, and no part of it ends up outside in the tablet.
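
The suggestion above could look roughly like the following minimal sketch. It is not the code from this pull request: `TGcLogicSketch`, the `bool ok` parameter, the backoff bounds, and the doubling policy are all assumptions made for illustration. The only point is that the result handler itself reports whether, and after what delay, the caller should schedule a retry.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <optional>

using TDelay = std::chrono::milliseconds;

// Assumed backoff bounds, chosen only for illustration.
constexpr TDelay MinRetryDelay{50};
constexpr TDelay MaxRetryDelay{1000};

// Hypothetical stand-in for the GC logic, not the real TExecutorGCLogic.
class TGcLogicSketch {
public:
    // Handles one collect-garbage result. Returns the delay after which the
    // caller should schedule a retry, or std::nullopt when no retry is needed.
    std::optional<TDelay> OnCollectGarbageResult(bool ok) {
        if (ok) {
            NextDelay = MinRetryDelay; // success resets the backoff
            return std::nullopt;
        }
        const TDelay delay = NextDelay;
        // Exponential backoff with an upper cap.
        NextDelay = std::min(NextDelay * 2, MaxRetryDelay);
        return delay;
    }

private:
    TDelay NextDelay = MinRetryDelay;
};
```

With this shape the tablet only has to check whether a delay was returned and arm a timer for it; it does not need to know why a retry is required.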

Collaborator Author

Here I wanted to ask DataCleanupLogic "is GC really still needed?" between OnCollectGarbageResult and TryScheduleGcRequestRetries. But if we always retry, the logic can be simplified. I'll rework it so that we always retry.

if (GcWaitFor == 0) { // all channel's GC completed
if (CommitedGcBarrier == KnownGcBarrier && TryCounter > 0) { // all GC requests succeeded and we must reset try counter
TryCounter = 0;
BackoffTimer.Reset();
Member

The logic that compares barriers and so on is rather hard to follow. I would make it more explicit:

  • In OnCollectGarbageFailure, increment an error counter for the current request
  • In OnCollectGarbageSuccess, right where CollectSent.Clear() is called, do this reset if there were no errors (and there could not have been any, because on errors CollectSent.Clear() is called and we never get there)
  • In SendCollectGarbage, reset the error counter to zero

Then this method can simply check the error counter and schedule a request if there were any errors.
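
The three bullet points above can be sketched as follows. This is an illustration under assumptions, not the merged implementation: `TChannelGcSketch`, `ErrorsInCurrentRequest`, `RequestInFlight`, and `NeedRetry` are hypothetical names, and the real `BackoffTimer` and `CollectSent` state is left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-channel state, loosely modelled on the review comment.
struct TChannelGcSketch {
    uint32_t ErrorsInCurrentRequest = 0; // errors seen since the last send
    uint32_t TryCounter = 0;             // consecutive failed attempts
    bool RequestInFlight = false;

    // A new request starts with a clean error counter.
    void SendCollectGarbage() {
        ErrorsInCurrentRequest = 0;
        RequestInFlight = true;
    }

    // A failure is recorded explicitly instead of being inferred from barriers.
    void OnCollectGarbageFailure() {
        ++ErrorsInCurrentRequest;
        ++TryCounter;
        RequestInFlight = false;
    }

    // Success implies there were no errors, so the try counter is reset here.
    void OnCollectGarbageSuccess() {
        TryCounter = 0;
        RequestInFlight = false;
    }

    // The scheduling code only has to look at the error counter.
    bool NeedRetry() const {
        return !RequestInFlight && ErrorsInCurrentRequest > 0;
    }
};
```

The trade-off raised in the reply below still applies: the counter duplicates information that can be derived from the barriers, in exchange for a check that is easier to read.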

Collaborator Author

Yes, the reset can indeed be moved into OnCollectGarbageSuccess.
I did not add a separate error counter because I considered the existing information sufficient. But it can be made explicit, yes.

}

void TExecutorGCLogic::TChannelInfo::RetryGcRequests(const TTabletStorageInfo *tabletStorageInfo, ui32 channel, ui32 generation, const TActorContext& ctx) {
RetryIsScheduled = false;
Member

I think we need to check here that this flag is set (and do nothing if it is not). We should also clear the flag if, for example, SendCollectGarbage was called before the timer fired: by then we have already sent a new request if one was needed, and any retry after that should be of the new request, not the old one.
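
A minimal sketch of the guard described in this comment, under stated assumptions: `TRetryGuardSketch`, `TimersArmed`, and `RequestsSent` are hypothetical names used only to make the behaviour observable, and the timer itself is modelled as a direct call to `RetryGcRequests`. It shows the reviewer's variant, in which a fresh `SendCollectGarbage` invalidates a pending retry.

```cpp
#include <cassert>

// Hypothetical model of the retry flag; counters stand in for real events.
struct TRetryGuardSketch {
    bool RetryIsScheduled = false;
    int TimersArmed = 0;  // how many retry timers were scheduled
    int RequestsSent = 0; // how many GC requests were sent

    // Arm at most one retry timer, however many failures arrive.
    void OnCollectGarbageFailure() {
        if (!RetryIsScheduled) {
            RetryIsScheduled = true;
            ++TimersArmed;
        }
    }

    // Sending a new request makes any pending retry obsolete.
    void SendCollectGarbage() {
        RetryIsScheduled = false;
        ++RequestsSent;
    }

    // Timer handler: do nothing unless a retry is still wanted.
    void RetryGcRequests() {
        if (!RetryIsScheduled) {
            return; // stale timer, a newer request has already been sent
        }
        SendCollectGarbage();
    }
};
```

In this model a stale timer event is simply ignored. Note that the sketch does not cancel the old timer, so an old event could still arrive after a newer failure has set the flag again; a real implementation would have to decide how to handle that case.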

Collaborator Author

Well, the logic here is that on a subsequent SendCollectGarbage followed by an error, the retry is not rescheduled (which is essentially what you are suggesting); instead, further retry requests are ignored until the first retry has been processed. This variant creates fewer EvRetryGcRequest events (roughly speaking, an older event is reused for newer errors that have occurred by the time it is processed), but retries may happen more often (because they do not depend on concurrent normal GCs).

@github-actions

github-actions bot commented Apr 24, 2025

2025-04-24 19:33:02 UTC Pre-commit check linux-x86_64-release-asan for 0769452 has started.
2025-04-24 19:33:17 UTC Artifacts will be uploaded here
2025-04-24 19:36:03 UTC ya make is running...
🟡 2025-04-24 20:55:29 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Details

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
13034 12876 0 92 42 24

2025-04-24 20:56:35 UTC ya make is running... (failed tests rerun, try 2)
🟡 2025-04-24 21:09:49 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Details

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
198 (only retried tests) 122 0 37 16 23

2025-04-24 21:09:57 UTC ya make is running... (failed tests rerun, try 3)
🟡 2025-04-24 21:22:18 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet

Test history | Ya make output | Test bloat | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
99 (only retried tests) 42 0 33 2 22

🟢 2025-04-24 21:22:25 UTC Build successful.
🟢 2025-04-24 21:22:54 UTC ydbd size 3.9 GiB changed* by +22.6 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: 5cbafaa merge: 0769452 diff diff %
ydbd size 4 136 371 664 Bytes 4 136 394 800 Bytes +22.6 KiB +0.001%
ydbd stripped size 1 432 042 264 Bytes 1 432 051 096 Bytes +8.6 KiB +0.001%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

@github-actions

github-actions bot commented Apr 24, 2025

2025-04-24 19:33:23 UTC Pre-commit check linux-x86_64-relwithdebinfo for 0769452 has started.
2025-04-24 19:33:40 UTC Artifacts will be uploaded here
2025-04-24 19:36:29 UTC ya make is running...
🟡 2025-04-24 20:51:40 UTC Some tests failed, follow the links below. Going to retry failed tests...

Details

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
27772 25076 0 4 2650 42

2025-04-24 20:54:08 UTC ya make is running... (failed tests rerun, try 2)
🟢 2025-04-24 21:14:10 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
106 (only retried tests) 74 0 0 0 32

🟢 2025-04-24 21:14:17 UTC Build successful.
🟢 2025-04-24 21:14:42 UTC ydbd size 2.2 GiB changed* by +11.4 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: 5cbafaa merge: 0769452 diff diff %
ydbd size 2 354 601 952 Bytes 2 354 613 640 Bytes +11.4 KiB +0.000%
ydbd stripped size 494 239 168 Bytes 494 240 832 Bytes +1.6 KiB +0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

@lex007in lex007in requested a review from snaury April 25, 2025 00:02
@lex007in lex007in force-pushed the cleanup_tablet_gc_retry branch from 983e2ec to 745cc6f Compare April 25, 2025 13:04
@github-actions

github-actions bot commented Apr 25, 2025

2025-04-25 13:24:40 UTC Pre-commit check linux-x86_64-relwithdebinfo for c7452c4 has started.
2025-04-25 13:25:01 UTC Artifacts will be uploaded here
2025-04-25 13:28:22 UTC ya make is running...
🟡 2025-04-25 14:46:18 UTC Some tests failed, follow the links below. Going to retry failed tests...

Details

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
27790 25117 0 3 2630 40

2025-04-25 14:48:34 UTC ya make is running... (failed tests rerun, try 2)
🟢 2025-04-25 14:59:08 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
80 (only retried tests) 53 0 0 0 27

🟢 2025-04-25 14:59:15 UTC Build successful.
🟢 2025-04-25 14:59:38 UTC ydbd size 2.2 GiB changed* by +7.1 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: 91f69c4 merge: c7452c4 diff diff %
ydbd size 2 342 094 976 Bytes 2 342 102 296 Bytes +7.1 KiB +0.000%
ydbd stripped size 492 572 032 Bytes 492 573 120 Bytes +1.1 KiB +0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

@github-actions

github-actions bot commented Apr 25, 2025

2025-04-25 13:24:47 UTC Pre-commit check linux-x86_64-release-asan for c7452c4 has started.
2025-04-25 13:25:23 UTC Artifacts will be uploaded here
2025-04-25 13:28:41 UTC ya make is running...
🟡 2025-04-25 14:44:17 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Details

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
13052 12889 0 93 44 26

2025-04-25 14:45:27 UTC ya make is running... (failed tests rerun, try 2)
🟡 2025-04-25 14:58:37 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet Going to retry failed tests...

Details

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
207 (only retried tests) 125 0 39 18 25

2025-04-25 14:58:45 UTC ya make is running... (failed tests rerun, try 3)
🟡 2025-04-25 15:11:52 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet

Test history | Ya make output | Test bloat | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
105 (only retried tests) 47 0 34 1 23

🟢 2025-04-25 15:11:59 UTC Build successful.
🟢 2025-04-25 15:12:32 UTC ydbd size 3.8 GiB changed* by +15.4 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: 48459a5 merge: c7452c4 diff diff %
ydbd size 4 120 128 952 Bytes 4 120 144 704 Bytes +15.4 KiB +0.000%
ydbd stripped size 1 428 472 952 Bytes 1 428 479 928 Bytes +6.8 KiB +0.000%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

@lex007in lex007in merged commit 4d2bcbc into ydb-platform:main Apr 25, 2025
14 checks passed
@lex007in lex007in deleted the cleanup_tablet_gc_retry branch April 25, 2025 15:34
@lex007in lex007in changed the title LocalDB: Retry failed GC requests during DataCleanup LocalDB: Retry failed GC requests Apr 25, 2025
2 participants