Fix: vector index description wasn't persisted #11969

Merged
merged 5 commits into from
Nov 27, 2024

Conversation

MBkkt
Collaborator

@MBkkt MBkkt commented Nov 25, 2024

No description provided.

@MBkkt MBkkt requested a review from ijon November 25, 2024 17:14

@MBkkt MBkkt force-pushed the mbkkt/fix-vector-index-desc branch from 6ffde9b to 6964950 Compare November 25, 2024 17:33

github-actions bot commented Nov 25, 2024

2024-11-25 20:01:37 UTC Pre-commit check linux-x86_64-release-asan for 0ab8793 has started.
2024-11-25 20:01:47 UTC Artifacts will be uploaded here
2024-11-25 20:04:53 UTC ya make is running...
🟡 2024-11-25 21:40:03 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
9146 9067 0 20 13 46

🟢 2024-11-25 21:40:49 UTC Build successful.
🟢 2024-11-25 21:41:18 UTC ydbd size 4.9 GiB changed* by +89.3 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: 7644097 merge: 0ab8793 diff diff %
ydbd size 5 285 400 112 Bytes 5 285 491 512 Bytes +89.3 KiB +0.002%
ydbd stripped size 1 360 266 704 Bytes 1 360 289 200 Bytes +22.0 KiB +0.002%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

github-actions bot commented Nov 25, 2024

2024-11-25 20:04:57 UTC Pre-commit check linux-x86_64-relwithdebinfo for 0ab8793 has started.
2024-11-25 20:05:32 UTC Artifacts will be uploaded here
2024-11-25 20:09:08 UTC ya make is running...
🟢 2024-11-25 21:15:31 UTC Tests successful.

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
24890 22059 0 0 2719 112

🟢 2024-11-25 21:17:41 UTC Build successful.
🟢 2024-11-25 21:18:00 UTC ydbd size 2.5 GiB changed* by +83.3 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: a328b30 merge: 0ab8793 diff diff %
ydbd size 2 690 474 568 Bytes 2 690 559 848 Bytes +83.3 KiB +0.003%
ydbd stripped size 482 239 248 Bytes 482 249 136 Bytes +9.7 KiB +0.002%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

@MBkkt MBkkt self-assigned this Nov 26, 2024
@@ -35,6 +35,7 @@ NKikimrSchemeOp::TModifyScheme CreateIndexTask(NKikimr::NSchemeShard::TTableInde
operation->SetName(dst.LeafName());

operation->SetType(indexInfo->Type);
Y_ENSURE(indexInfo->Type != NKikimrSchemeOp::EIndexType::EIndexTypeGlobalVectorKmeansTree);
Collaborator

@ijon ijon Nov 26, 2024

Y_ENSURE throws an exception, and that is not appropriate for the schemeshard; at the very least, exceptions are not used there. (And there is little sense in introducing a new way of reacting to violated code assumptions as a side effect of a bugfix.)

Collaborator Author

There is a problem: I don't like the idea of the schemeshard aborting if something is written incorrectly and we pass the wrong index type in some schemeshard operation. I would prefer to abort that operation instead.

Do you have any suggestions on what I can replace this "ensure" code with?

Collaborator Author

Fixed
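
The thread above does not say what the check was replaced with. As a minimal sketch of the alternative being discussed (reject the single operation instead of throwing or aborting the process), the check can return an error that the caller turns into a rejected result. Every name here (`EIndexType`, `EStatus`, `TOperationResult`, `CheckIndexType`, `CreateIndexTaskSketch`) is a hypothetical stand-in and not the schemeshard API.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins for the real enum and result types.
enum class EIndexType { Global, GlobalAsync, GlobalVectorKmeansTree };
enum class EStatus { Accepted, Rejected };

struct TOperationResult {
    EStatus Status = EStatus::Accepted;
    std::string Reason;
};

// Reports an unsupported index type as a value instead of throwing.
std::optional<std::string> CheckIndexType(EIndexType type) {
    if (type == EIndexType::GlobalVectorKmeansTree) {
        return "vector index is not supported by this operation";
    }
    return std::nullopt;
}

// The caller rejects only this operation; nothing is thrown or aborted.
TOperationResult CreateIndexTaskSketch(EIndexType type) {
    if (auto error = CheckIndexType(type)) {
        return {EStatus::Rejected, *error};
    }
    return {};
}
```

With this shape, a wrong index type fails one request with a readable reason, which is the behaviour the author asked for, without introducing exceptions.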

@@ -1043,9 +1044,10 @@ struct Schema : NIceDb::Schema {
struct AlterVersion : Column<3, NScheme::NTypeIds::Uint64> {};
struct IndexType : Column<4, NScheme::NTypeIds::Uint32> { using Type = NKikimrSchemeOp::EIndexType; static constexpr Type Default = NKikimrSchemeOp::EIndexTypeInvalid; };
struct State : Column<5, NScheme::NTypeIds::Uint32> { using Type = NKikimrSchemeOp::EIndexState; static constexpr Type Default = NKikimrSchemeOp::EIndexStateInvalid; };
struct Description : Column<6, NScheme::NTypeIds::String> {};
Collaborator

Please add a tail comment with the type that is stored in that field.

Not all fields that are intended to store serialized protobuf messages are marked in this way, but many are, and that is very helpful.

Collaborator Author

@MBkkt MBkkt Nov 26, 2024

Fixed
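
For illustration only, the requested tail comment could look like the sketch below. `Column`, `ETypeId`, and `TableIndexSketch` are simplified stand-ins for the real schema helpers, and `TMyIndexDescription` is a placeholder: the thread does not name the protobuf message actually stored in the column.

```cpp
#include <cstdint>

// Simplified stand-ins for the schema helpers; not the NIceDb API.
enum class ETypeId : std::uint32_t { Uint32, Uint64, String };

template <std::uint32_t Id, ETypeId Type>
struct Column {
    static constexpr std::uint32_t ColumnId = Id;
    static constexpr ETypeId ColumnType = Type;
};

struct TableIndexSketch {
    struct AlterVersion : Column<3, ETypeId::Uint64> {};
    // The tail comment names the message serialized into the raw string column.
    struct Description : Column<6, ETypeId::String> {};  // TMyIndexDescription (placeholder name)
};
```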

@@ -1007,7 +1008,8 @@ struct TSchemeShard::TTxInit : public TTransactionBase<TSchemeShard> {
}
while (!rowSet.EndOfSet()) {
const auto pathId = Self->MakeLocalId(TLocalPathId(rowSet.GetValue<Schema::TableIndex::PathId>()));
- tableIndexes.push_back(MakeTableIndexRec<Schema::TableIndex>(pathId, rowSet));
+ auto& back = tableIndexes.emplace_back(MakeTableIndexRec<Schema::TableIndex>(pathId, rowSet));
+ std::get<4>(back) = rowSet.GetValue<Schema::TableIndex::Description>();
Collaborator

I would have expected either different variants of MakeTableIndexRec() for different numbers of columns, or a single variadic MakeTableIndexRec(),
but not special treatment like this.

Collaborator Author

@MBkkt MBkkt Nov 26, 2024

I think the current code is more compact, but OK, I will change it.
Changed.
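
The "single variadic MakeTableIndexRec()" option from this thread can be sketched as follows. This is not the code that was merged: `TFakeRow`, the column tags, and `MakeRecSketch` are hypothetical, and the real row set is read through `GetValue<Column>()` rather than a static `Get`.

```cpp
#include <cstdint>
#include <string>
#include <tuple>

// Hypothetical row with three of the stored columns.
struct TFakeRow {
    std::uint64_t AlterVersion = 0;
    std::uint32_t IndexType = 0;
    std::string Description;
};

// Column tags: each one knows its value type and how to read itself from a row.
struct TAlterVersionCol {
    using Type = std::uint64_t;
    static Type Get(const TFakeRow& row) { return row.AlterVersion; }
};
struct TIndexTypeCol {
    using Type = std::uint32_t;
    static Type Get(const TFakeRow& row) { return row.IndexType; }
};
struct TDescriptionCol {
    using Type = std::string;
    static Type Get(const TFakeRow& row) { return row.Description; }
};

// One variadic builder: the caller lists the columns it needs,
// so an extra column does not need a std::get<N> patch afterwards.
template <class... TCols>
std::tuple<typename TCols::Type...> MakeRecSketch(const TFakeRow& row) {
    return std::tuple<typename TCols::Type...>(TCols::Get(row)...);
}
```

A table without a description column would call `MakeRecSketch<TAlterVersionCol, TIndexTypeCol>(row)`, and a table with one would add `TDescriptionCol` to the list.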

@@ -2371,11 +2371,16 @@ struct TTableIndexInfo : public TSimpleRefCount<TTableIndexInfo> {
using EType = NKikimrSchemeOp::EIndexType;
using EState = NKikimrSchemeOp::EIndexState;

- TTableIndexInfo(ui64 version, EType type, EState state)
+ TTableIndexInfo(ui64 version, EType type, EState state, std::string_view description)
Collaborator

Is there a reason to use std::string_view here?

Collaborator Author

To my mind, std::string_view is generally the best way to pass something that would otherwise be a const string reference.
Also, ParseFromString takes a string_view (the Abseil one, but it is essentially the same):

bool ParseFromString(y_absl::string_view data);

Do you have some other preference? What and why?

Collaborator

const TString&.
We don't store the original argument value in the object, and we don't construct TTableIndexInfo from anything but a TString value, so string_view doesn't bring any benefit here.
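
A small sketch of the reviewer's point, using `std::string` in place of `TString` and a hypothetical `TFakeMessage` in place of a protobuf message: when every caller already holds a string and the argument is only parsed, never stored, a const reference parameter still feeds a parser that takes a string view.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in for a protobuf message with a view-taking parser.
struct TFakeMessage {
    std::string Data;
    bool ParseFromString(std::string_view data) {
        Data.assign(data);
        return true;
    }
};

struct TIndexInfoSketch {
    // The argument is parsed here and not kept, so a const reference is enough;
    // std::string converts to std::string_view implicitly at the call below.
    explicit TIndexInfoSketch(const std::string& description) {
        if (!description.empty()) {
            Parsed = Message.ParseFromString(description);
        }
    }

    TFakeMessage Message;
    bool Parsed = false;
};
```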

@@ -2391,8 +2396,20 @@ struct TTableIndexInfo : public TSimpleRefCount<TTableIndexInfo> {
return result;
}

TString DescriptionToStr() const {
return std::visit([](const auto& v) {
if constexpr (requires { v.SerializeAsString(); }) {
Collaborator

Can you explain why this check is necessary? I would expect all alternatives of SpecializedIndexDescription to be protobuf message types.

Collaborator Author

We expect std::monostate for global secondary indexes, but I agree that it may be better to check for that explicitly instead.
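
The explicit check mentioned in the reply could look roughly like this. It is a sketch under assumptions, not the merged code: `TFakeVectorDescription` stands in for a protobuf message, `TDescriptionSketch` for the real variant, and `SerializeSketch` is a hypothetical free function.

```cpp
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical stand-in for a protobuf message; only SerializeAsString() is modeled.
struct TFakeVectorDescription {
    std::string Payload;
    std::string SerializeAsString() const { return Payload; }
};

// monostate models an index that has no specialized description.
using TDescriptionSketch = std::variant<std::monostate, TFakeVectorDescription>;

std::string SerializeSketch(const TDescriptionSketch& description) {
    return std::visit([](const auto& value) -> std::string {
        using T = std::decay_t<decltype(value)>;
        if constexpr (std::is_same_v<T, std::monostate>) {
            return {};  // nothing to persist for this alternative
        } else {
            // Any other alternative must provide SerializeAsString(),
            // otherwise this branch fails to compile.
            return value.SerializeAsString();
        }
    }, description);
}
```

Compared with the `requires { v.SerializeAsString(); }` probe, checking for `std::monostate` by name means a future alternative without a serializer is a compile error rather than a silently empty string.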

@@ -2391,8 +2396,20 @@ struct TTableIndexInfo : public TSimpleRefCount<TTableIndexInfo> {
return result;
}

TString DescriptionToStr() const {
Collaborator

ToStr has a misleading connotation. In our codebase it does not mean "encode a value to a sequence of bytes" but rather "make something printable for a person to read".
I would recommend something more accurate, such as SerializedDescription.

Collaborator Author

OK, changed to SerializeDescription rather than SerializedDescription, because I want it to be clear that it is not just a getter.

github-actions bot commented Nov 26, 2024

2024-11-26 19:19:52 UTC Pre-commit check linux-x86_64-release-asan for 9b00599 has started.
2024-11-26 19:20:16 UTC Artifacts will be uploaded here
2024-11-26 19:23:10 UTC ya make is running...
2024-11-26 20:21:58 UTC Check cancelled

github-actions bot commented Nov 26, 2024

2024-11-26 19:20:33 UTC Pre-commit check linux-x86_64-relwithdebinfo for 9b00599 has started.
2024-11-26 19:20:44 UTC Artifacts will be uploaded here
2024-11-26 19:23:40 UTC ya make is running...
🟡 2024-11-26 20:21:02 UTC Some tests failed, follow the links below. Going to retry failed tests...

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
15515 14004 0 2 1396 113

🟢 2024-11-26 20:22:06 UTC ydbd size 2.5 GiB changed* by +39.5 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: f4d6825 merge: 9b00599 diff diff %
ydbd size 2 694 619 000 Bytes 2 694 659 408 Bytes +39.5 KiB +0.001%
ydbd stripped size 482 600 784 Bytes 482 604 560 Bytes +3.7 KiB +0.001%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation
2024-11-26 20:22:07 UTC Check cancelled

github-actions bot commented Nov 26, 2024

2024-11-26 20:22:52 UTC Pre-commit check linux-x86_64-release-asan for 4ddb7dc has started.
2024-11-26 20:23:14 UTC Artifacts will be uploaded here
2024-11-26 20:26:08 UTC ya make is running...
🟡 2024-11-26 21:32:36 UTC Some tests failed, follow the links below. This fail is not in blocking policy yet

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
9105 9023 0 20 9 53

🟢 2024-11-26 21:33:22 UTC Build successful.
🟢 2024-11-26 21:33:50 UTC ydbd size 4.9 GiB changed* by +78.0 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: f4d6825 merge: 4ddb7dc diff diff %
ydbd size 5 293 855 752 Bytes 5 293 935 600 Bytes +78.0 KiB +0.002%
ydbd stripped size 1 361 407 056 Bytes 1 361 420 880 Bytes +13.5 KiB +0.001%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

github-actions bot commented Nov 26, 2024

2024-11-26 20:25:50 UTC Pre-commit check linux-x86_64-relwithdebinfo for 4ddb7dc has started.
2024-11-26 20:26:01 UTC Artifacts will be uploaded here
2024-11-26 20:28:57 UTC ya make is running...
🟡 2024-11-26 21:20:28 UTC Some tests failed, follow the links below. Going to retry failed tests...

Test history | Ya make output | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
15516 14007 0 1 1395 113

2024-11-26 21:21:45 UTC ya make is running... (failed tests rerun, try 2)
🟢 2024-11-26 21:33:34 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
112 (only retried tests) 6 0 0 0 106

🟢 2024-11-26 21:33:41 UTC Build successful.
🟢 2024-11-26 21:34:00 UTC ydbd size 2.5 GiB changed* by +45.5 KiB, which is < 100.0 KiB vs main: OK

ydbd size dash main: f4d6825 merge: 4ddb7dc diff diff %
ydbd size 2 694 619 000 Bytes 2 694 665 584 Bytes +45.5 KiB +0.002%
ydbd stripped size 482 600 784 Bytes 482 605 712 Bytes +4.8 KiB +0.001%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

@MBkkt MBkkt merged commit cb1675a into ydb-platform:main Nov 27, 2024
10 checks passed
@MBkkt MBkkt mentioned this pull request Jan 16, 2025
17 tasks