Add reinit step for build vector index #9176

Merged
merged 7 commits into from
Sep 20, 2024

@MBkkt MBkkt commented Sep 12, 2024

The main loop of the vector index build looks like this:

shards = mainTable shards;
for (int level = 0; level < neededLevels; ++level) {
    outputTable = postingTable;
    if ((level + 1) != neededLevels) {
        outputTable = tmp level % 2 table;
        outputTable.clear();
    }
    for (int parent = 0; parent < k^level; ++parent) {
        if (parent is local for some shard) {
            run LocalKmeans
        } else {
            // for all shards responsible for this parent
            run SampleK
            n times run RecomputeEmbeddings
            run ReshuffleEmbeddings
        }
    }
    shards = outputTable shards;
}
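
To make the inner loop concrete, here is a minimal C++ sketch of which steps one level schedules. It assumes that `k^level` in the pseudocode means k raised to the power `level` (not XOR). The names `ParentsAtLevel`, `PlanLevel`, `recomputeRounds` and the `isLocal` predicate are hypothetical, introduced only for illustration; they are not taken from the PR.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: number of parent clusters at a level, i.e. k^level.
std::uint64_t ParentsAtLevel(std::uint64_t k, int level) {
    std::uint64_t n = 1;
    for (int i = 0; i < level; ++i) {
        n *= k;
    }
    return n;
}

// Hypothetical planner: lists the steps the pseudocode would run for one level.
// isLocal(parent) is an assumed predicate: true when a single shard holds the parent.
std::vector<std::string> PlanLevel(std::uint64_t k, int level, int recomputeRounds,
                                   const std::function<bool(std::uint64_t)>& isLocal) {
    std::vector<std::string> steps;
    const std::uint64_t parents = ParentsAtLevel(k, level);
    for (std::uint64_t parent = 0; parent < parents; ++parent) {
        if (isLocal(parent)) {
            // The whole parent fits in one shard: cluster it in place.
            steps.push_back("LocalKmeans");
        } else {
            // The parent spans several shards: sample, refine, then redistribute.
            steps.push_back("SampleK");
            for (int i = 0; i < recomputeRounds; ++i) {
                steps.push_back("RecomputeEmbeddings");
            }
            steps.push_back("ReshuffleEmbeddings");
        }
    }
    return steps;
}
```

For example, with k = 2 at level 1 there are two parents; if only parent 0 is local and two recompute rounds are requested, the plan has one `LocalKmeans` step followed by the four distributed steps for parent 1.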

This PR implements this part:

for (int level = 0; level < neededLevels; ++level) {
    outputTable = postingTable;
    if ((level + 1) != neededLevels) {
        outputTable = tmp level % 2 table;
        outputTable.clear();
    }

This is about implementing the implicit cycle over levels and clearing the temporary output tables.
We also know the expected shard count of the temporary tables very well, so we specify their partitioning as well.
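
The reinit step described above could be sketched roughly as follows. This is an in-memory illustration only, not the code from this PR: `Table`, `BuildState`, `ReinitLevel` and the `expectedShards` parameter are hypothetical names, and how the expected shard count is actually derived is left to the caller because the description does not say.

```cpp
#include <array>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a table: a name, some rows, and a partition count.
struct Table {
    explicit Table(std::string n) : name(std::move(n)) {}
    void Clear() { rows.clear(); }

    std::string name;
    std::vector<int> rows;
    std::uint32_t partitions = 1;
};

// Hypothetical build state: the final posting table plus two temporary tables.
struct BuildState {
    Table posting{"posting"};
    std::array<Table, 2> tmp{Table("tmp0"), Table("tmp1")};
};

// Picks the output table for a level. On every level except the last, the
// output is the temporary table selected by level % 2; it is cleared and
// given the expected partition count before the level starts writing to it.
Table& ReinitLevel(BuildState& state, int level, int neededLevels,
                   std::uint32_t expectedShards) {
    if (level + 1 == neededLevels) {
        return state.posting;  // the last level writes the final posting table
    }
    Table& out = state.tmp[level % 2];
    out.Clear();                      // drop rows left over from two levels ago
    out.partitions = expectedShards;  // assumed to be supplied by the caller
    return out;
}
```

Alternating between two temporary tables means that a level can read the previous level's output while writing to the other table, which is why only the table about to be written needs clearing.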

@MBkkt MBkkt requested a review from ijon September 12, 2024 17:02

@MBkkt MBkkt mentioned this pull request Sep 13, 2024
17 tasks
@MBkkt MBkkt self-assigned this Sep 13, 2024
@MBkkt MBkkt changed the title from "Add reinit step for build index" to "Add reinit step for build vector index" Sep 13, 2024

@MBkkt MBkkt force-pushed the mbkkt/vector-index-build-main-loop branch from 6e4138e to 751776b Compare September 16, 2024 15:41

github-actions bot commented Sep 17, 2024

2024-09-17 11:01:04 UTC Pre-commit check linux-x86_64-release-asan for d595e3f has started.
2024-09-17 11:01:08 UTC Artifacts will be uploaded here
2024-09-17 11:04:04 UTC ya make is running...
🔴 2024-09-17 13:03:45 UTC Some tests failed, follow the links below.

Test history | Ya make output

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| --- | --- | --- | --- | --- | --- |
| 10541 | 10123 | 0 | 69 | 304 | 45 |

🟢 2024-09-17 13:04:43 UTC Build successful.
🟢 2024-09-17 13:05:21 UTC ydbd size 5.6 GiB changed* by +97.1 KiB, which is < 100.0 KiB vs main: OK

| ydbd size dash | main: 238a753 | merge: d595e3f | diff | diff % |
| --- | --- | --- | --- | --- |
| ydbd size | 6 045 711 792 Bytes | 6 045 811 208 Bytes | +97.1 KiB | +0.002% |
| ydbd stripped size | 1 513 156 368 Bytes | 1 513 179 600 Bytes | +22.7 KiB | +0.002% |

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

github-actions bot commented Sep 17, 2024

2024-09-17 11:01:09 UTC Pre-commit check linux-x86_64-relwithdebinfo for d595e3f has started.
2024-09-17 11:01:18 UTC Artifacts will be uploaded here
2024-09-17 11:04:10 UTC ya make is running...
🟡 2024-09-17 12:13:24 UTC Some tests failed, follow the links below. Going to retry failed tests...

Test history | Ya make output

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| --- | --- | --- | --- | --- | --- |
| 39856 | 34153 | 0 | 1 | 5665 | 37 |

2024-09-17 12:16:48 UTC ya make is running... (failed tests rerun, try 2)
🟢 2024-09-17 12:27:29 UTC Tests successful.

Test history | Ya make output

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| --- | --- | --- | --- | --- | --- |
| 275 (only retried tests) | 179 | 0 | 0 | 7 | 89 |

🟢 2024-09-17 12:27:39 UTC Build successful.
🟡 2024-09-17 12:28:16 UTC ydbd size 8.4 GiB changed* by +710.9 KiB, which is >= 100.0 KiB vs main: Warning

| ydbd size dash | main: 8dfd693 | merge: d595e3f | diff | diff % |
| --- | --- | --- | --- | --- |
| ydbd size | 9 030 406 616 Bytes | 9 031 134 624 Bytes | +710.9 KiB | +0.008% |
| ydbd stripped size | 488 733 032 Bytes | 488 772 264 Bytes | +38.3 KiB | +0.008% |

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

github-actions bot commented Sep 17, 2024

2024-09-17 11:01:09 UTC Pre-commit check linux-x86_64-release-clang14 for d595e3f has started.
2024-09-17 11:01:19 UTC Artifacts will be uploaded here
2024-09-17 11:04:12 UTC ya make is running...
🟢 2024-09-17 11:21:48 UTC Build successful.

@MBkkt MBkkt requested a review from ijon September 20, 2024 11:46

github-actions bot commented Sep 20, 2024

2024-09-20 12:07:22 UTC Pre-commit check linux-x86_64-release-clang14 for 711aae9 has started.
2024-09-20 12:08:05 UTC Artifacts will be uploaded here
2024-09-20 12:11:36 UTC ya make is running...
🟢 2024-09-20 12:56:41 UTC Build successful.

github-actions bot commented Sep 20, 2024

2024-09-20 12:07:26 UTC Pre-commit check linux-x86_64-relwithdebinfo for 711aae9 has started.
2024-09-20 12:07:30 UTC Artifacts will be uploaded here
2024-09-20 12:10:42 UTC ya make is running...
🟡 2024-09-20 13:51:48 UTC Some tests failed, follow the links below. Going to retry failed tests...

Test history | Ya make output

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| --- | --- | --- | --- | --- | --- |
| 41274 | 35717 | 0 | 4 | 5456 | 97 |

2024-09-20 13:55:19 UTC ya make is running... (failed tests rerun, try 2)
🟢 2024-09-20 14:05:52 UTC Tests successful.

Test history | Ya make output

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| --- | --- | --- | --- | --- | --- |
| 110 (only retried tests) | 22 | 0 | 0 | 0 | 88 |

🟢 2024-09-20 14:05:59 UTC Build successful.
🟡 2024-09-20 14:06:42 UTC ydbd size 8.4 GiB changed* by +156.1 KiB, which is >= 100.0 KiB vs main: Warning

| ydbd size dash | main: 1d31583 | merge: 711aae9 | diff | diff % |
| --- | --- | --- | --- | --- |
| ydbd size | 9 042 683 432 Bytes | 9 042 843 248 Bytes | +156.1 KiB | +0.002% |
| ydbd stripped size | 489 300 104 Bytes | 489 306 120 Bytes | +5.9 KiB | +0.001% |

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

github-actions bot commented Sep 20, 2024

2024-09-20 12:17:11 UTC Pre-commit check linux-x86_64-release-asan for 711aae9 has started.
2024-09-20 12:17:51 UTC Artifacts will be uploaded here
2024-09-20 12:21:57 UTC ya make is running...
🔴 2024-09-20 14:50:20 UTC Some tests failed, follow the links below.

Test history | Ya make output

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| --- | --- | --- | --- | --- | --- |
| 11904 | 11506 | 0 | 65 | 282 | 51 |

🟢 2024-09-20 14:51:25 UTC Build successful.
🟡 2024-09-20 14:52:02 UTC ydbd size 5.6 GiB changed* by +142.0 KiB, which is >= 100.0 KiB vs main: Warning

| ydbd size dash | main: 1d31583 | merge: 711aae9 | diff | diff % |
| --- | --- | --- | --- | --- |
| ydbd size | 6 061 096 568 Bytes | 6 061 242 016 Bytes | +142.0 KiB | +0.002% |
| ydbd stripped size | 1 516 186 096 Bytes | 1 516 213 104 Bytes | +26.4 KiB | +0.002% |

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

2 participants