rebuild index for new prop #3332

sworduo · 2021-11-18T12:05:31Z

What type of PR is this?

bug
feature
enhancement

What does this PR do?

rebuild tag index with old schema version value

Which issue(s)/PR(s) this PR relates to?

#3274 (reference)

Special notes for your reviewer, ex. impact of this fix, etc:

Additional context:

Checklist：

Documentation affected (Please add the label if documentation needs to be modified.)
Incompatible (If it is incompatible, please describe it and add corresponding label.)
Need to cherry-pick (If need to cherry-pick to some branches, please label the destination version(s).)
Performance impacted: Consumes more CPU/Memory

Release notes：

Please confirm whether to reflect in release notes and how to describe:

critical27 · 2021-11-22T02:08:46Z

Thanks for the contribution. Does this PR originate from https://discuss.nebula-graph.com.cn/t/topic/6376?

sworduo · 2021-11-22T02:10:05Z

Yes

critical27

I have a suggestion to fix this issue:

Perhaps don't modify RowReader or RowReaderWrapper: RowReader is bound to a specific schema by design, and the setLatestSchema method will introduce many problems (e.g. reading data written with an old schema through the latest schema has undefined behavior).
To fix this issue, when we call reader->getValueByName in IndexKeyUtils::collectIndexValues, it returns NullType::UNKNOWN_PROP for a property that does not exist in the old schema (see line 55 in RowReaderV2.cpp). When you find that the value is unknown, you can use the latest schema to check whether the property is nullable or has a default value:
- return an error when it isn't nullable and does not have a default value
- otherwise, use the predefined value to build the index
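
The fallback described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `NullKind`, `Value`, `FieldDef`, `Schema`, `Row`, and `resolveIndexValue` are simplified, hypothetical stand-ins and not Nebula's real types or functions, and preferring the default value over a plain null is an assumed ordering rather than something the suggestion specifies.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <variant>

// Simplified stand-ins for illustration; none of these are Nebula's real types.
enum class NullKind { kNull, kUnknownProp };
using Value = std::variant<NullKind, long, std::string>;

struct FieldDef {
  bool nullable = false;
  std::optional<Value> defaultValue;
};

using Schema = std::map<std::string, FieldDef>;  // property name -> definition
using Row = std::map<std::string, Value>;        // row written under an older schema

// Mimics a reader lookup: a property missing from the row yields kUnknownProp.
Value getValueByName(const Row& row, const std::string& name) {
  auto it = row.find(name);
  return it == row.end() ? Value{NullKind::kUnknownProp} : it->second;
}

// Returns std::nullopt when no index value can be built (the "return error" case).
std::optional<Value> resolveIndexValue(const Row& row,
                                       const std::string& name,
                                       const Schema* latestSchema) {
  Value value = getValueByName(row, name);
  const auto* null = std::get_if<NullKind>(&value);
  bool unknown = null != nullptr && *null == NullKind::kUnknownProp;
  // Anything other than "unknown property" is returned untouched.
  if (latestSchema == nullptr || !unknown) return value;

  auto it = latestSchema->find(name);
  if (it == latestSchema->end()) return std::nullopt;
  if (it->second.defaultValue) return *it->second.defaultValue;  // assumed priority
  if (it->second.nullable) return Value{NullKind::kNull};
  return std::nullopt;  // neither nullable nor defaulted
}
```

With a row that lacks `age`, a latest schema that gives `age` a default would yield that default, while a non-nullable property without a default yields `std::nullopt`.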

sworduo · 2021-11-22T02:40:02Z

1. The schema used to read data is the schema included in the row data, not the latest schema. The latest schema is only used in IndexKeyUtils::collectIndexValues when the schema included in the data differs from the latest schema. Besides, since an index is deleted when its associated prop is deleted, I'm not sure what problem this would introduce.

2. In this case, a new parameter named latestSchema needs to be added to IndexKeyUtils::collectIndexValues, and every call site of this function needs to be modified as well. Of course, that's OK.

sworduo · 2021-11-22T02:42:04Z

By the way, how can I run the pre-commit checks myself before committing?

critical27 · 2021-11-22T02:45:41Z

Do you mean the lint check failed? You need to run git clang-format.

critical27 · 2021-11-22T03:06:14Z

Perhaps I wasn't clear. The reason I don't suggest adding setLatestSchema is that we don't need it (RowReader should not hold the latest schema). Also, calling schemaMan->getLatestTagSchemaVersion on every getTagPropReader and getEdgePropReader call could cause a performance issue. The latest schema only needs to be fetched once.

As for the new parameter named latestSchema that you mentioned, either give it a default value of nullptr, or add an overloaded version of IndexKeyUtils::collectIndexValues that takes 3 parameters. Both ways LGTM.
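
As a rough sketch of the default-nullptr option combined with fetching the latest schema only once, something like the following could work. All names here (`Schema`, `SchemaManager`, `getLatest`, `Row`, `rebuildIndex`, and this toy `collectIndexValues`) are hypothetical simplifications for illustration, not Nebula's actual signatures; the returned strings merely mark which path was taken.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins, not Nebula's real classes.
struct Schema {
  int version = 0;
};

struct SchemaManager {
  Schema latest;
  mutable int lookups = 0;  // counts how often the latest schema is requested
  const Schema* getLatest() const {
    ++lookups;
    return &latest;
  }
};

struct Row {
  int schemaVersion = 0;
  std::string payload;
};

// The trailing parameter defaults to nullptr, so existing two-argument
// callers keep compiling and keep their old behavior.
std::string collectIndexValues(const Row& row,
                               const std::string& prop,
                               const Schema* latestSchema = nullptr) {
  bool stale = latestSchema != nullptr && row.schemaVersion != latestSchema->version;
  return prop + (stale ? ":fallback" : ":direct");
}

// Rebuild loop: the latest schema is looked up once, before iterating the rows,
// instead of once per reader or per row.
std::vector<std::string> rebuildIndex(const SchemaManager& schemaMan,
                                      const std::vector<Row>& rows,
                                      const std::string& prop) {
  const Schema* latest = schemaMan.getLatest();
  std::vector<std::string> keys;
  keys.reserve(rows.size());
  for (const auto& row : rows) {
    keys.push_back(collectIndexValues(row, prop, latest));
  }
  return keys;
}
```

The overload alternative would differ only in declaring a separate three-parameter function instead of the default argument; the single lookup in the loop would look the same.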

sworduo · 2021-11-22T03:18:36Z

That's OK. The point is to avoid fetching the latest schema many times.

critical27

Good job, only one thing to check, please see the inline comments.

PS: some lint checks failed; please fix them according to https://github.com/vesoft-inc/nebula/runs/4319257516?check_suite_focus=true

critical27 · 2021-11-25T02:38:21Z

src/common/utils/IndexKeyUtils.cpp

+                                       const std::string propName,
+                                       const meta::SchemaProviderIf* latestSchema) {
+  auto value = reader->getValueByName(propName);
+  if (latestSchema == nullptr


Perhaps one more condition is needed here to check whether the value is a "Null" (literally; sorry about the naming). You only handle UNKNOWN_PROP.

See the code in RowReader

sworduo · 2021-11-25

The NullType::NULL case is handled by IndexKeyUtils::checkValue, so maybe I don't need to handle it in the function readValueWithLatestSche.

critical27 · 2021-11-25

I mean that if the value in RowReader is NullType::NULL, we should return it directly. What do you think?

critical27 · 2021-11-25

Ah, my bad, I see the point now (any value that is not UNKNOWN_PROP is returned as is). Never mind.

critical27

LGTM now. Thanks a lot.

critical27 · 2021-11-25T06:34:52Z

Hmm, the code format check failed again. You could try reading the conduct.

sworduo · 2021-11-25T07:19:50Z

Thanks, I will try it.

yixinglu · 2021-11-26T05:12:50Z

Hi @sworduo, you can look up the README in the tests directory to fix the feature format lint error.

sworduo · 2021-11-29T13:12:46Z

Sorry about the format error. I will fix it tomorrow.

sworduo · 2021-11-30T03:04:06Z

Hi @yixinglu, when I run make fmt in the tests directory, the result is:

All done! 💥 💔 💥
136 files reformatted, 3 files left unchanged, 8 files failed to reformat.

A lot of files have been reformatted, and the changes look like this:

     When executing query:
-      """
-      CREATE TAG tag1()
-      """
+    """
+    CREATE TAG tag1()
+    """
     Then the execution should be successful
     # if not exists
     When executing query:
-      """
-      CREATE TAG IF NOT EXISTS tag1()
-      """
+    """
+    CREATE TAG IF NOT EXISTS tag1()
+    """

However, it failed to reformat Index.feature. The error is:

Error: cannot format /data/luoshangjun/source/nebula/tests/tck/features/index/Index.feature: INTERNAL ERROR: Invalid file contents are produced:
Parser errors:
(939:2): expected: #EOF, #TableRow, #StepLine, #TagLine, #ScenarioLine, #ScenarioOutlineLine, #Comment, #Empty, got '`name`(64)'
(940:1): expected: #EOF, #TableRow, #StepLine, #TagLine, #ScenarioLine, #ScenarioOutlineLine, #Comment, #Empty, got ')" |'
(938:7): inconsistent cell count within the table
(948:2): expected: #EOF, #TableRow, #StepLine, #TagLine, #ScenarioLine, #ScenarioOutlineLine, #Comment, #Empty, got '`age`'
(949:1): expected: #EOF, #TableRow, #StepLine, #TagLine, #ScenarioLine, #ScenarioOutlineLine, #Comment, #Empty, got ')" |'
(947:7): inconsistent cell count within the table
Please report a bug on https://github.com/ducminh-phan/reformat-gherkin/issues.
This invalid output might be helpful:
/tmp/rfmt-ghk_hr7pzs20.log

I also ran make check-and-diff -C tests, which produced the same results. I have no idea how to format it.

critical27 · 2021-12-02T04:13:44Z

You probably need to rebase onto the latest master, which should work fine.

sworduo · 2021-12-02T09:03:04Z

I have rebased onto the latest master, but it produces the same results. Fortunately, formatting succeeds in Docker...

cangfengzhs · 2021-12-27T08:56:02Z

LGTM and please deal with the conflict.

sworduo · 2021-12-28T02:41:41Z

That's OK. I have resolved the conflict on my machine. However, I cannot upgrade to third-party 3.0 to compile the code. The details are here: #3462 (comment)

Sophie-Xie · 2021-12-28T08:36:03Z

Can you push your code and try compiling it with the repo's CI first? 😂

Sophie-Xie added the ready-for-testing PR: ready for the CI test label Nov 18, 2021

Sophie-Xie requested review from yixinglu and jievince November 19, 2021 01:05

critical27 reviewed Nov 22, 2021

critical27 added the community Source: who proposed the issue label Nov 22, 2021

critical27 reviewed Nov 25, 2021

critical27 previously approved these changes Nov 25, 2021

sworduo dismissed critical27’s stale review via 6be4108 November 25, 2021 09:42

Sophie-Xie linked an issue Nov 26, 2021 that may be closed by this pull request

rebuild tag index with old schema version value #3274

Closed

Sophie-Xie requested review from cangfengzhs and removed request for yixinglu and jievince December 21, 2021 09:32

Sophie-Xie added the ready for review label Dec 23, 2021

rebuild index for new prop

53cd4d2

Sophie-Xie added 2 commits December 28, 2021 19:15

Merge branch 'master' into rebuildIndexForNewProp

1a8dff0

Merge branch 'master' into rebuildIndexForNewProp

a52ff18

critical27 approved these changes Dec 28, 2021

Merge branch 'master' into rebuildIndexForNewProp

ace3d7c

yixinglu approved these changes Dec 28, 2021

yixinglu merged commit 4db974b into vesoft-inc:master Dec 28, 2021

sworduo deleted the rebuildIndexForNewProp branch December 29, 2021 02:01

jamieliu1023 mentioned this pull request Jan 1, 2022

Weekly Report 2021-12-31 vesoft-inc/nebula-community#83

Closed

sworduo commented Nov 18, 2021 • edited by SuperYoko
