fix deserialize bug for nebula NULL data #74

Merged: 6 commits, Sep 28, 2022

Conversation

liuxiaocs7
Contributor

Tries to fix #73.

@codecov-commenter

Codecov Report

Base: 61.54% // Head: 61.54% // No change to project coverage 👍

Coverage data is based on head (c757541) compared to base (6bb4a2e).
Patch coverage: 0.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff            @@
##             master      #74   +/-   ##
=========================================
  Coverage     61.54%   61.54%           
  Complexity      291      291           
=========================================
  Files            52       52           
  Lines          1784     1784           
  Branches        166      166           
=========================================
  Hits           1098     1098           
  Misses          596      596           
  Partials         90       90           
Impacted Files Coverage Δ
...connector/nebula/table/NebulaRowDataConverter.java 59.52% <0.00%> (ø)


@liuxiaocs7
Contributor Author

liuxiaocs7 commented Sep 27, 2022

This link shows that the code simply converts the Nebula data value to Long. When the Nebula data value is NULL, that is indeed a problem, but it should not be triggered here, because every field in the inserted edge is non-null.

for (int pos = 0; pos < rowType.getFieldCount(); pos++) {
    ValueWrapper valueWrapper = values.get(pos);
    if (valueWrapper != null) {
        try {
            genericRowData.setField(pos,
                    toInternalConverters[pos].deserialize(valueWrapper));
        } catch (SQLException e) {
            e.printStackTrace();
        }
    } else {
        genericRowData.setField(pos, null);
    }
}

We should use valueWrapper.isNull() to check whether the value is null, rather than valueWrapper != null.
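
A minimal sketch of the difference between the two checks, assuming the wrapper object itself is always present and only the value inside it is NULL. `FakeValueWrapper` and `convertRow` are hypothetical stand-ins written for illustration, not the connector's or the client's actual classes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NullCheckSketch {

    /**
     * Hypothetical stand-in for the client's ValueWrapper: the wrapper is
     * never null itself, so a NULL value is only visible through isNull().
     */
    static final class FakeValueWrapper {
        private final Object value; // null models a Nebula NULL value

        FakeValueWrapper(Object value) {
            this.value = value;
        }

        boolean isNull() {
            return value == null;
        }

        long asLong() {
            if (value == null) {
                throw new IllegalStateException("value is NULL, cannot convert to long");
            }
            return (Long) value;
        }
    }

    /** Converts each wrapper to a Long, mapping NULL values to a Java null. */
    static List<Long> convertRow(List<FakeValueWrapper> values) {
        List<Long> row = new ArrayList<>();
        for (FakeValueWrapper wrapper : values) {
            // Checking only "wrapper != null" would let a NULL value reach asLong().
            if (wrapper == null || wrapper.isNull()) {
                row.add(null);
            } else {
                row.add(wrapper.asLong());
            }
        }
        return row;
    }

    public static void main(String[] args) {
        List<FakeValueWrapper> values = Arrays.asList(
                new FakeValueWrapper(22L),
                new FakeValueWrapper(null));
        System.out.println(convertRow(values)); // prints [22, null]
    }
}
```

With only the `!= null` check, the second field in this sketch would reach `asLong()` and throw; with `isNull()` it is stored as `null` in the row.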

After fixing the error, the job output shows:

2> +I[61, 62, 1, aba, abcdefgh, null, 1111, 22222, 6412233, 2019-01-01, 2019-01-01T04:12:12, 435463424, false, 1.2, 1.0, 03:12:12, POINT(1.0 3.0)]

But the insert statement is as follows:

INSERT EDGE `friend`(`col1`,`col2`,`col3`,`col4`,`col5`,`col6`,`col7`,`col8`,`col9`,`col10`,`col11`,`col12`,`col13`,`col14`) VALUES 61->62@0: ("aba","abcdefgh",22,1111,22222,6412233,date("2019-01-01"),datetime("2019-01-01T12:12:12"),435463424,false,1.2,1.0,time("11:12:12"),ST_GeogFromText("POINT(1 3)")),62->63@0: ("aba","abcdefgh",1,1111,22222,6412233,date("2019-01-01"),datetime("2019-01-01T12:12:12"),435463424,false,1.2,1.0,time("11:12:12"),ST_GeogFromText("POINT(1 3)")),63->64@0: ("aba","abcdefgh",1,1111,22222,6412233,date("2019-01-01"),datetime("2019-01-01T12:12:12"),435463424,false,1.2,1.0,time("11:12:12"),ST_GeogFromText("POINT(1 3)")),64->65@0: ("aba","abcdefgh",1,1111,22222,6412233,date("2019-01-01"),datetime("2019-01-01T12:12:12"),435463424,false,1.2,1.0,time("11:12:12"),ST_GeogFromText("LINESTRING(1 3,2 4)")),65->66@0: ("aba","abcdefgh",1,1111,22222,6412233,date("2019-01-01"),datetime("2019-01-01T12:12:12"),435463424,false,1.2,1.0,time("11:12:12"),ST_GeogFromText("LINESTRING(1 3,2 4)")),66->67@0: ("aba","abcdefgh",1,1111,22222,6412233,date("2019-01-01"),datetime("2019-01-01T12:12:12"),435463424,false,1.2,1.0,time("11:12:12"),ST_GeogFromText("LINESTRING(1 3,2 4)")),67->68@0: ("李四","abcdefgh",1,1111,22222,6412233,date("2019-01-01"),datetime("2019-01-01T12:12:12"),435463424,true,1.2,1.0,time("11:12:12"),ST_GeogFromText("polygon((0 1,1 2,2 3,0 1))")),68->61@0: ("aba","张三",1,1111,22222,6412233,date("2019-01-01"),datetime("2019-01-01T12:12:12"),435463424,true,1.2,1.0,time("11:12:12"),ST_GeogFromText("POLYGON((0 1,1 2,2 3,0 1))"))

The corresponding edge:

61->62@0: ("aba","abcdefgh",22,1111,22222,6412233,date("2019-01-01"),datetime("2019-01-01T12:12:12"),435463424,false,1.2,1.0,time("11:12:12"),ST_GeogFromText("POINT(1 3)"))

So why is only the int8 field null?

Although the output now looks correct after switching to a different graph space name (flinkSinkInput), it is still confusing.

The most recent job result, with no extra null:
image

@liuxiaocs7
Contributor Author

liuxiaocs7 commented Sep 27, 2022

In AbstractNebulaOutPutFormatITTest, the logic now skips the rank column.

image

for (int i = 2; i < columns.size(); i++) {
    if (config.get(RANK_ID_INDEX) != i) {
        positions.add(i);
        fields.add(columns.get(i).getName());
    }
}
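
To illustrate what this loop does, here is a small standalone sketch. The column names and the rank index of 4 are assumptions chosen for illustration; in the test, the index comes from `config.get(RANK_ID_INDEX)`:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RankSkipSketch {

    /**
     * Returns the names of the columns treated as edge properties.
     * Columns 0 and 1 are assumed to be the source and target ids,
     * and the column at rankIndex is left out of the property list.
     */
    static List<String> propertyNames(List<String> columns, int rankIndex) {
        List<String> fields = new ArrayList<>();
        for (int i = 2; i < columns.size(); i++) {
            if (i != rankIndex) {
                fields.add(columns.get(i));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical column layout: src, dst, then properties.
        List<String> columns = Arrays.asList("src", "dst", "col1", "col2", "col3", "col4");
        // Assumed rank index of 4, so col3 is consumed as the rank.
        System.out.println(propertyNames(columns, 4)); // prints [col1, col2, col4]
    }
}
```

In this sketch the column at the rank index never appears in the property list, so it would not be written as a property value.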

@Nicole00
Contributor

Nicole00 commented Sep 28, 2022

Hi, I have one question: what is the meaning of the 1 after 61 and 62 in

2> +I[61, 62, 1, aba, abcdefgh, null, 1111, 22222, 6412233, 2019-01-01, 2019-01-01T04:12:12, 435463424, false, 1.2, 1.0, 03:12:12, POINT(1.0 3.0)]

It looks like the rank value, but according to the corresponding edge data, the rank is 0 and no property has the value 1.

@liuxiaocs7
Contributor Author

> Hi, I have one question: what is the meaning of the 1 after 61 and 62 in
>
> 2> +I[61, 62, 1, aba, abcdefgh, null, 1111, 22222, 6412233, 2019-01-01, 2019-01-01T04:12:12, 435463424, false, 1.2, 1.0, 03:12:12, POINT(1.0 3.0)]
>
> It looks like the rank value, but according to the corresponding edge data, the rank is 0 and no property has the value 1.

Sorry, I may not have been clear.

With the newest implementation, it shows:

image

The rank id is 0, which is consistent with what is written. The code:

for (List<String> friend : friends) {
    edges.add(new NebulaEdge(
            friend.get(0), friend.get(1), 0L, friend.subList(2, friend.size())));
}
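
A sketch of how such a row is split, assuming each `friend` list is laid out as source id, target id, then property values. `EdgeSketch` is a hypothetical stand-in for `NebulaEdge`, with field names chosen for illustration:

```java
import java.util.Arrays;
import java.util.List;

class EdgeSplitSketch {

    /** Hypothetical stand-in for NebulaEdge: source, target, rank, property values. */
    static final class EdgeSketch {
        final String source;
        final String target;
        final long rank;
        final List<String> props;

        EdgeSketch(String source, String target, long rank, List<String> props) {
            this.source = source;
            this.target = target;
            this.rank = rank;
            this.props = props;
        }
    }

    /** The first two entries are the ids; everything from index 2 on is a property. */
    static EdgeSketch toEdge(List<String> friend) {
        return new EdgeSketch(
                friend.get(0), friend.get(1), 0L, friend.subList(2, friend.size()));
    }

    public static void main(String[] args) {
        EdgeSketch e = toEdge(Arrays.asList("61", "62", "aba", "abcdefgh", "22"));
        // prints 61->62@0: [aba, abcdefgh, 22]
        System.out.println(e.source + "->" + e.target + "@" + e.rank + ": " + e.props);
    }
}
```

Here the rank is always the constant 0 and no property column is consumed as the rank, which matches the `@0` in the insert statement.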

The previous problem occurred because AbstractNebulaInputFormatITTest and AbstractNebulaOutputFormatITTest used the same graph space name, flinkSink. The logic in AbstractNebulaOutputFormatITTest does not write col3 as a property and treats it as the rank id instead.

image

But why is it possible to insert an edge whose vertex does not exist? In this test file:

insert into person values ('89', 'aba', 'abcdefgh', '1', '1111',"
                                + " '22222', '6412233', '2019-01-01', '2019-01-01T12:12:12',"
                                + " '435463424', 'false', '1.2', '1.0', '11:12:12', 'POINT(1 3)')")

insert into friend values ('61', '62', 'aba', 'abcdefgh',"
                                + " '1', '1111', '22222', '6412233', '2019-01-01',"
                                + " '2019-01-01T12:12:12',"
                                + " '435463424', 'false', '1.2', '1.0', '11:12:12', 'POINT(1 3)')")

insert into friend values ('61', '89', 'aba', 'abcdefgh',"
        + " '1', '1111', '22222', '6412233', '2019-01-01',"
        + " '2019-01-01T12:12:12', '435463424', 'false', '1.2', '1.0',"
        + " '11:12:12', 'POINT(1 3)')")

@Nicole00
Contributor

I see, thanks for your explanation.

> why is it possible to insert an edge where the vertex does not exist?

It's because NebulaGraph allows dangling edges: even when the vertex does not exist, the edge can still be inserted successfully.

Contributor

@Nicole00 Nicole00 left a comment

LGTM, thanks for your work~

@Nicole00 Nicole00 merged commit 5243fb7 into vesoft-inc:master Sep 28, 2022