Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql/sem/tree: run TestEval with vectorize=experimental_on #40790

Merged
merged 1 commit into from
Sep 25, 2019
Merged

sql/sem/tree: run TestEval with vectorize=experimental_on #40790

merged 1 commit into from
Sep 25, 2019

Conversation

asubiotto
Copy link
Contributor

Release note: None

Release justification: Category 1 non-production code change to increase
test coverage of the vectorized execution engine.

Close #40635

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@asubiotto asubiotto requested review from jordanlewis and a team September 16, 2019 15:08
Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @rafiss)

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, but unfortunately I'm fairly convinced (maybe 70% confident?) that this doesn't add much or any coverage to the vectorized engine. The reason is that all of the tests use constants only, but none of the vectorized engine's operators can operate on only constants - we just didn't make operators for those.

For example, if you had an eval test that said:

eval
2+2
----
4

This would attempt to make a query like select 2+2, I think. Since the tests explicitly disable constant folding, this will make a renderNode with the scalar expression 2+2, which will then... hm, I guess I don't know what it would do. Maybe it would actually work in the end - would we plan two "Datum" operators and add them all?

@asubiotto could you please check to see if any vectorized plans actually get created from the eval tests?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis)

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #40674 I noticed that adding two constant integers ended up being done in a projPlusInt64ConstInt64Op (there was one batch where every row just had a constant value), but I stopped working on that for now so I didn't figure out why.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh cool! Well in that case I guess I have no objections here.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@asubiotto
Copy link
Contributor Author

OK, I think I was a bit naive in assuming this would just work. I did a little more investigation to ensure this runs through the vectorized engine. The current integration test works by issuing the expression through QueryRow and comparing that result to the expected result. This does do constant folding (I got confused with a comment about constant folding that just serves to resolve the type of the QueryRow result). Therefore, enabling vectorized execution doesn't add any test coverage.

I'll think of something and update the PR accordingly.

@asubiotto
Copy link
Contributor Author

Updated to explicitly create a NewColOperator with the unmodified expression as the post process spec. The test currently fails with an interface conversion error which I have yet to investigate.

@asubiotto
Copy link
Contributor Author

Investigated the failing test but unsure how to fix this properly. The failing test is:

        --- FAIL: TestEval/vectorized/bit_array (0.00s)
            eval_test.go:53:
                testdata/eval/bit_array:176: B'11111111111111111111111110000101'::int4
                expected:
                -123

                found:
                unexpected error from the vectorized runtime: interface conversion: coldata.column is []int64, not []int32

This is caused because we type check the expression (which returns int32) to create the materializer, but the expression result comes back as an int64. What seems to happen here is that type information is getting lost when calling helper.Init in NewColOperator. The expression helper correctly type checks the expression type to an int32 here:

typedExpr, err := tree.TypeCheck(expr, semaCtx, types.Any)

But then returns the constant result (-123) cast to a tree.TypedExpr:
return expr.(tree.TypedExpr), nil

The vectorized flow creator then calls ResolvedType() on this expression, which returns an int64:
datumType := t.ResolvedType()

The root cause seems to be that we don't retain proper type information in processExpression because of the call to WalkExpr with a TypedExpr then cast to a TypedExpr:
// Pre-evaluate constant expressions. This is necessary to avoid repeatedly
// re-evaluating constant values every time the expression is applied.
//
// TODO(solon): It would be preferable to enhance our expression serialization
// format so this wouldn't be necessary.
c := tree.MakeConstantEvalVisitor(evalCtx)
expr, _ = tree.WalkExpr(&c, typedExpr)
if err := c.Err(); err != nil {
return nil, err
}
return expr.(tree.TypedExpr), nil

@solongordon, could you advise on the best way to fix this?

@asubiotto
Copy link
Contributor Author

Skipped that test as it is a limitation of our type system that the width gets lost when evaluating an expression. RFAL.

@asubiotto
Copy link
Contributor Author

Friendly ping

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! This is pretty cool.

Release note: None

Release justification: Category 1 non-production code change to increase
test coverage of the vectorized execution engine.
@asubiotto
Copy link
Contributor Author

TFTR

bors r=jordanlewis,rafiss

craig bot pushed a commit that referenced this pull request Sep 25, 2019
40790: sql/sem/tree: run TestEval with vectorize=experimental_on r=jordanlewis,rafiss a=asubiotto

Release note: None

Release justification: Category 1 non-production code change to increase
test coverage of the vectorized execution engine.

Close #40635

Co-authored-by: Alfonso Subiotto Marqués <alfonso@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Sep 25, 2019

Build succeeded

@craig craig bot merged commit 798a757 into cockroachdb:master Sep 25, 2019
@asubiotto asubiotto deleted the veceval branch October 29, 2019 20:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

exec: run all logic/eval tests that support vectorized with the vectorized engine
4 participants