[VL] Remove corr in group-by.sql and separate supported SQLQueryTest list for backends #3774
Thanks for opening a pull request! Could you open an issue for this pull request on GitHub Issues? https://github.com/oap-project/gluten/issues Then could you also rename the commit message and pull request title in the following format?
Run Gluten Clickhouse CI
@@ -38,4 +38,6 @@ trait Backend {
   def broadcastApi(): BroadcastApi

   def settings(): BackendSettingsApi
+
+  def testApi(): TestApi
Hi @marin-ma, as this interface is just for test use, can we move it to some place in the test module? Maybe into BackendTestSettings? Thanks!
  ignoreList.exists(
-   t => testCase.name.toLowerCase(Locale.ROOT).contains(t.toLowerCase(Locale.ROOT)))
+   t => testCase.name.toLowerCase(Locale.ROOT).equals(t.toLowerCase(Locale.ROOT)))
With this change, I suppose we will be missing some SQL files under subdirectories. Can we handle group-by.sql as a special case only?
Using contains can lead to some unexpected files being added. For example, group-by.sql was in the supported list but udf/udf-group-by.sql wasn't. But the latter one was also tested. If we are sure that one file is supported, it should be explicitly added to the supported list.
Yes, that's just what I mean. With this change, we may be missing some tests for the files under subdirectories. There are some other SQL files under subdirectories that share the same name as those in the root directory.
Can we keep using contains and exclude udf/udf-group-by.sql? If there are more test cases in udf/udf-group-by.sql, maybe we can merge them into the group-by.sql added by ourselves. This can make sure other tests are not missing.
Is it intended that we use contains to make udf/udf-group-by.sql a test? I think it would be better to explicitly add it to the supported list. Moreover, the current supported list inconsistently mixes some files specified with subdirectories and others with only the filename. It would be confusing for others who are trying to add new files to the list.
The use of contains derives from Spark's SQLQueryTestSuite. It makes sense to me if we need to change it in the Gluten tests.
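To make the difference discussed in this thread concrete, here is a minimal sketch of the two matching strategies applied to the two file names mentioned above. The object name `MatchDemo`, the one-entry `supported` set, and the helper names are hypothetical and only for illustration; this is not the actual Gluten test code.

```scala
import java.util.Locale

object MatchDemo {
  // Hypothetical supported list: only the root-level file is listed.
  val supported: Set[String] = Set("group-by.sql")

  // Test case names as relative paths, taken from the discussion.
  val testNames: Seq[String] = Seq("group-by.sql", "udf/udf-group-by.sql")

  private def lower(s: String): String = s.toLowerCase(Locale.ROOT)

  // Substring match: a test is selected if any listed entry occurs inside its name.
  def selectedByContains(names: Seq[String]): Seq[String] =
    names.filter(n => supported.exists(t => lower(n).contains(lower(t))))

  // Exact match: a test is selected only if its name equals a listed entry.
  def selectedByEquals(names: Seq[String]): Seq[String] =
    names.filter(n => supported.exists(t => lower(n).equals(lower(t))))
}
```

With substring matching both names are selected, because "udf/udf-group-by.sql" contains "group-by.sql". With exact matching only the listed file is selected, so any file in a subdirectory has to be listed with its full relative path.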
}
}

private val SUPPORTED_SQL_QUERY_LIST_SPARK32: Set[String] =
@rui-mo Do you mean some of the filenames listed here might not specify the subdirectory? If so, we should add the missing subdirectories.
@rui-mo I would suggest that I perform a final check to ensure that all files belonging to any subdirectories are specified correctly, and to make sure no files are missing. And we use
@marin-ma I think it makes sense as long as we don't miss any test.
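One possible shape for such a final check, as a sketch only: compare the supported list against the set of relative paths discovered on disk, and report both the list entries that match no file exactly and the files that were previously picked up only through substring matching. The object name `SupportedListCheck` and both helper names are assumptions, not part of the PR.

```scala
object SupportedListCheck {
  // Entries in the supported list that equal no discovered relative path;
  // these most likely lack their subdirectory prefix.
  def unmatchedEntries(supported: Set[String], discovered: Set[String]): Set[String] =
    supported.diff(discovered)

  // Discovered files that a substring match would select but an exact match would not;
  // these would silently stop running after the switch unless they are listed explicitly.
  def droppedByEquals(supported: Set[String], discovered: Set[String]): Set[String] =
    discovered.filter(f => supported.exists(t => f.contains(t)) && !supported.contains(f))
}
```

For instance, with an illustrative list `Set("group-by.sql", "udf-count.sql")` and discovered files `Set("group-by.sql", "udf/udf-group-by.sql", "udf/udf-count.sql")`, the first helper flags "udf-count.sql" as missing its subdirectory, and the second flags both files under udf/ as tests that would be lost.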
val overwriteTestCaseNames = overwriteTestCases.map(_.name)
listFilesRecursively(new File(inputFilePath))
  .flatMap(createTestCase(_, inputFilePath, goldenFilePath))
  .filterNot(testCase => overwriteTestCaseNames.contains(testCase.name)) ++ overwriteTestCases
What is the reasoning for using contains instead of equals here?
overwriteTestCaseNames is a Seq[String]; here it checks whether testCase.name is in the list.
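As a small illustration of this point: contains on a Seq[String] tests element membership (equality against each element), whereas contains on a String tests for a substring. The object name `MembershipDemo` and the sample list are hypothetical.

```scala
object MembershipDemo {
  // Hypothetical list of overwritten test case names.
  val overwriteTestCaseNames: Seq[String] = Seq("group-by.sql")

  // Seq.contains: true only when some element equals the given name.
  def isOverwritten(name: String): Boolean =
    overwriteTestCaseNames.contains(name)

  // String.contains, shown for contrast: true when a listed name is a substring.
  def substringHit(name: String): Boolean =
    overwriteTestCaseNames.exists(t => name.contains(t))
}
```

Under these assumptions, `isOverwritten("udf/udf-group-by.sql")` is false while `substringHit("udf/udf-group-by.sql")` is true, so the filterNot in the hunk above already behaves like an exact match.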
Thanks. I believe commit oap-project/velox@64b00b0 could be removed after this PR. cc @JkSelf
Looks good!
Yes. The test PR to remove that commit also passed CI. #3783
===== Performance report for TPCH SF2000 with Velox backend, for reference only ====
Velox corr has better computation logic, but it fails Spark's precision check. Based on the discussion in facebookincubator/velox#7204, we decided to use the original Velox corr and mute the precision check failure in Gluten UT.
This PR contains the following changes:
TestApi.
inputs/group-by.sql, input/udf/udf-group-by.sql and results/group-by.sql.out, results/udf/udf-group-by.sql.out, and removed the queries that contain "corr".
equals instead of contains.