Add In Clause handling in json indexed col (Attr) #6147

Merged
bowenxia merged 3 commits into master from xbowen_IN_Clause_in_Attr
Jun 26, 2024

@bowenxia bowenxia commented Jun 25, 2024

What changed?
Added IN clause handling for the JSON-indexed column (Attr),
plus a unit test with 100% coverage of the new function.

Why?
One customer reported errors when using an IN clause after we migrated their domain to Pinot; ES had supported this previously.
Pinot does support IN for all other system keys, but for a JSON-indexed column like Attr (which we store in JSON format) the IN clause has to be handled differently.

Sample query:
If the input is:
BinaryChecksums IN ("uDeploy:e6b658fc4ae98445a356d2218316081f7113fcdf")
it is rewritten to:
and JSON_MATCH(Attr, '"$.BinaryChecksums" IN (''uDeploy:e6b658fc4ae98445a356d2218316081f7113fcdf'')') or JSON_MATCH(Attr, '"$.BinaryChecksums[*]" IN (''uDeploy:e6b658fc4ae98445a356d2218316081f7113fcdf'')')
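
The rewrite in the sample can be sketched in Go roughly as follows. This is a minimal illustration and not the PR's actual function: the name `buildJSONInClause`, its signature, and the `", "` separator between values are assumptions; only the format string and the doubled single quotes are taken from the diff fragment quoted later in this conversation. It also assumes the values contain no quote characters of their own.

```go
package main

import (
	"fmt"
	"strings"
)

// buildJSONInClause is a hypothetical helper (name and signature assumed).
// It renders `key IN (v1, v2, ...)` as two JSON_MATCH predicates on Attr:
// one for a scalar attribute and one for an array-valued attribute.
func buildJSONInClause(key string, vals []string) string {
	quoted := make([]string, len(vals))
	for i, v := range vals {
		// The values sit inside the outer single-quoted JSON_MATCH filter
		// string, so each inner single quote is doubled.
		quoted[i] = "''" + v + "''"
	}
	// Assumption: multiple values are joined with ", ".
	list := strings.Join(quoted, ", ")
	return fmt.Sprintf(
		"JSON_MATCH(Attr, '\"$.%s\" IN (%s)') or JSON_MATCH(Attr, '\"$.%s[*]\" IN (%s)')",
		key, list, key, list)
}

func main() {
	fmt.Println(buildJSONInClause("BinaryChecksums",
		[]string{"uDeploy:e6b658fc4ae98445a356d2218316081f7113fcdf"}))
}
```

The first predicate matches when the attribute is a single value, the second when it is an array (`[*]`), which is why the two are combined with `or`.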

How did you test it?
unit test

Potential risks

Release notes

Documentation Changes

@bowenxia bowenxia changed the title from "add In Clause handling in json indexed col (Attr)" to "Add In Clause handling in json indexed col (Attr)" Jun 25, 2024
@@ -71,7 +71,7 @@ func (qv *VisibilityQueryValidator) ValidateQuery(whereClause string) (string, e

 	stmt, err := sqlparser.Parse(placeholderQuery)
 	if err != nil {
-		return "", &types.BadRequestError{Message: "Invalid query."}
+		return "", &types.BadRequestError{Message: "Invalid query." + err.Error()}
Contributor Author

Print error message here for a better debugging experience.

Member

I'm not sure this is a good idea, since the error will be propagated to customers. Do we want to expose which DB or which tables we use? We could log it for debugging purposes while exposing only a generic message to the client.
Alternatively, we could extract details about which arguments in the query are invalid, to help us understand what is wrong. *json.UnmarshalTypeError is an example of an error that points to the exact failure without exposing the whole thing.

Contributor Author

It's actually a SQL expression parsing error.
It will look something like:
"Invalid query.syntax error at position 53 near 'select'"
"Invalid query.syntax error at position 38 near 'sql'"
It doesn't expose DB details or tables.

Member

Let's add a space after the dot or change it to a colon. "query.syntax" is a bit confusing.

codecov bot commented Jun 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.72%. Comparing base (0b46176) to head (71af3ac).

Current head 71af3ac differs from pull request most recent head 46d3ac5

Please upload reports for the commit 46d3ac5 to get more accurate results.

Additional details and impacted files
| Files | Coverage Δ |
|---|---|
| common/pinot/pinotQueryValidator.go | 86.02% <100.00%> (+1.13%) ⬆️ |

... and 3 files with indirect coverage changes


Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0b46176...46d3ac5. Read the comment docs.


coveralls commented Jun 26, 2024

Pull Request Test Coverage Report for Build 019051d7-b549-4c26-b296-87c3bc87a437

Details

  • 28 of 28 (100.0%) changed or added relevant lines in 1 file are covered.
  • 34 unchanged lines in 11 files lost coverage.
  • Overall coverage increased (+0.01%) to 71.553%

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| service/matching/tasklist/matcher.go | 1 | 90.55% |
| common/task/weighted_round_robin_task_scheduler.go | 2 | 89.05% |
| common/peerprovider/ringpopprovider/config.go | 2 | 81.58% |
| common/util.go | 2 | 91.84% |
| service/history/task/task.go | 3 | 84.81% |
| common/persistence/nosql/nosql_task_store.go | 3 | 85.52% |
| service/history/handler/handler.go | 3 | 96.2% |
| service/history/task/timer_standby_task_executor.go | 3 | 85.63% |
| service/history/task/fetcher.go | 3 | 85.57% |
| service/history/task/transfer_standby_task_executor.go | 6 | 86.33% |

Totals Coverage Status
Change from base Build 0190508e-3f7e-4c0d-b546-4eaa2a2045f1: 0.01%
Covered Lines: 107119
Relevant Lines: 149706

💛 - Coveralls

values[i] = "''" + string(sqlVal.Val) + "''"
}

return fmt.Sprintf("JSON_MATCH(Attr, '\"$.%s\" IN (%s)') or JSON_MATCH(Attr, '\"$.%s[*]\" IN (%s)')",
Member

This query is amazing, I got help from gpt to understand :)

Member

@neil-xie neil-xie left a comment

Approved with nit comments

@bowenxia bowenxia enabled auto-merge (squash) June 26, 2024 21:38
@bowenxia bowenxia merged commit c7f6233 into master Jun 26, 2024
20 checks passed
@bowenxia bowenxia deleted the xbowen_IN_Clause_in_Attr branch June 26, 2024 22:10

coveralls commented Jun 26, 2024

Pull Request Test Coverage Report for Build 0190567b-a424-4d32-a57e-d52b447b9cb0

Details

  • 28 of 28 (100.0%) changed or added relevant lines in 1 file are covered.
  • 34 unchanged lines in 9 files lost coverage.
  • Overall coverage increased (+0.004%) to 71.547%

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| common/task/weighted_round_robin_task_scheduler.go | 2 | 89.05% |
| service/matching/tasklist/db.go | 2 | 73.23% |
| common/util.go | 2 | 91.84% |
| common/log/tag/tags.go | 3 | 50.46% |
| common/persistence/nosql/nosql_task_store.go | 3 | 85.52% |
| service/history/task/timer_standby_task_executor.go | 3 | 85.63% |
| common/task/fifo_task_scheduler.go | 4 | 83.51% |
| service/history/task/transfer_standby_task_executor.go | 6 | 86.94% |
| service/matching/tasklist/task_reader.go | 9 | 75.33% |

Totals Coverage Status
Change from base Build 0190508e-3f7e-4c0d-b546-4eaa2a2045f1: 0.004%
Covered Lines: 107110
Relevant Lines: 149706

💛 - Coveralls

4 participants