Range query for custom attribute #5426

Merged
merged 97 commits into from
Nov 3, 2023
97 commits
f771131
Add pinot dual visibility manager and new advance visibility option
neil-xie Mar 22, 2023
9b86598
update visibility store and implemented unit test
bowenxia Mar 23, 2023
360ee20
update: go mod tidy
bowenxia Mar 23, 2023
3f074de
update go.sum
bowenxia Mar 23, 2023
218e829
Fix the start options for pinot
neil-xie Mar 23, 2023
fa587cb
update unit test
bowenxia Mar 23, 2023
bb4fa67
run make cmds to update files
bowenxia Mar 23, 2023
c056a9f
update one unit test case that had errors
bowenxia Mar 23, 2023
10b4598
fix unit test
bowenxia Mar 23, 2023
7421c35
Fix pinot config file and add ES config to make it run with workers a…
neil-xie Mar 24, 2023
931f364
add ES into docker compose file
bowenxia Mar 24, 2023
624a769
Fix kafka producer message
neil-xie Mar 28, 2023
9130637
revert kafka change, update producer message struct to match with schema
neil-xie Mar 28, 2023
909ac6f
cleanup
neil-xie Mar 28, 2023
d623f8f
change tableName and places used it accordingly, to allow pinot to re…
bowenxia Mar 28, 2023
8ac32e8
add a pinotClientInterface, refactor unit test
bowenxia Mar 30, 2023
2370332
NewPinotConnectionClient should return a genetic value
bowenxia Mar 30, 2023
94cb588
fix a failed unit test & add mock genericClient component
bowenxia Mar 30, 2023
ed103a5
rename pinotConnectionClient tobe pinotClient
bowenxia Mar 30, 2023
ab15272
change PinotClient to be public
bowenxia Mar 31, 2023
d749c1a
Update visibility manager to use the new pinot generic client
neil-xie Mar 31, 2023
6191af5
Fix the log format
neil-xie Mar 31, 2023
9f7b620
update kafka config to separate kafka topics for pinot and ES, add pi…
neil-xie Apr 4, 2023
8aade6a
Fix typo in pinot table config
neil-xie Apr 5, 2023
52e993d
change json tags to start with upper case to match pinot schema
bowenxia Apr 5, 2023
97b5763
Fix run ID json tag and remove unused kafka key from schema
neil-xie Apr 5, 2023
a3c05cc
fix a naming issue that cause pinot can't receive CloseTime
bowenxia Apr 5, 2023
3ff2029
change json tag so that closeStatus won't be ignored when it is 0
bowenxia Apr 5, 2023
ba1879d
add attr into pinot
bowenxia Apr 5, 2023
6c3dd9c
update decoded attr
bowenxia Apr 6, 2023
45ccf60
Update reading from pinot dynamic config
neil-xie Apr 6, 2023
dc93318
add single quote to query
bowenxia Apr 6, 2023
c71355d
correct order Query formations
bowenxia Apr 6, 2023
b73ad5f
Add log for debug purpose
neil-xie Apr 6, 2023
0b1d5a2
fix can't unmarshal request.Attr error
bowenxia Apr 6, 2023
9407591
fix nil pointer in isRecordValid
bowenxia Apr 6, 2023
067e71c
clean up
bowenxia Apr 7, 2023
31cbb5b
Remove unnecessary debug info and unused message fields
neil-xie Apr 7, 2023
64ff7bb
solve can't unmarshal attr issue
bowenxia Apr 7, 2023
66db061
update unit test to pass
bowenxia Apr 11, 2023
8f7dcaf
Get pinot table from config
neil-xie Apr 14, 2023
543810b
Update table name in config
neil-xie Apr 14, 2023
c95251c
use table name from config
bowenxia Apr 14, 2023
d1fbe11
clean up
bowenxia Apr 15, 2023
2fa02a3
update unit test
bowenxia Apr 15, 2023
376251d
change couple types in pinot message
bowenxia Apr 20, 2023
9fda9f3
update pinot visibility triple manager to write to pinot and ES (#5229)
neil-xie Apr 25, 2023
c0c327c
Cdnc 4574 (#5230)
bowenxia Apr 26, 2023
64748d6
Add pagination and flatten customized search attributes (#5234)
bowenxia May 8, 2023
5751cd1
Adds Dynamic-config type (#5261)
davidporter-id-au May 9, 2023
3978502
clean up: delete one unused function, and one line refactor
bowenxia May 16, 2023
7ade724
update config file for deleting Attr
bowenxia May 17, 2023
5a2c271
fix a nil pointer after removing Attr
bowenxia May 17, 2023
378597b
update a test case to cover multiple order by clause case
bowenxia May 17, 2023
6bc3f61
Fix fmt
neil-xie May 19, 2023
f5b7734
Add pinot integration test (#5316)
neil-xie Jun 7, 2023
bac8352
Cdnc 4589 (#5318)
bowenxia Jun 8, 2023
97bc5a4
Update Pinot query to order by closetime when query closed wf, order …
neil-xie Jun 22, 2023
af28635
refactor pinotClient to pass in pinotConfig
bowenxia Jun 22, 2023
0d527a9
Revert "refactor pinotClient to pass in pinotConfig"
bowenxia Jun 22, 2023
16c37ea
refactor pinotClient to pass in pinotConfig
bowenxia Jun 22, 2023
3bb3010
PinotQueryValidator (#5333)
bowenxia Jun 30, 2023
76247ac
Add limit clause to pinot queries (#5337)
bowenxia Jul 5, 2023
79c83e5
Update all queries to order by startTime
neil-xie Jul 11, 2023
b695a18
Adding a PInot/ES response comparator (#5353)
bowenxia Aug 3, 2023
763e378
Fix rebase and lint
neil-xie Aug 3, 2023
fa08342
Fix integration test and minor clean up
neil-xie Aug 9, 2023
2db04d3
more clean up
neil-xie Aug 14, 2023
9afb738
Add more comments and more clean up
neil-xie Aug 15, 2023
1adf907
Update to use constants for visibility store name instead of strings
neil-xie Aug 24, 2023
91adba5
Enable json index (#5390)
bowenxia Sep 28, 2023
3d1ac46
Rebase
neil-xie Sep 28, 2023
8d9da02
Uncomment code that caused error by idl changes
neil-xie Sep 28, 2023
fca6bba
More clean up
neil-xie Sep 28, 2023
ac9c769
turn off comparator
bowenxia Oct 2, 2023
eae9ad4
Add pinot metrics client and update pinot visibility manager to use i…
neil-xie Oct 3, 2023
c49ea32
Update read and write mode to prepare for migration
neil-xie Oct 9, 2023
775a31e
Add SecondsSinceEpoch field and update Pinot schema (#5418)
neil-xie Oct 12, 2023
b4e2c66
Address comments part 1
neil-xie Oct 12, 2023
295a989
Merge branch 'master' into CDNC_4431
shijiesheng Oct 16, 2023
7b68dd3
Address comments and fix Pinot integration test
neil-xie Oct 17, 2023
ed8dbe2
remove temporarily to rename folder
neil-xie Oct 17, 2023
02a0812
Add back with new folder name
neil-xie Oct 17, 2023
593483b
Fix
neil-xie Oct 17, 2023
2cbfa21
Minor fix for stopwatch
neil-xie Oct 18, 2023
2c84916
Merge branch 'master' into CDNC_4431
neil-xie Oct 18, 2023
cce115e
Add more comments
neil-xie Oct 19, 2023
f6917e8
Merge branch 'master' into CDNC_4431
neil-xie Oct 19, 2023
5b98b5b
add range query and unit test
bowenxia Oct 19, 2023
45031c5
Merge remote-tracking branch 'origin/CDNC_4431' into CDNC_5946_RangeQ…
bowenxia Oct 19, 2023
21d1dda
Merge branch 'master' into CDNC_5946_RangeQuery
bowenxia Oct 19, 2023
0d2f41a
support <, >, >=, <= for custom attributes
bowenxia Oct 19, 2023
49fe217
Merge branch 'CDNC_5946_RangeQuery' of github.com:uber/cadence into C…
bowenxia Oct 19, 2023
41a8345
remove dead code
bowenxia Oct 19, 2023
94c308e
add unit tests
bowenxia Oct 19, 2023
2bf6418
add comment for range query function
bowenxia Oct 20, 2023
5e3bc5c
Merge branch 'master' into CDNC_5946_RangeQuery
bowenxia Nov 2, 2023
4 changes: 2 additions & 2 deletions common/persistence/pinot/pinotVisibilityStore_test.go
@@ -185,7 +185,7 @@ LIMIT 0, 10
FROM %s
WHERE DomainID = 'bfd5c907-f899-4baf-a7b2-2ab85e623ebd'
AND IsDeleted = false
AND WorkflowID = 'wid' and ((JSON_MATCH(Attr, '"$.CustomStringField" is not null') AND REGEXP_LIKE(JSON_EXTRACT_SCALAR(Attr, '$.CustomStringField', 'string'), 'custom and custom2 or custom3 order by*')) or CustomIntField between 1 and 10)
Member Author

Previously we did not notice that the range query was broken, because we were testing against the old Pinot table, which had flattened attributes, so the "between ... and ..." range query worked there.

AND WorkflowID = 'wid' and ((JSON_MATCH(Attr, '"$.CustomStringField" is not null') AND REGEXP_LIKE(JSON_EXTRACT_SCALAR(Attr, '$.CustomStringField', 'string'), 'custom and custom2 or custom3 order by*')) or (JSON_MATCH(Attr, '"$.CustomIntField" is not null') AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.CustomIntField') AS INT) >= 1 AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.CustomIntField') AS INT) <= 10))
Order BY StartTime DESC
LIMIT 0, 10
`, testTableName),
@@ -239,7 +239,7 @@ LIMIT 0, 10
FROM %s
WHERE DomainID = 'bfd5c907-f899-4baf-a7b2-2ab85e623ebd'
AND IsDeleted = false
AND CloseStatus < 0 and (JSON_MATCH(Attr, '"$.CustomKeywordField"=''keywordCustomized''') or JSON_MATCH(Attr, '"$.CustomKeywordField[*]"=''keywordCustomized''')) and JSON_MATCH(Attr, '"$.CustomIntField"=''10''') and (JSON_MATCH(Attr, '"$.CustomStringField" is not null') AND REGEXP_LIKE(JSON_EXTRACT_SCALAR(Attr, '$.CustomStringField', 'string'), 'String field is for text*'))
AND CloseStatus < 0 and (JSON_MATCH(Attr, '"$.CustomKeywordField"=''keywordCustomized''') or JSON_MATCH(Attr, '"$.CustomKeywordField[*]"=''keywordCustomized''')) and (JSON_MATCH(Attr, '"$.CustomIntField" is not null') AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.CustomIntField') AS INT) <= 10) and (JSON_MATCH(Attr, '"$.CustomStringField" is not null') AND REGEXP_LIKE(JSON_EXTRACT_SCALAR(Attr, '$.CustomStringField', 'string'), 'String field is for text*'))
Order by DomainID Desc
LIMIT 11, 10
`, testTableName),
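The review note in this file explains why the expected query changed: a plain `between` only works when the custom attribute is a real (flattened) column, while the current table keeps custom attributes inside the `Attr` JSON column. The following is a minimal sketch of that contrast, not code from this PR; `rangePredicate` and its `flattened` flag are hypothetical names introduced only for illustration, and the JSON form copies the expected string from the updated test.

```go
package main

import "fmt"

// rangePredicate is a hypothetical helper (not part of the PR).
// It shows the two shapes a range filter on a custom attribute can take:
//   - flattened == true:  the attribute is its own column, so a plain
//     "between" works (the old table layout described in the review note).
//   - flattened == false: the attribute lives inside the Attr JSON column,
//     so the value is extracted and cast before it is compared.
func rangePredicate(attr, lower, upper string, flattened bool) string {
	if flattened {
		return fmt.Sprintf("%s between %s and %s", attr, lower, upper)
	}
	return fmt.Sprintf("(JSON_MATCH(Attr, '\"$.%s\" is not null') "+
		"AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.%s') AS INT) >= %s "+
		"AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.%s') AS INT) <= %s)",
		attr, attr, lower, attr, upper)
}

func main() {
	// Old layout: the attribute is a column of its own.
	fmt.Println(rangePredicate("CustomIntField", "1", "10", true))
	// Current layout: the attribute is a key inside the Attr JSON column.
	fmt.Println(rangePredicate("CustomIntField", "1", "10", false))
}
```

The INT cast is hard-coded here only to match the test expectation above; it is an assumption of the sketch that the attribute is an integer.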
104 changes: 81 additions & 23 deletions common/pinot/pinotQueryValidator.go
@@ -91,24 +91,58 @@ func (qv *VisibilityQueryValidator) validateWhereExpr(expr sqlparser.Expr) (stri
if expr == nil {
return "", nil
}
buf := sqlparser.NewTrackedBuffer(nil)

switch expr := expr.(type) {
case *sqlparser.AndExpr, *sqlparser.OrExpr:
return qv.validateAndOrExpr(expr)
case *sqlparser.ComparisonExpr:
return qv.validateComparisonExpr(expr)
case *sqlparser.RangeCond:
expr.Format(buf)
return buf.String(), nil
//return qv.validateRangeExpr(expr)
return qv.validateRangeExpr(expr)
case *sqlparser.ParenExpr:
return qv.validateWhereExpr(expr.Expr)
default:
return "", errors.New("invalid where clause")
}
}

// validateRangeExpr handles "between ... and ..." only;
// <, >, >=, <= are handled in validateComparisonExpr().
func (qv *VisibilityQueryValidator) validateRangeExpr(expr sqlparser.Expr) (string, error) {
buf := sqlparser.NewTrackedBuffer(nil)
rangeCond := expr.(*sqlparser.RangeCond)
colName, ok := rangeCond.Left.(*sqlparser.ColName)
if !ok {
return "", errors.New("invalid range expression: fail to get colname")
}
colNameStr := colName.Name.String()

if !qv.isValidSearchAttributes(colNameStr) {
return "", fmt.Errorf("invalid search attribute %q", colNameStr)
}

if definition.IsSystemIndexedKey(colNameStr) {
expr.Format(buf)
return buf.String(), nil
}

//lowerBound, ok := rangeCond.From.(*sqlparser.ColName)
lowerBound, ok := rangeCond.From.(*sqlparser.SQLVal)
if !ok {
return "", errors.New("invalid range expression: fail to get lowerbound")
}
lowerBoundString := string(lowerBound.Val)

upperBound, ok := rangeCond.To.(*sqlparser.SQLVal)
if !ok {
return "", errors.New("invalid range expression: fail to get upperbound")
}
upperBoundString := string(upperBound.Val)

return fmt.Sprintf("(JSON_MATCH(Attr, '\"$.%s\" is not null') "+
"AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.%s') AS INT) >= %s "+
"AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.%s') AS INT) <= %s)", colNameStr, colNameStr, lowerBoundString, colNameStr, upperBoundString), nil
}

func (qv *VisibilityQueryValidator) validateAndOrExpr(expr sqlparser.Expr) (string, error) {
var leftExpr sqlparser.Expr
var rightExpr sqlparser.Expr
@@ -248,24 +282,48 @@ func (qv *VisibilityQueryValidator) processCustomKey(expr sqlparser.Expr) (strin
// get the value type
indexValType := common.ConvertIndexedValueTypeToInternalType(valType, log.NewNoop())

// Case2-1: when it is string, need partial match
if indexValType == types.IndexedValueTypeString {
// change to like statement for partial match
comparisonExpr.Operator = sqlparser.LikeStr
comparisonExpr.Right = &sqlparser.SQLVal{
Type: sqlparser.StrVal,
Val: []byte("%" + colValStr + "%"),
}
//return fmt.Sprintf("JSON_EXTRACT_SCALAR(Attr, '$.%s', 'STRING') LIKE '%%%s%%'", colNameStr, colValStr), nil
return fmt.Sprintf("(JSON_MATCH(Attr, '\"$.%s\" is not null') "+
"AND REGEXP_LIKE(JSON_EXTRACT_SCALAR(Attr, '$.%s', 'string'), '%s*'))", colNameStr, colNameStr, colValStr), nil
operator := comparisonExpr.Operator

switch indexValType {
case types.IndexedValueTypeString:
return processCustomString(comparisonExpr, colNameStr, colValStr), nil
case types.IndexedValueTypeKeyword:
return processCustomKeyword(operator, colNameStr, colValStr), nil
case types.IndexedValueTypeDatetime:
return processCustomNum(operator, colNameStr, colValStr, "BIGINT"), nil
case types.IndexedValueTypeDouble:
return processCustomNum(operator, colNameStr, colValStr, "DOUBLE"), nil
case types.IndexedValueTypeInt:
return processCustomNum(operator, colNameStr, colValStr, "INT"), nil
default:
return processEqual(colNameStr, colValStr), nil
}
// case2-2: otherwise, exact match
// case2-2-1: if it is keyword, need to deal with a situation when value is an array
if indexValType == types.IndexedValueTypeKeyword {
return fmt.Sprintf("(JSON_MATCH(Attr, '\"$.%s\"=''%s''') or JSON_MATCH(Attr, '\"$.%s[*]\"=''%s'''))",
colNameStr, colValStr, colNameStr, colValStr), nil
}

func processCustomNum(operator string, colNameStr string, colValStr string, valType string) string {
if operator == sqlparser.EqualStr {
return processEqual(colNameStr, colValStr)
}
return fmt.Sprintf("(JSON_MATCH(Attr, '\"$.%s\" is not null') "+
"AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.%s') AS %s) %s %s)", colNameStr, colNameStr, valType, operator, colValStr)
}

func processEqual(colNameStr string, colValStr string) string {
return fmt.Sprintf("JSON_MATCH(Attr, '\"$.%s\"=''%s''')", colNameStr, colValStr)
}

func processCustomKeyword(operator string, colNameStr string, colValStr string) string {
return fmt.Sprintf("(JSON_MATCH(Attr, '\"$.%s\"%s''%s''') or JSON_MATCH(Attr, '\"$.%s[*]\"%s''%s'''))",
colNameStr, operator, colValStr, colNameStr, operator, colValStr)
}

func processCustomString(comparisonExpr *sqlparser.ComparisonExpr, colNameStr string, colValStr string) string {
// change to like statement for partial match
comparisonExpr.Operator = sqlparser.LikeStr
comparisonExpr.Right = &sqlparser.SQLVal{
Type: sqlparser.StrVal,
Val: []byte("%" + colValStr + "%"),
}
// case2-2-2: other cases:
return fmt.Sprintf("JSON_MATCH(Attr, '\"$.%s\"=''%s''')", colNameStr, colValStr), nil
return fmt.Sprintf("(JSON_MATCH(Attr, '\"$.%s\" is not null') "+
"AND REGEXP_LIKE(JSON_EXTRACT_SCALAR(Attr, '$.%s', 'string'), '%s*'))", colNameStr, colNameStr, colValStr)
}
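The dispatch added to `processCustomKey` picks a Pinot cast type per attribute value type and keeps the JSON_MATCH equality form for `=`. A condensed, standalone sketch of that behavior follows. It is an approximation for reading purposes only: the string keys in `castTypeFor` are hypothetical stand-ins for the `types.IndexedValueType*` constants, `numericPredicate` and `equalPredicate` are illustrative names, and the sketch assumes the equality operator arrives as the string `"="`.

```go
package main

import "fmt"

// castTypeFor maps a value-type name to the Pinot cast type used in the diff.
// The keys are hypothetical stand-ins for the types.IndexedValueType* constants.
var castTypeFor = map[string]string{
	"Datetime": "BIGINT",
	"Double":   "DOUBLE",
	"Int":      "INT",
}

// equalPredicate follows the shape of processEqual in the diff.
func equalPredicate(attr, val string) string {
	return fmt.Sprintf("JSON_MATCH(Attr, '\"$.%s\"=''%s''')", attr, val)
}

// numericPredicate follows the shape of processCustomNum: equality keeps the
// JSON_MATCH form, every other operator extracts and casts the value first.
func numericPredicate(valueType, operator, attr, val string) (string, error) {
	castType, ok := castTypeFor[valueType]
	if !ok {
		return "", fmt.Errorf("no numeric cast known for value type %q", valueType)
	}
	if operator == "=" { // assumption: "=" is the equality operator string
		return equalPredicate(attr, val), nil
	}
	return fmt.Sprintf("(JSON_MATCH(Attr, '\"$.%s\" is not null') "+
		"AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.%s') AS %s) %s %s)",
		attr, attr, castType, operator, val), nil
}

func main() {
	cases := []struct{ valueType, op, attr, val string }{
		{"Datetime", "<", "CustomDatetimeField", "1697754674"},
		{"Double", ">=", "CustomDoubleField", "0"},
		{"Int", "=", "CustomIntField", "1"},
	}
	for _, c := range cases {
		out, err := numericPredicate(c.valueType, c.op, c.attr, c.val)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(out)
	}
}
```

Unlike the diff, the sketch returns an error for an unknown value type instead of falling back to equality; that choice is only there to keep the example small.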
41 changes: 39 additions & 2 deletions common/pinot/pinotQueryValidator_test.go
@@ -59,7 +59,7 @@ func TestValidateQuery(t *testing.T) {
},
"Case6-1: complex query I: with parenthesis": {
query: "(CustomStringField = 'custom and custom2 or custom3 order by') or CustomIntField between 1 and 10",
validated: "((JSON_MATCH(Attr, '\"$.CustomStringField\" is not null') AND REGEXP_LIKE(JSON_EXTRACT_SCALAR(Attr, '$.CustomStringField', 'string'), 'custom and custom2 or custom3 order by*')) or CustomIntField between 1 and 10)",
validated: "((JSON_MATCH(Attr, '\"$.CustomStringField\" is not null') AND REGEXP_LIKE(JSON_EXTRACT_SCALAR(Attr, '$.CustomStringField', 'string'), 'custom and custom2 or custom3 order by*')) or (JSON_MATCH(Attr, '\"$.CustomIntField\" is not null') AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.CustomIntField') AS INT) >= 1 AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.CustomIntField') AS INT) <= 10))",
},
"Case6-2: complex query II: with only system keys": {
query: "DomainID = 'd-id' and (RunID = 'run-id' or WorkflowID = 'wid')",
@@ -71,7 +71,7 @@ func TestValidateQuery(t *testing.T) {
},
"Case6-4: complex query IV": {
query: "WorkflowID = 'wid' and (CustomStringField = 'custom and custom2 or custom3 order by' or CustomIntField between 1 and 10)",
validated: "WorkflowID = 'wid' and ((JSON_MATCH(Attr, '\"$.CustomStringField\" is not null') AND REGEXP_LIKE(JSON_EXTRACT_SCALAR(Attr, '$.CustomStringField', 'string'), 'custom and custom2 or custom3 order by*')) or CustomIntField between 1 and 10)",
validated: "WorkflowID = 'wid' and ((JSON_MATCH(Attr, '\"$.CustomStringField\" is not null') AND REGEXP_LIKE(JSON_EXTRACT_SCALAR(Attr, '$.CustomStringField', 'string'), 'custom and custom2 or custom3 order by*')) or (JSON_MATCH(Attr, '\"$.CustomIntField\" is not null') AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.CustomIntField') AS INT) >= 1 AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.CustomIntField') AS INT) <= 10))",
},
"Case7: invalid sql query": {
query: "Invalid SQL",
@@ -113,6 +113,43 @@ func TestValidateQuery(t *testing.T) {
query: "CustomIntField = 1 or CustomIntField = 2",
validated: "(JSON_MATCH(Attr, '\"$.CustomIntField\"=''1''') or JSON_MATCH(Attr, '\"$.CustomIntField\"=''2'''))",
},
"Case14-1: range query: custom filed": {
query: "CustomIntField BETWEEN 1 AND 2",
validated: "(JSON_MATCH(Attr, '\"$.CustomIntField\" is not null') AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.CustomIntField') AS INT) >= 1 AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.CustomIntField') AS INT) <= 2)",
},
"Case14-2: range query: system filed": {
query: "NumClusters BETWEEN 1 AND 2",
validated: "NumClusters between 1 and 2",
},
"Case15-1: custom date attribute less than": {
query: "CustomDatetimeField < 1697754674",
validated: "(JSON_MATCH(Attr, '\"$.CustomDatetimeField\" is not null') AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.CustomDatetimeField') AS BIGINT) < 1697754674)",
},
"Case15-2: custom date attribute greater than or equal to": {
query: "CustomDatetimeField >= 1697754674",
validated: "(JSON_MATCH(Attr, '\"$.CustomDatetimeField\" is not null') AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.CustomDatetimeField') AS BIGINT) >= 1697754674)",
},
"Case15-3: system date attribute greater than or equal to": {
query: "StartTime >= 1697754674",
Member

When a user inputs a date string such as StartTime > "2023-11-02T15:36:47", is this handled in the query validator?

Member Author

This is not handled here, but it will be converted to UnixMilli() in PinotVisibilityStore.

validated: "StartTime >= 1697754674",
},
"Case16-1: custom int attribute greater than or equal to": {
query: "CustomIntField >= 0",
validated: "(JSON_MATCH(Attr, '\"$.CustomIntField\" is not null') AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.CustomIntField') AS INT) >= 0)",
},
"Case16-2: custom double attribute greater than or equal to": {
query: "CustomDoubleField >= 0",
validated: "(JSON_MATCH(Attr, '\"$.CustomDoubleField\" is not null') AND CAST(JSON_EXTRACT_SCALAR(Attr, '$.CustomDoubleField') AS DOUBLE) >= 0)",
},
"Case17: custom keyword attribute greater than or equal to. Will return error run time": {
query: "CustomKeywordField < 0",
validated: "(JSON_MATCH(Attr, '\"$.CustomKeywordField\"<''0''') or JSON_MATCH(Attr, '\"$.CustomKeywordField[*]\"<''0'''))",
},
// TODO
"Case18: custom int order by. Will have errors at run time. Doesn't support for now": {
query: "CustomIntField = 0 order by CustomIntField desc",
validated: "JSON_MATCH(Attr, '\"$.CustomIntField\"=''0''') order by CustomIntField desc",
},
}

for name, test := range tests {
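The review exchange on Case15-3 says date strings are not handled by the validator and are converted to UnixMilli() in the visibility store. That conversion is not part of this diff, so the following is only a rough sketch of what such a step could look like. The function name `dateStringToUnixMilli`, the layout constant, and the choice of UTC are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// assumedLayout matches the date string quoted in the review comment.
// The layout actually accepted by the visibility store is not shown in this PR.
const assumedLayout = "2006-01-02T15:04:05"

// dateStringToUnixMilli is a hypothetical helper: it parses a date string
// in assumedLayout as UTC and returns milliseconds since the Unix epoch.
func dateStringToUnixMilli(s string) (int64, error) {
	t, err := time.ParseInLocation(assumedLayout, s, time.UTC)
	if err != nil {
		return 0, fmt.Errorf("parse %q: %w", s, err)
	}
	return t.UnixMilli(), nil
}

func main() {
	ms, err := dateStringToUnixMilli("2023-11-02T15:36:47")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The numeric value can then be placed into the query text.
	fmt.Printf("StartTime > %d\n", ms)
}
```

`time.Time.UnixMilli` requires Go 1.17 or later.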