Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MatchPhraseQuery As Alternate Syntax for Match_Phrase Function #165

Conversation

GabeFernandez310
Copy link

@GabeFernandez310 GabeFernandez310 commented Nov 14, 2022

Description

Adds matchphrasequery as alternate syntax for match_phrase function which currently exists in opensearch

Queries can be performed using the following syntax :

matchphrasequery(field, 'query')

Example:

SELECT field FROM index WHERE matchphrasequery(field, 'query');

This query should return the same result as match_phrase(field, 'query')

Issues Resolved

AOS-765

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@codecov
Copy link

codecov bot commented Nov 14, 2022

Codecov Report

Merging #165 (fbcee86) into integ-add-legacy-syntax-for-matchphrase-function (e280866) will decrease coverage by 35.52%.
The diff coverage is n/a.

@@                                   Coverage Diff                                   @@
##             integ-add-legacy-syntax-for-matchphrase-function     #165       +/-   ##
=======================================================================================
- Coverage                                               98.28%   62.76%   -35.53%     
=======================================================================================
  Files                                                     345       10      -335     
  Lines                                                    8580      658     -7922     
  Branches                                                  547      119      -428     
=======================================================================================
- Hits                                                     8433      413     -8020     
- Misses                                                    142      192       +50     
- Partials                                                    5       53       +48     
Flag Coverage Δ
query-workbench 62.76% <ø> (?)
sql-engine ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...h/sql/expression/function/BuiltinFunctionName.java
...h/sql/expression/function/OpenSearchFunctions.java
...arch/storage/script/filter/FilterQueryBuilder.java
...l/response/format/SimpleJsonResponseFormatter.java
...pensearch/sql/protocol/response/format/Format.java
...s/request/system/PrometheusListMetricsRequest.java
...ript/filter/lucene/relevance/QueryStringQuery.java
...a/org/opensearch/sql/utils/DateTimeFormatters.java
.../org/opensearch/sql/ppl/utils/ArgumentFactory.java
...a/org/opensearch/sql/data/model/ExprNullValue.java
... and 345 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

docs/user/dql/functions.rst Outdated Show resolved Hide resolved
@@ -426,7 +426,8 @@ systemFunctionName

singleFieldRelevanceFunctionName
: MATCH | MATCH_PHRASE | MATCHPHRASE
| MATCH_BOOL_PREFIX | MATCH_PHRASE_PREFIX
| MATCHPHRASEQUERY | MATCH_BOOL_PREFIX

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add it to the previous line with MATCH_PHRASE | MATCHPHRASE

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d04916b

Copy link

@MaxKsyunz MaxKsyunz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure SQL Java CI workflow passes.
Looks to be failing on checkstyle.

@GabeFernandez310 GabeFernandez310 marked this pull request as ready for review November 15, 2022 15:07
@GabeFernandez310
Copy link
Author

Please make sure SQL Java CI workflow passes. Looks to be failing on checkstyle.

Fixed!

@GabeFernandez310 GabeFernandez310 requested review from a team, forestmvey, GumpacG, MaxKsyunz, MitchellGale, margarit-h and Yury-Fridlyand and removed request for a team November 15, 2022 23:15
JSONObject result1 = executeJdbcRequest(String.format(query1, TEST_INDEX_PHRASE));
JSONObject result2 = executeJdbcRequest(String.format(query2, TEST_INDEX_PHRASE));
JSONObject result3 = executeJdbcRequest(String.format(query3, TEST_INDEX_PHRASE));
assertEquals(result1.toString(), result2.toString());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertEquals(result1, result2) should work.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to just do assertEquals(result1, result2) originally but the test was failing with the following message (I separated it onto different lines so it's more readable)...

expected: 
org.json.JSONObject<{"schema":[{"name":"phrase","type":"text"}],"total":2,"datarows":[["quick fox"],["quick fox here"]],"size":2,"status":200}> 
but was: 
org.json.JSONObject<{"schema":[{"name":"phrase","type":"text"}],"total":2,"datarows":[["quick fox"],["quick fox here"]],"size":2,"status":200}>

I wasn't sure why it was failing, but I assumed it may have something to do with the equality operator and how the objects might be stored in memory, so I added the .toString() as a hacky way to fix it. Is there a different way to compare the JSON objects that might be better?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that it is better to use JSONObject::similar instead of JSONObject::equals. So assert be like

assertTrue(result1.similar(result2));

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been fixed. assertEquals was replaced with assertTrue.

@Test
public void test_SyntaxCheckException_when_invalid_parameter_match_phrase_syntax() {
List<Expression> arguments = List.of(
dsl.namedArgument("field", "test"),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use DSL.namedArgument. Either works now, but dsl.namedArgument is going away immanently upstream.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been fixed

Copy link

@MaxKsyunz MaxKsyunz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a feature, this looks good @GabeFernandez310!

I do have a few stylistic comments:

  1. MatchPhraseQueryTest.this. ... looks odd. Is it necessary?
  2. By convention, when writing an assert the first argument is the expected value. Some of your asserts have that flipped.
  3. assertEquals(result1.toString(), result2.toString()) is unnecessary -- JSONObject overrides equals directly.
  4. Use DSL.namedArgument instead of dsl.namedArgument. I realize the later is all over the codebase, but it is unnecessary -- namedArgument is a static method, and will go away in upstream imminently.

I flagged at least one example of each with a comment, but I see there are others instances.

List<Expression> arguments = List.of();
assertThrows(SyntaxCheckException.class,
() -> matchPhraseQuery.build(new MatchPhraseExpression(
arguments, MatchPhraseQueryTest.this.matchPhraseWithUnderscoreName)));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
arguments, MatchPhraseQueryTest.this.matchPhraseWithUnderscoreName)));
arguments, matchPhraseWithUnderscoreName)));

isn't that sufficient?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been fixed

@@ -448,6 +466,15 @@ private static Stream<String> generateMatchPhraseQueries() {
return generateQueries("match_phrase", matchPhraseArgs);
}

private static Stream<String> generateMatchPhraseQueryQueries() {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need both generateMatchPhraseQueryQueries and `matchPhraseQueryComplexQueries?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking closer I felt that they weren't both needed. I removed generateMatchPhraseQueryQueries

@GabeFernandez310 GabeFernandez310 force-pushed the dev-add-legacy-syntax-for-matchphrase-function branch from 38b7714 to 72580ed Compare November 21, 2022 15:59
@GabeFernandez310 GabeFernandez310 marked this pull request as draft November 21, 2022 16:01
Signed-off-by: GabeFernandez310 <gabrielf@bitquilltech.com>
Signed-off-by: GabeFernandez310 <gabrielf@bitquilltech.com>
@GabeFernandez310 GabeFernandez310 force-pushed the dev-add-legacy-syntax-for-matchphrase-function branch from 72580ed to 9e112a3 Compare November 21, 2022 16:14
Signed-off-by: GabeFernandez310 <gabrielf@bitquilltech.com>
@GabeFernandez310 GabeFernandez310 force-pushed the dev-add-legacy-syntax-for-matchphrase-function branch from 9e112a3 to fbcee86 Compare November 21, 2022 17:00
@GabeFernandez310 GabeFernandez310 marked this pull request as ready for review November 21, 2022 17:15
@GabeFernandez310 GabeFernandez310 merged commit 9c769dd into integ-add-legacy-syntax-for-matchphrase-function Nov 24, 2022
@GabeFernandez310 GabeFernandez310 deleted the dev-add-legacy-syntax-for-matchphrase-function branch December 1, 2022 05:36
andy-k-improving pushed a commit that referenced this pull request Nov 16, 2024
* Import h3 library (#154)

Made following changes to make it compatible:
1. Rename package from elasticsearch to opensearch.geospatial
2. Update License headers
3. Update build file
4. Update settings to include sub projects

* Use Transport Request (#164)

Remove usage of deprecated BaseNodeRequest

* Update http client package to resolve build failure (#168) (#171)

* Introduce H3 min resolution constant (#165)

H3 version 1 has 16 resolutions, numbered 0 through 15.
Introduced a constant to represent min value, similar
to max value.

* Add geohex aggregation (#160)

This aggregation will use uber's h3 to group coordinates into H3 cell.
Created new aggregation type geohex_grid. The precision will be between
0 and 15. This aggreation has default precision as 5,
similar to geohash and geotile.

Signed-off-by: Vijayan Balasubramanian <balasvij@amazon.com>

* Add integration test (#176)

Included integration test for geohex_grid.

Signed-off-by: Vijayan Balasubramanian <balasvij@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants