Add option to swap arguments in lucene queries #2002

Draft
wants to merge 1 commit into
base: main

Yury-Fridlyand
Collaborator

Description

Lucene queries accept a field reference (column/field name) only as the first argument (on the left side). If the reference is on the right, the query either fails or is executed with significant overhead.

Example 1

opensearchsql> select * from bank where address = '282 Kings Place';
Output longer than terminal width
Do you want to display data vertically for better visual effect? [y/N]:
fetched rows / total rows = 0/0
+------------------+-------------+-----------+-------------+----------+--------+------------+-----------+------------+---------+----->
| account_number   | firstname   | address   | birthdate   | gender   | city   | lastname   | balance   | employer   | state   | age >
|------------------+-------------+-----------+-------------+----------+--------+------------+-----------+------------+---------+----->
+------------------+-------------+-----------+-------------+----------+--------+------------+-----------+------------+---------+----->

(query succeeded)

opensearchsql> select * from bank where '282 Kings Place' = address;
TransportError(503, 'SearchPhaseExecutionException', {'error': {'reason': 'Error occurred in OpenSearch engine: all shards failed', 'details': 'Shard[0]: java.lang.IllegalArgumentException: Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [address] in order to load field data by uninverting the inverted index. Note that this can use significant memory.\n\nFor more details, please send request for Json format to see the raw response from OpenSearch engine.', 'type': 'SearchPhaseExecutionException'}, 'status': 400})

(query failed)

Example 2

Query 1

select account_number from bank where age = 30;

Explain

{
  "from": 0,
  "size": 200,
  "timeout": "1m",
  "query": {
    "term": {
      "age": {
        "value": 30,
        "boost": 1
      }
    }
  },
  "_source": {
    "includes": [
      "account_number"
    ],
    "excludes": []
  },
  "sort": [
    {
      "_doc": {
        "order": "asc"
      }
    }
  ]
}

Query 2

select account_number from bank where 30 = age;

Explain

{
  "from": 0,
  "size": 200,
  "timeout": "1m",
  "query": {
    "script": {
      "script": {
        "source": " ... huge serialized script contains encoded function '=(30, age)'",
        "lang": "opensearch_query_expression"
      },
      "boost": 1
    }
  },
  "_source": {
    "includes": [
      "account_number"
    ],
    "excludes": []
  },
  "sort": [
    {
      "_doc": {
        "order": "asc"
      }
    }
  ]
}

With the proposed changes, the explain output for both queries in the second example will be the same, and the query in the first example will work properly.
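The idea can be illustrated with a minimal, self-contained sketch. This is not the code in this PR: `ArgSwapSketch`, `Comparison`, `normalize` and `MIRROR` are hypothetical names made up for illustration, and operands are simplified to strings with a flag marking field references. It assumes the rewrite happens before the Lucene query is built. Operators other than `=` are also mirrored here (for example `30 < age` becomes `age > 30`), which goes beyond the examples above and is an assumption about how a complete version would have to behave.

```java
import java.util.Map;

// Hypothetical sketch; none of these types exist in the plugin.
final class ArgSwapSketch {

    // Each operator mapped to the operator that preserves meaning
    // once the two operands are exchanged.
    private static final Map<String, String> MIRROR = Map.of(
        "=", "=",
        "!=", "!=",
        "<", ">",
        "<=", ">=",
        ">", "<",
        ">=", "<=");

    /** A two-argument comparison; each operand is a field reference or a literal. */
    static final class Comparison {
        final String op;
        final String left;
        final boolean leftIsField;
        final String right;
        final boolean rightIsField;

        Comparison(String op, String left, boolean leftIsField,
                   String right, boolean rightIsField) {
            this.op = op;
            this.left = left;
            this.leftIsField = leftIsField;
            this.right = right;
            this.rightIsField = rightIsField;
        }

        @Override
        public String toString() {
            return left + " " + op + " " + right;
        }
    }

    /**
     * Moves the field reference to the left when the comparison has the
     * form "literal op field". Every other shape is returned unchanged.
     */
    static Comparison normalize(Comparison c) {
        if (!c.leftIsField && c.rightIsField) {
            String mirrored = MIRROR.get(c.op);
            if (mirrored == null) {
                return c; // unknown operator: leave it untouched
            }
            return new Comparison(mirrored, c.right, true, c.left, false);
        }
        return c;
    }

    public static void main(String[] args) {
        // 30 = age  ->  age = 30
        System.out.println(normalize(new Comparison("=", "30", false, "age", true)));
        // 30 < age  ->  age > 30
        System.out.println(normalize(new Comparison("<", "30", false, "age", true)));
    }
}
```

After such a normalization, `30 = age` and `age = 30` would reach the query builder in the same shape, which is why both could produce the same `term` query instead of one falling back to a script.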

Notes

This is a PoC. WIP.
With a properly wrapped lucene query we can avoid storing (and using) castMap in LuceneQuery, which completely duplicates TypeCastOperator.

This should be done before or together with moving OpenSearch functions out of :core. Ref: UDF (User Defined Functions), #811.

Issues Resolved

[List any issues this PR will resolve]

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>