Support user-defined and incomplete date formats (#273) #1821

Merged

Conversation

Yury-Fridlyand
Collaborator

@Yury-Fridlyand Yury-Fridlyand commented Jul 8, 2023

Description

  • Added support for user-defined (custom) date formats (backed by Java syntax).
  • Updated support for predefined formats (doc, list, another doc).
  • Combinations of custom and named formats are also supported.
  • Incomplete date formats are also supported (e.g. year, week_year), with limitations; see the Limitations section below.

Please see the team review and discussion in Bit-Quill#273.

Fixes #126, related issue: #794

Notes

Limitations

Bit-Quill#273 (comment)
The date or time part of a value can be lost when a custom format does not describe that part completely.
For example:
"1999-10-20 15" with format "yyyy-MM-dd mm" is parsed as 1999-10-20 00:00:00
"1999 10:20" with format "yyyy HH:mm" is parsed as 1970-01-01 10:20:00
This behavior comes from the OpenSearch core libraries; it is tracked in opensearch-project/OpenSearch#8689.
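One way to spot formats that may hit this limitation is to check which fields a pattern names before creating the mapping. The following is a minimal Python sketch, not part of the plugin: `missing_fields` is a hypothetical helper, and the letter table is a simplified assumption covering only a few common Java pattern letters. It flags candidates only; a flagged format does not necessarily lose data.

```python
import re

# Assumption: a simplified map from Java pattern letters to the field they
# supply. Real pattern syntax has many more letters than are listed here.
DATE_FIELDS = {"year": "yu", "month": "ML", "day": "d"}
TIME_FIELDS = {"hour": "Hk", "minute": "m"}


def missing_fields(pattern):
    """Return the fields left out of a partially specified date or time part."""
    # Drop quoted literals so their characters are not read as pattern letters.
    letters = set(re.sub(r"'[^']*'", "", pattern))
    missing = []
    for group in (DATE_FIELDS, TIME_FIELDS):
        present = [name for name, chars in group.items() if letters & set(chars)]
        # A part that is absent entirely is fine; a part that is only partly
        # present is the case described in the limitation.
        if present and len(present) < len(group):
            missing += [name for name in group if name not in present]
    return missing


print(missing_fields("yyyy-MM-dd mm"))  # ['hour']
print(missing_fields("yyyy HH:mm"))     # ['month', 'day']
print(missing_fields("yyyy-MM-dd"))     # []
```

The two formats from the examples above are both flagged: the first has a time part without an hour, the second a date part without month and day.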

Sample for test

Mapping:

{
  "mappings" : {
    "properties" : {
      "custom_date" : {
        "type" : "date",
        "format" : "yyyy-MM-dd"
      }
    }
  }
}

Data:

{"index": {}}
{"custom_date": "1984-04-12"}

Sample 2

Mapping

{
    "mappings":
    {
        "properties":
        {
            "custom_time" :
            {
                "type" : "date",
                "format" : "::: k-A || A    "
            },
            "incomplete_1" :
            {
                "type" : "date",
                "format" : "year"
            },
            "incomplete_2" :
            {
                "type" : "date",
                "format" : "E-w"
            },
            "incomplete_custom_date" :
            {
                "type" : "date",
                "format" : "uuuu"
            },
            "incomplete_custom_time" :
            {
                "type" : "date",
                "format" : "HH"
            },
            "incorrect" :
            {
                "type" : "date",
                "format" : "'___'"
            },
            "epoch_sec" :
            {
                "type" : "date",
                "format" : "epoch_second"
            },
            "epoch_milli" :
            {
                "type" : "date",
                "format" : "epoch_millis"
            },
            "custom_no_delimiter_date" :
            {
                "type" : "date",
                "format" : "uuuuMMdd"
            },
            "custom_no_delimiter_time" :
            {
                "type" : "date",
                "format" : "HHmmss"
            },
            "custom_no_delimiter_ts" :
            {
                "type" : "date",
                "format" : "uuuuMMddHHmmss"
            }
        }
    }
}

Data

{"index": {}}
{ "custom_time":  "85476321", "incomplete_1" : 1984, "incomplete_2": null, "incomplete_custom_date": 1999, "incomplete_custom_time" : 10, "incorrect" : null, "epoch_sec" : 42, "epoch_milli" : 42, "custom_no_delimiter_date" : "19841020", "custom_no_delimiter_time" : "102030", "custom_no_delimiter_ts" : "19841020153548" }
{"index": {}}
{ "custom_time":  "::: 9-32476542", "incomplete_1" : 2022, "incomplete_2": null, "incomplete_custom_date": 3021, "incomplete_custom_time" : 20, "incorrect" : null, "epoch_sec" : 100500, "epoch_milli" : 100500, "custom_no_delimiter_date" : "19610412", "custom_no_delimiter_time" : "090700", "custom_no_delimiter_ts" : "19610412090700" }
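The numeric values in this data can be checked by hand against the Result set. A small Python sketch of the arithmetic follows; the helper names are made up for illustration. It relies on the Java pattern letter "A" meaning milli-of-day, and on epoch_second and epoch_millis counting from 1970-01-01 00:00:00 UTC.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def from_epoch_seconds(value):
    """Interpret value as seconds since the epoch."""
    return EPOCH + timedelta(seconds=value)


def from_epoch_millis(value):
    """Interpret value as milliseconds since the epoch."""
    return EPOCH + timedelta(milliseconds=value)


def from_milli_of_day(value):
    """Interpret value as milliseconds since midnight (Java pattern letter A)."""
    return (EPOCH + timedelta(milliseconds=value)).time()


print(from_epoch_seconds(100500))   # 1970-01-02 03:55:00
print(from_epoch_millis(100500))    # 1970-01-01 00:01:40.500000
print(from_milli_of_day(85476321))  # 23:44:36.321000
print(from_milli_of_day(32476542))  # 09:01:16.542000
```

The same input, 100500, gives very different results depending on whether the field is mapped as epoch_second or epoch_millis.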

Query

SELECT * FROM test

Result set

+--------------------------+-----------+-------------------------+---------------------+------------------------+--------------+------------------------+--------------------------+--------------+------------------------+---------------------+ 
| custom_no_delimiter_date | incorrect | epoch_milli             | epoch_sec           | incomplete_custom_time | custom_time  | custom_no_delimiter_ts | custom_no_delimiter_time | incomplete_2 | incomplete_custom_date | incomplete_1        |
+--------------------------+-----------+-------------------------+---------------------+------------------------+--------------+------------------------+--------------------------+--------------+------------------------+---------------------+
| date                     | timestamp | timestamp               | timestamp           | time                   | time         | timestamp              | time                     | timestamp    | date                   | timestamp           |
+--------------------------+-----------+-------------------------+---------------------+------------------------+--------------+------------------------+--------------------------+--------------+------------------------+---------------------+
| 1984-10-20               | null      | 1970-01-01 00:00:00.042 | 1970-01-01 00:00:42 | 10:00:00               | 23:44:36.321 | 1984-10-20 15:35:48    | 10:20:30                 | null         | 1999-01-01             | 1984-01-01 00:00:00 |
| 1961-04-12               | null      | 1970-01-01 00:01:40.5   | 1970-01-02 03:55:00 | 20:00:00               | 09:01:16.542 | 1961-04-12 09:07:00    | 09:07:00                 | null         | 3021-01-01             | 2022-01-01 00:00:00 |
+--------------------------+-----------+-------------------------+---------------------+------------------------+--------------+------------------------+--------------------------+--------------+------------------------+---------------------+
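The type row in the table above suggests a simple rule: a format with only date letters maps to date, one with only time letters maps to time, and anything else maps to timestamp. Below is a rough Python sketch of such a rule. It is not the plugin's Java implementation: `infer_type` is a hypothetical name, and the letter sets and named-format list are assumptions chosen so that the sketch reproduces the types shown in this sample.

```python
import re

# Assumption: reduced letter sets; the plugin's real tables are likely larger.
DATE_LETTERS = set("yuMd")
TIME_LETTERS = set("HhkKmsSA")
# Named formats used in this sample; all of them appear as timestamp above.
NAMED_TIMESTAMP = {"year", "epoch_second", "epoch_millis"}


def infer_type(fmt):
    """Guess date, time or timestamp for a mapping format string."""
    kinds = set()
    for part in fmt.split("||"):  # "||" separates alternative formats
        part = part.strip()
        if part in NAMED_TIMESTAMP:
            kinds.add("timestamp")
            continue
        # Ignore quoted literals, then look at the remaining characters.
        letters = set(re.sub(r"'[^']*'", "", part))
        has_date = bool(letters & DATE_LETTERS)
        has_time = bool(letters & TIME_LETTERS)
        if has_date and not has_time:
            kinds.add("date")
        elif has_time and not has_date:
            kinds.add("time")
        else:
            # Both kinds of letters, or none recognized: fall back to timestamp.
            kinds.add("timestamp")
    return kinds.pop() if len(kinds) == 1 else "timestamp"


for fmt in ["uuuuMMdd", "HHmmss", "uuuuMMddHHmmss", "'___'", "::: k-A || A    "]:
    print(repr(fmt), "->", infer_type(fmt))
```

Falling back to timestamp when nothing is recognized matches the incorrect and incomplete_2 columns, which are both reported as timestamp.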

Sample 3

Use the calcs index from the integration tests (IT).

Query

select TYPEOF(date0), TYPEOF(date1), TYPEOF(date2), TYPEOF(date3), TYPEOF(time0), TYPEOF(time1), TYPEOF(datetime0), TYPEOF(datetime1) from calcs limit 1;

Result set

fetched rows / total rows = 1/1
-[ RECORD 1 ]-------------------------
TYPEOF(date0)     | DATE
TYPEOF(date1)     | DATE
TYPEOF(date2)     | DATE
TYPEOF(date3)     | DATE
TYPEOF(time0)     | TIMESTAMP
TYPEOF(time1)     | TIME
TYPEOF(datetime0) | TIMESTAMP
TYPEOF(datetime1) | TIMESTAMP

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

* Check custom formats for characters

Signed-off-by: Guian Gumpac <guian.gumpac@improving.com>

* Removed duplicated code

Signed-off-by: Guian Gumpac <guian.gumpac@improving.com>

* Reworked checking for exprcoretype

Signed-off-by: Guian Gumpac <guian.gumpac@improving.com>

* Changed check for time

Signed-off-by: Guian Gumpac <guian.gumpac@improving.com>

* Rework processing custom and incomplete formats and add tests.

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>

* Values of incomplete and incorrect formats to be returned as `TIMESTAMP` instead of `STRING`.

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>

* Complete fix and update tests.

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>

* More fixes for god of fixes.

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>

---------

Signed-off-by: Guian Gumpac <guian.gumpac@improving.com>
Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>
Co-authored-by: Yury-Fridlyand <yury.fridlyand@improving.com>

codecov bot commented Jul 8, 2023

Codecov Report

Merging #1821 (56e5621) into main (a816a58) will increase coverage by 0.01%.
The diff coverage is 100.00%.

❗ Current head 56e5621 differs from pull request most recent head be8c34f. Consider uploading reports for the commit be8c34f to get more accurate results

@@             Coverage Diff              @@
##               main    #1821      +/-   ##
============================================
+ Coverage     97.33%   97.35%   +0.01%     
- Complexity     4490     4524      +34     
============================================
  Files           394      394              
  Lines         11118    11180      +62     
  Branches        795      812      +17     
============================================
+ Hits          10822    10884      +62     
  Misses          289      289              
  Partials          7        7              
| Flag       | Coverage Δ                    |
| sql-engine | 97.35% <100.00%> (+0.01%) ⬆️  |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Impacted Files                                        | Coverage Δ            |
| ...h/sql/opensearch/data/type/OpenSearchDateType.java | 100.00% <100.00%> (ø) |
| ...nsearch/data/value/OpenSearchExprValueFactory.java | 100.00% <100.00%> (ø) |

forestmvey
forestmvey previously approved these changes Jul 10, 2023
penghuo
penghuo previously approved these changes Jul 10, 2023
Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>
@Yury-Fridlyand Yury-Fridlyand dismissed stale reviews from penghuo, GumpacG, and forestmvey via be8c34f July 10, 2023 22:22
@MaxKsyunz MaxKsyunz merged commit a60b222 into opensearch-project:main Jul 11, 2023
@MaxKsyunz MaxKsyunz deleted the integ-custom-datetime-formats branch July 11, 2023 06:14
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jul 11, 2023
* Support user-defined and incomplete date formats (#273)

* Check custom formats for characters

Signed-off-by: Guian Gumpac <guian.gumpac@improving.com>

* Removed duplicated code

Signed-off-by: Guian Gumpac <guian.gumpac@improving.com>

* Reworked checking for exprcoretype

Signed-off-by: Guian Gumpac <guian.gumpac@improving.com>

* Changed check for time

Signed-off-by: Guian Gumpac <guian.gumpac@improving.com>

* Rework processing custom and incomplete formats and add tests.

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>

* Values of incomplete and incorrect formats to be returned as `TIMESTAMP` instead of `STRING`.

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>

* Complete fix and update tests.

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>

* More fixes for god of fixes.

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>

---------

Signed-off-by: Guian Gumpac <guian.gumpac@improving.com>
Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>
Co-authored-by: Yury-Fridlyand <yury.fridlyand@improving.com>

* Refactoring.

Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>

---------

Signed-off-by: Guian Gumpac <guian.gumpac@improving.com>
Signed-off-by: Yury-Fridlyand <yury.fridlyand@improving.com>
Co-authored-by: Guian Gumpac <guian.gumpac@improving.com>
(cherry picked from commit a60b222)
Yury-Fridlyand added a commit that referenced this pull request Jul 11, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jul 11, 2023
forestmvey pushed a commit that referenced this pull request Jul 11, 2023
MitchellGale pushed a commit to Bit-Quill/opensearch-project-sql that referenced this pull request Jul 11, 2023
Signed-off-by: Mitchell Gale <Mitchell.Gale@improving.com>
Development

Successfully merging this pull request may close these issues.

[BUG] Allowing date in format yyyy-MM-dd HH:mm:ss.SSSSSS
5 participants