Releases: opensearch-project/opensearch-spark
Version 0.6.0
What's Changed
- Translate PPL dedup Command Part 1: allowedDuplication=1 by @LantaoJin in #521
- Bump Flint version to 0.6.0 by @penghuo in #542
- Some fixes for the README by @salyh in #546
- Add statement timeout by @noCharger in #539
- Remove query rewrite for LogsTable skipping index by @seankao-az in #552
- Fix refresh policy back to WAIT_FOR other than writing query result by @ykmr1224 in #554
- Lateral eval expressions are supported after Spark upgrading by @LantaoJin in #544
- Add rate limiter for bulk request by @ykmr1224 in #567
- Abstract service for accessing Flint index metadata by @seankao-az in #495
- Update PPL describe command by @YANG-DB in #573
- Implement stddev_samp and stddev_pop ppl stats function by @salyh in #549
- Top & Rare PPL commands support by @YANG-DB in #568
- Nested fields query fix by @YANG-DB in #564
- Add percentile PPL function by @salyh in #547
- Add conf for specifying flint checkpoint location by @noCharger in #577
- Fix distinct_count ppl stats function by @salyh in #548
- Retry bulk request to OpenSearch by @ykmr1224 in #572
- PPL Parse command by @YANG-DB in #595
- [Refactor] Extend REPL to support external metadata storage and data storage by @noCharger in #381
- Add support of GroK command including default patterns by @YANG-DB in #598
- Add UT and IT for 2+ level aggregations PPL command by @LantaoJin in #603
- Translate PPL dedup Command Part 2: allowedDuplication>1 by @LantaoJin in #543
- Remove checkpoint folder when vacuuming index by @dai-chen in #621
- Update checkpoint location on alter path by @noCharger in #616
- Refactor FlintJob with FlintStatement and StatementExecutionManager by @noCharger in #635
- Ppl patterns command by @YANG-DB in #627
- Disable timeout params for deleteIndex API only for serverless by @ykmr1224 in #646
- ispresent implemented as function by @lukasz-soszynski-eliatra in #651
- Add coalesce PPL command by @salyh in #609
- Fix ppl describe bug #612 by @YANG-DB in #656
- Add langType to FlintStatement model by @noCharger in #664
- Translate PPL join Command by @LantaoJin in #630
- Add support for serializing TimestampNTZType by @engechas in #673
- Isempty by @lukasz-soszynski-eliatra in #676
- Optional support for getAllIndexMetadata by index pattern by @seankao-az in #682
- Introduce async query scheduler by @noCharger in #668
- Add resultIndex to session manager extension by @noCharger in #689
- Support Fields Minus Command by @LantaoJin in #698
- Translate PPL LOOKUP Command by @LantaoJin in #686
- Import Lombok and override the equals and hashCode methods for expression and plan nodes by @LantaoJin in #703
- Implementation of case function. by @lukasz-soszynski-eliatra in #695
- Add explain command by @kt-eliatra in #687
- [REFACTOR] Move DF reformat from StatementExecutionManagerImpl to QueryResultWriterImpl by @noCharger in #701
- Re-select an excluded field should throw SyntaxCheckException by @LantaoJin in #707
- Support InSubquery in PPL by @LantaoJin in #714
- ANTLR syntax extensions related to signum function. by @lukasz-soszynski-eliatra in #652
- Add rename PPL function by @kt-eliatra in #618
- update documentation with examples comment specifications markdown pages by @YANG-DB in #743
- update docs & links & add planned commands docs with process suggestion by @YANG-DB in #745
- Implement isBlank by @salyh in #749
- Fillnull command introduced by @lukasz-soszynski-eliatra in #723
- update MAINTAINERS.md by @YANG-DB in #756
- update code-owners by @YANG-DB in #761
- Support ScalarSubquery PPL by @LantaoJin in #752
- AND should have higher precedence than OR in predicate expression by @LantaoJin in #771
- Support table identifier contains dot with backticks by @LantaoJin in #768
- Support ExistsSubquery in PPL by @LantaoJin in #769
- Support recovery for index with external scheduler by @noCharger in #717
- Add auth-crt lib as runtime dependency by @noCharger in #778
- Lazy clean up dangling index metadata log entry by @dai-chen in #558
- [Infrastructure] Move style check ahead of Integ test by @LantaoJin in #783
- Support RelationSubquery PPL by @LantaoJin in #775
- Refactor datetime functions docs and IT by @LantaoJin in #787
- Add check list in pull request template by @LantaoJin in #794
- Support line comment and block comment in PPL by @LantaoJin in #792
- Implement Cryptographic hash functions by @Gokul-Radhakrishnan in #788
- Add query execution metrics by @ykmr1224 in #799
- Add Tomo as maintainer by @noCharger in #807
- Resolution for previous concurrency issue by @noCharger in #802
- Support alter refresh interval on external scheduler by @noCharger in #801
- Support PPL JSON functions: construction and extraction by @LantaoJin in #780
- PPL fieldsummary command by @YANG-DB in #766
- Add read/write bytes metrics by @ykmr1224 in #803
- Support Eventstats in PPL by @LantaoJin in #800
- Add checkpoint.delete.processingTime metric by @ykmr1224 in #817
- Add opensearch related metrics by @ykmr1224 in #818
- Fix copyright header missing by @LantaoJin in #813
- All-fields as an argument of aggregator such as count() can be resolved after otherfield by @LantaoJin in #814
- Support IN expression in PPL by @LantaoJin in #823
- Fi...
Version 0.5.0
What's Changed
- Fix incorrect result in show index statement by @dai-chen in #332
- Bump Flint version to 0.5.0 by @dai-chen in #343
- Enhance index monitor to terminate streaming job on consecutive errors by @dai-chen in #346
- Updating security reachout email by @varun-lodaya in #340
- Transition Flint index state to Failed upon refresh job termination by @dai-chen in #362
- Pre-validate duplicate columns in materialized view query by @dai-chen in #359
- Fix index state stuck in refreshing when streaming job exits early by @dai-chen in #370
- Read dataSourceName from FlintOptions and avoid passing as args by @seankao-az in #378
- Refactor static method for OpenSearch client utils by @seankao-az in #377
- [Refactor] Introduce flint-commons for models and interfaces by @noCharger in #373
- Extract metadata log operations from FlintClient into FlintMetadataLogService by @seankao-az in #379
- Support nested indexed field in Flint skipping index SQL statement by @dai-chen in #366
- Support custom metadata log service implementation by @seankao-az in #389
- Abstracting source relations for enhanced covering index rewriting by @dai-chen in #391
- Add OpenSearchCatalog to enable direct access OpenSearch index in Spark SQL by @penghuo in #399
- Enhance Flint Spark API error reporting with centralized handler by @dai-chen in #348
- Unquote text and identifiers in PPL parsing by @seankao-az in #393
- handle MetaException with glue AccessDeniedException by @noCharger in #410
- support shard level split on read path by @penghuo in #402
- Pre-validate checkpoint location write permission by @dai-chen in #414
- Separate metadata log entry data model and persistence by @seankao-az in #406
- Add scheduler_mode index option by @noCharger in #415
- Enhance query rewriter rule to support partial covering index by @dai-chen in #409
- Store error message for streaming job execution in Flint metadata log by @dai-chen in #433
- Add create Pit api and fix sigv4 bug by @penghuo in #434
- Support custom extension conf by @noCharger in #438
- Revert OpenSearch Version to 2.6 by @penghuo in #444
- Remove unimplemented syntax by @ykmr1224 in #439
- Update README to reflect available commands by @ykmr1224 in #447
- Add FlintJob integration test with EMR serverless by @penghuo in #449
- [Bugfix] Insights on query execution error by @noCharger in #475
- Disable unsupported PPL function expressions by @ykmr1224 in #478
- Add error output column to show Flint index statement by @dai-chen in #436
- [Doc] Checklist to fix issue "could not find Docker environment" on macOS by @LantaoJin in #477
- Translate PPL-builtin functions to Spark-builtin functions by @LantaoJin in #448
- Translate Eval Command by @LantaoJin in #499
- Fix SigV4 signature when connecting to OpenSearchServerless by @ykmr1224 in #473
- Support more PPL builtin functions by adding a name mapping by @LantaoJin in #504
- Add OpenSearchTable in flint core by @penghuo in #479
- Reorganize IT directory to prevent unintentional execution from UT by @dai-chen in #501
- Add config on query loop execution frequency by @noCharger in #411
- Use refresh policy from config by @ykmr1224 in #530
- Add PPL describe command by @YANG-DB in #541
- Terminate streaming job when index data is deleted by @dai-chen in #500
- Upgrade Spark 3.5.1 by @penghuo in #525
- [Backport 0.5] Add statement timeout by @opensearch-trigger-bot in #550
- [Backport 0.5] Remove query rewrite for LogsTable skipping index by @opensearch-trigger-bot in #553
- [Backport 0.5] Fix refresh policy back to WAIT_FOR other than writing query result by @opensearch-trigger-bot in #557
- [Backport 0.5] Lateral eval expressions are supported after Spark upgrading by @opensearch-trigger-bot in #561
- [Backport 0.5] Translate PPL dedup Command Part 1: allowedDuplication=1 by @opensearch-trigger-bot in #566
- [Backport 0.5] Add rate limiter for bulk request by @opensearch-trigger-bot in #571
- [Backport 0.5] Abstract service for accessing Flint index metadata by @opensearch-trigger-bot in #575
- [Backport 0.5] Update PPL describe command by @opensearch-trigger-bot in #578
- [Backport 0.5] Implement stddev_samp and stddev_pop ppl stats function by @opensearch-trigger-bot in #581
- [Backport 0.5] Top & Rare PPL commands support by @opensearch-trigger-bot in #583
- [Backport 0.5] Nested fields query fix by @opensearch-trigger-bot in #585
- [Backport 0.5] Add percentile PPL function by @opensearch-trigger-bot in #587
- [Backport 0.5] Add conf for specifying flint checkpoint location by @opensearch-trigger-bot in #589
- [Backport 0.5] Fix distinct_count ppl stats function by @opensearch-trigger-bot in #590
- [Backport 0.5] PPL Parse command by @opensearch-trigger-bot in #597
- [Backport 0.5] Add support of GroK command including default patterns by @opensearch-trigger-bot in #610
- [Backport 0.5] Translate PPL dedup Command Part 2: allowedDuplication>1 by @opensearch-trigger-bot in #615
- [Backport 0.5] Add UT and IT for 2+ level aggregations PPL command by @opensearch-trigger-bot in #613
- [Backport 0.5] Remove checkpoint folder when vacuuming index by @opensearch-trigger-bot in #629
- [Backport 0.5] [Refactor] Extend REPL to support external metadata storage and data storage by @opensearch-trigger-bot in #604
- [Backport 0.5] Update checkpoint location on alter path by @opensearch-trigger-bot in #631
- [Backport 0.5] Refactor FlintJob with FlintStatement and StatementExecutionManager by @opensearch-trigger-bot in #636
- [Backport 0.5] Ppl patterns command by @opensearch-trigger-bot in #639
- [Backport 0.5] Disable timeout params for deleteIndex API only for serverless by @opensearch-trigger-bot in #649
- [Backport 0.5] Add langType to FlintStatement model by @opensearch-trigger-bot in #665
- [Back...
Version 0.4.1
What's Changed
- Bump Flint version to 0.4.1 by @dai-chen in #360
- [Backport 0.4] Transition Flint index state to Failed upon refresh job termination by @opensearch-trigger-bot in #364
- [Backport 0.4] Fix index state stuck in refreshing when streaming job exits early by @opensearch-trigger-bot in #374
- [Backport 0.4] Support nested indexed field in Flint skipping index SQL statement by @opensearch-trigger-bot in #388
- [Backport 0.4] Enhance Flint Spark API error reporting with centralized handler by @dai-chen in #401
- [Backport 0.4] handle MetaException with glue AccessDeniedException by @opensearch-trigger-bot in #412
- [Backport 0.4] [Bugfix] Insights on query execution error by @opensearch-trigger-bot in #486
Full Changelog: v0.4.0...v0.4.1
Version 0.4.0
What's Changed
- Improve pre-validation for Flint index refresh options by @dai-chen in #297
- Remove query log from job executor by @seankao-az in #308
- Adding support to run integ tests on iceberg tables by @asuresh8 in #301
- Bump Flint version to 0.4.0 by @seankao-az in #311
- Allow non-existent checkpoint location path in index validation by @dai-chen in #313
- Clean shuffle data by @penghuo in #312
- Introduce aws sigv4a request signer by @noCharger in #303
- Add covering index based query rewriter rule by @dai-chen in #318
- Add maxExecutors configuration for streaming queries by @penghuo in #326
- add batch_bytes configuration for Flint by @penghuo in #329
- Improve flint error handling by @noCharger in #335
- Apply new logging format to record exceptions by @noCharger in #314
- [Backport 0.4] Enhance index monitor to terminate streaming job on consecutive errors by @opensearch-trigger-bot in #347
Full Changelog: v0.3.0...v0.4.0
Version 0.3.0
What's Changed
- Bump Flint version to 0.3.0 by @penghuo in #258
- Add sql grammar support for show flint index statement by @seankao-az in #266
- Implement BloomFilter query rewrite (without pushdown optimization) by @dai-chen in #248
- Refactor flint log format by @noCharger in #263
- Implement BloomFilter query pushdown optimization by @dai-chen in #271
- Add grammar files for alter index by @seankao-az in #279
- Implement adaptive BloomFilter algorithm by @dai-chen in #251
- Move query from entry point to SparkConf by @noCharger in #274
- Fix spark extension path in README. by @asuresh8 in #282
- Implement show flint index statement by @seankao-az in #276
- Add BloomFilter skipping index SQL support by @dai-chen in #283
- Implement analyze skipping index statement by @rupal-bq in #284
- Reduce default inactivity limit to 3min by @penghuo in #287
- Fix shutdown bug due to non-daemon thread in driver by @kaituo in #292
- Rule out logical deleted skipping index in query rewrite by @dai-chen in #289
- Ignore non-Flint index in show and describe index statement by @dai-chen in #296
- Implement Alter Index SQL statement by @seankao-az in #286
- Welcome new maintainer Louis Chu by @penghuo in #299
- Add AWS credentials provider for metadata access by @noCharger in #285
- Unescape query from EMR spark submit parameter by @seankao-az in #306
- [Backport 0.3] Remove query log from job executor by @opensearch-trigger-bot in #310
- [Backport 0.3] Clean shuffle data by @opensearch-trigger-bot in #322
- [Backport 0.3] Introduce aws sigv4a request signer by @opensearch-trigger-bot in #323
- [Backport 0.3] Add maxExecutors configuration for streaming queries by @opensearch-trigger-bot in #328
- [Manual Backport 0.3] Improve flint error handling (#335) by @noCharger in #338
Full Changelog: v0.2.0...v0.3.0
Version 0.2.0
What's Changed
- Ppl spark join command by @YANG-DB in #69
- Fixed the GitHub id column for Yang-DB in Maintainers by @dtaivpp in #205
- Bump Flint version to 0.2.0 by @dai-chen in #183
- Add vacuum index API and SQL support by @dai-chen in #189
- Restrict the maximum size of value set by default limit by @dai-chen in #208
- bug fix, support array datatype in MV by @penghuo in #211
- Percent-encode invalid flint index characters by @seankao-az in #215
- GHA fix for backport and snapshot-publish by @seankao-az in #222
- Quote table name with backticks to escape special characters in spark.read.table() by @seankao-az in #224
- Configure value set max size in SQL statement by @dai-chen in #210
- Change delete index API to logical delete by @dai-chen in #191
- Trigger backport workflow when a pull request merges by @noCharger in #230
- Changes for adding default dimensions in CWSink. by @vamsi-amazon in #209
- Refactor Flint index refresh mode by @dai-chen in #228
- Fix Lychee Link Checker Error by @noCharger in #236
- Add OpenSearch metrics by @noCharger in #229
- Fix recover index bug when Flint data index is deleted accidentally by @dai-chen in #241
- Support struct field as indexed column by @dai-chen in #213
- Implement BloomFilter skipping index building logic by @dai-chen in #242
- Support dimension sets in config by @noCharger in #238
- Support on-demand incremental refresh by @dai-chen in #234
- Fix Session state bug and improve Query Efficiency in REPL by @kaituo in #245
- Add interactive job metrics by @noCharger in #240
- Add more flint metrics by @noCharger in #255
Version 0.1.0
Preview release