Implement incremental refresh for single-table, predicate-only MVs #20959
Conversation
Force-pushed from 8ff865b to 53e7cc1
@sopel39 @alexjo2144 please take a look
@findepi @sopel39 @alexjo2144 Hi team, could you please take a quick look at this PR? Thank you very much!
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergMaterializedViewTest.java
@martint as discussed in our Zoom call, I've refactored this PR so that the analyzer passes the information to the connector to make the incremental/full refresh decision, as opposed to using table properties. Please take a look and let me know what you think. Thank you!
core/trino-main/src/main/java/io/trino/sql/analyzer/Analysis.java
core/trino-main/src/main/java/io/trino/sql/analyzer/StatementAnalyzer.java
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergUtil.java
core/trino-main/src/main/java/io/trino/sql/planner/LogicalPlanner.java
lgtm % comments
core/trino-main/src/main/java/io/trino/sql/planner/IncrementalRefreshVisitor.java
core/trino-main/src/main/java/io/trino/sql/planner/LogicalPlanner.java
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSessionProperties.java
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergMaterializedView.java
@@ -86,6 +86,189 @@ protected String getStorageMetadataLocation(String materializedViewName)
        return table.getParameters().get(METADATA_LOCATION_PROP);
    }
Please also add a test with update/delete. Part of the scenario would be that incremental refresh should work again after compaction.
Added a new test case which verifies that we fall back to FULL refresh when delete files are present. As for compaction re-enabling incremental refresh, please see my comment below.
> As for compaction re-enabling incremental refresh, please see my comment below

Would a test like that work:
- initial (full) refresh with some data
- add one row to source table
- incremental refresh
- compaction
- full refresh
- add one row to source table
- we are back to incremental refresh
This is good, I would just modify step 4:
- initial (full) refresh with some data
- add one row to source table
- incremental refresh
- update or delete one row
- full refresh
- add one row to source table
- we are back to incremental refresh
In the original test case you proposed, step 5 shouldn't do any refresh (essentially a no-op), because step 4 (compaction) didn't add any new rows; it just bin-packed the existing rows into fewer files. Nonetheless, that's also worth testing, so I added an extra test to verify this assumption.
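For readers following the thread, the agreed-upon sequence could look roughly like the outline below. This is a hedged sketch, not the actual test code (that lives in TestIcebergMaterializedView / BaseIcebergMaterializedViewTest): the execute() helper, table, and MV names are hypothetical, and the expected refresh types in the comments follow the discussion above.

```java
// Hypothetical outline of the scenario discussed in this thread.
// execute(String) is an assumed stand-in for a test helper that runs SQL
// against an Iceberg catalog; here it only prints the statement.
class RefreshScenarioOutline
{
    static void execute(String sql)
    {
        System.out.println(sql);
    }

    public static void main(String[] args)
    {
        execute("CREATE TABLE src (id INTEGER, v VARCHAR)");
        execute("CREATE MATERIALIZED VIEW mv AS SELECT id, v FROM src WHERE id > 0");
        execute("INSERT INTO src VALUES (1, 'a'), (2, 'b')");
        execute("REFRESH MATERIALIZED VIEW mv");   // initial refresh: FULL
        execute("INSERT INTO src VALUES (3, 'c')");
        execute("REFRESH MATERIALIZED VIEW mv");   // append-only delta: INCREMENTAL
        execute("DELETE FROM src WHERE id = 1");
        execute("REFRESH MATERIALIZED VIEW mv");   // delete files present: falls back to FULL
        execute("INSERT INTO src VALUES (4, 'd')");
        execute("REFRESH MATERIALIZED VIEW mv");   // back to INCREMENTAL
    }
}
```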
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSplitManager.java
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Marton Bod.
lgtm % comments
core/trino-main/src/main/java/io/trino/sql/planner/IncrementalRefreshVisitor.java
core/trino-main/src/test/java/io/trino/sql/planner/TestIncrementalRefreshVisitor.java
core/trino-main/src/test/java/io/trino/sql/planner/TestMaterializedViews.java
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSplitManager.java
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergMaterializedView.java
core/trino-main/src/test/java/io/trino/sql/planner/TestMaterializedViews.java
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergMaterializedView.java
 * {@code refreshType} is a signal from the engine to the connector whether the MV refresh could be done incrementally or only fully, based on the plan.
 * The connector is not obligated to perform the refresh in the fashion prescribed by {@code refreshType}, this is merely a hint from the engine.
 */
default ConnectorInsertTableHandle beginRefreshMaterializedView(ConnectorSession session, ConnectorTableHandle tableHandle, List<ConnectorTableHandle> sourceTableHandles, RetryMode retryMode, RefreshType refreshType)
@martint do you want to ack on SPI change?
Force-pushed from 75521bb to e2aef8d
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSplitManager.java
When an MV refresh is executing, the planner suggests either incremental or full refresh to the connector. In this first phase, incremental refresh is suggested only when the plan tree consists solely of Scan/Filter/Project nodes; otherwise full refresh is suggested. The Iceberg connector can act on this signal and use IncrementalAppendScan to scan only the 'delta' records and append them to the MV storage table (without truncation).
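As an illustration only, here is a minimal, self-contained sketch of that decision. The real logic lives in IncrementalRefreshVisitor and operates on Trino's actual plan nodes; the tiny node model, class names, and RefreshType values below are assumptions made for this example, not the real API.

```java
import java.util.List;

// Illustrative, simplified stand-in for the check performed by IncrementalRefreshVisitor.
// The plan-node model below is invented for this sketch and is not Trino's PlanNode hierarchy.
public class IncrementalRefreshSketch
{
    enum RefreshType { FULL, INCREMENTAL }

    sealed interface PlanNode permits TableScan, Filter, Project, OtherNode {}
    record TableScan(String table) implements PlanNode {}
    record Filter(PlanNode source, String predicate) implements PlanNode {}
    record Project(PlanNode source, List<String> columns) implements PlanNode {}
    record OtherNode(List<PlanNode> sources) implements PlanNode {} // join, aggregation, window, ...

    // Incremental refresh is suggested only if the tree consists solely of
    // scan/filter/project nodes; anything else forces a full refresh.
    static RefreshType suggestRefreshType(PlanNode node)
    {
        return switch (node) {
            case TableScan scan -> RefreshType.INCREMENTAL;
            case Filter filter -> suggestRefreshType(filter.source());
            case Project project -> suggestRefreshType(project.source());
            case OtherNode other -> RefreshType.FULL;
        };
    }

    public static void main(String[] args)
    {
        PlanNode plan = new Project(
                new Filter(new TableScan("orders"), "orderdate > DATE '2024-01-01'"),
                List.of("orderkey", "totalprice"));
        System.out.println(suggestRefreshType(plan)); // prints INCREMENTAL
    }
}
```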
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Looks good! A couple of remaining comments.
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
core/trino-spi/src/main/java/io/trino/spi/connector/ConnectorMetadata.java
core/trino-main/src/main/java/io/trino/tracing/TracingConnectorMetadata.java
core/trino-main/src/test/java/io/trino/sql/planner/TestMaterializedViews.java
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/BaseIcebergMaterializedViewTest.java
package io.trino.spi;

/**
 * Signifies whether a Materialized View refresh operation could be done incrementally or only fully.
For future reference: semantics of this will most likely change when we support full scope of incremental refreshes
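For context, the truncated snippet above presumably continues with a simple two-value enum along these lines; this is an assumption based on the discussion, and the actual declaration in io.trino.spi may differ.

```java
// Assumed completion of the io.trino.spi.RefreshType snippet shown above.
public enum RefreshType
{
    FULL,
    INCREMENTAL
}
```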
@marton-bod are you planning to send a small separate PR to document this?
@mosabua yes, I'll post a PR with the proposed doc changes. Thanks for bringing it up!
Description
In the Iceberg connector, currently we do a full refresh of MVs in all cases, meaning we drop all the existing data from the MV storage table and then insert all the data again.
When possible, we could be smarter during refresh. While incrementally refreshing an MV can be tricky for many operation types such as joins, aggregations, or window functions, there are some easier wins such as single-table, predicate-only queries. More complicated cases can be tackled in subsequent implementation phases.
In this first phase of implementation, this PR introduces incremental refresh with the following limitations: the MV must select from a single source table and may only apply filters and projections (basically nothing that would require touching/rewriting the already persisted data in the MV).
For example, a type of MV supported for incremental refresh in this phase is one that selects a subset of columns from a single table with a simple WHERE predicate.
Approach:
- LogicalPlanner checks the plan tree and suggests either incremental or full refresh to the connector.
- IcebergMetadata#beginRefreshMaterializedView() takes this information from the engine, checks a few additional things (e.g. the presence of a fromSnapshot for all base tables) and makes the final decision.
- When IcebergSplitManager creates the table scan, it creates either a TableScan (full refresh) or an IncrementalAppendScan (incremental refresh), using the fromSnapshot value. It can still revert back to full refresh if the fromSnapshot has already expired or the refresh window contains any deletes/overwrites (see the sketch after this list).
- IcebergMetadata#finishRefreshMaterializedView() then performs an append-only commit instead of truncating the storage table.
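To make the split-manager step more concrete, here is a hedged sketch (not the actual Trino code) of how a connector could pick between a full scan and Iceberg's IncrementalAppendScan, including the fall-back cases mentioned above. The class, method, and variable names are illustrative; the Iceberg calls (Table.newScan, Table.newIncrementalAppendScan, fromSnapshotExclusive, toSnapshot) are the standard org.apache.iceberg API.

```java
import org.apache.iceberg.DataOperations;
import org.apache.iceberg.FileScanTask;
import org.apache.iceberg.Snapshot;
import org.apache.iceberg.Table;
import org.apache.iceberg.io.CloseableIterable;

// Illustrative sketch of the split-planning decision described in the Approach section.
final class IncrementalRefreshScan
{
    private IncrementalRefreshScan() {}

    static CloseableIterable<FileScanTask> planRefreshScan(Table table, Long fromSnapshotId)
    {
        long currentSnapshotId = table.currentSnapshot().snapshotId();
        // Fall back to a full scan if there is no recorded fromSnapshot, the snapshot has
        // already expired, or the refresh window contains anything other than plain appends
        // (deletes/overwrites would require rewriting data already persisted in the MV).
        if (fromSnapshotId == null
                || table.snapshot(fromSnapshotId) == null
                || !onlyAppendsSince(table, fromSnapshotId, currentSnapshotId)) {
            return table.newScan().planFiles();
        }
        // Incremental refresh: scan only the records appended after fromSnapshot.
        return table.newIncrementalAppendScan()
                .fromSnapshotExclusive(fromSnapshotId)
                .toSnapshot(currentSnapshotId)
                .planFiles();
    }

    private static boolean onlyAppendsSince(Table table, long fromSnapshotId, long toSnapshotId)
    {
        // Walk the snapshot ancestry from the current snapshot back to fromSnapshotId,
        // requiring every snapshot in between to be an append.
        Long snapshotId = toSnapshotId;
        while (snapshotId != null && snapshotId != fromSnapshotId) {
            Snapshot snapshot = table.snapshot(snapshotId);
            if (snapshot == null || !DataOperations.APPEND.equals(snapshot.operation())) {
                return false;
            }
            snapshotId = snapshot.parentId();
        }
        return snapshotId != null;
    }
}
```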
Additional context and related issues
Partially solves #18673.
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: