SNOW-1764301 Iceberg parquet file configuration #874

Merged
merged 5 commits into from
Oct 29, 2024

sfc-gh-alhuang
Contributor

This PR includes the following changes:

  1. Change the default compression algorithm to ZSTD when streaming to Iceberg tables.
  2. Disable the dictionary encoding parameter when using the v1 Parquet writer, which does not support dictionary encoding.
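The two changes listed above can be sketched roughly as follows. This is only an illustration, not the code from this PR: the class, enum, and method names are hypothetical, and the non-Iceberg default of GZIP is an assumption made for the example.

```java
// Hypothetical sketch of the two configuration rules described in the PR summary.
// None of these names are taken from the SDK; they exist only for illustration.
public class ParquetConfigSketch {
    enum Compression { GZIP, ZSTD }
    enum WriterVersion { PARQUET_1_0, PARQUET_2_0 }

    // Rule 1: Iceberg tables default to ZSTD.
    // Assumption: non-Iceberg tables keep some other default (GZIP here).
    static Compression defaultCompression(boolean isIcebergMode) {
        return isIcebergMode ? Compression.ZSTD : Compression.GZIP;
    }

    // Rule 2: the dictionary encoding parameter is forced off for the v1 writer,
    // regardless of what was requested.
    static boolean effectiveDictionaryEncoding(boolean requested, WriterVersion version) {
        return requested && version != WriterVersion.PARQUET_1_0;
    }
}
```

Resolving the effective value in one place means callers never have to check the writer version themselves before reading the dictionary encoding flag.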

@@ -89,6 +95,8 @@ public ClientBufferParameters(SnowflakeStreamingIngestClientInternal clientInter
        clientInternal != null
            ? clientInternal.getInternalParameterProvider().isEnableValuesCount()
            : InternalParameterProvider.ENABLE_VALUES_COUNT_DEFAULT;
    this.enableDictionaryEncoding =
Collaborator

@sfc-gh-hmadan sfc-gh-hmadan Oct 28, 2024

Not very satisfied with how there's now a proliferation of places that control runtime behavior :(
There's ParameterProvider, InternalParameterProvider, ClientBufferParameters, and some completely unnecessary defaulting for when clientInternal is null (it should never be null, not even in tests).

Tried to come up with a better answer, but found nothing that sticks and is relevant to your current PR.

As a follow-up, whenever you have time, I'd suggest first removing the null defaulting, then removing the "isTestMode" thing from everywhere (fully doable after a round of changes I did for the ExternalVolume class). Let's see how things look after these two!
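To illustrate the first suggested follow-up, here is a minimal sketch contrasting the current null-defaulting pattern with a fail-fast alternative. The interface and method names are hypothetical stand-ins rather than the SDK's types, and the default value of `false` is an assumption for the example.

```java
import java.util.Objects;

// Hypothetical sketch: replacing a silent fallback with a fail-fast null check.
public class NullDefaultingSketch {
    // Stand-in for whatever object supplies the parameter value.
    interface ParameterSource {
        boolean isEnableValuesCount();
    }

    // Assumed default, used only by the fallback variant below.
    static final boolean ENABLE_VALUES_COUNT_DEFAULT = false;

    // Pattern in the diff: silently fall back to a default when the source is null.
    static boolean withNullDefaulting(ParameterSource source) {
        return source != null ? source.isEnableValuesCount() : ENABLE_VALUES_COUNT_DEFAULT;
    }

    // Suggested direction: treat a null source as a programming error.
    static boolean withoutNullDefaulting(ParameterSource source) {
        return Objects.requireNonNull(source, "source must not be null").isEnableValuesCount();
    }
}
```

With the fail-fast variant, a test that forgets to supply the source fails immediately instead of running against a default that production code never uses.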

Contributor Author

Added a Jira ticket. Will merge this after #874.

@sfc-gh-alhuang sfc-gh-alhuang force-pushed the alhuang-zstd branch 3 times, most recently from 0eb347a to 422ac86 Compare October 28, 2024 20:25
Base automatically changed from alhuang-zstd to master October 29, 2024 20:45
@sfc-gh-alhuang sfc-gh-alhuang enabled auto-merge (squash) October 29, 2024 21:32
@sfc-gh-alhuang sfc-gh-alhuang merged commit 70d4aa9 into master Oct 29, 2024
45 checks passed
@sfc-gh-alhuang sfc-gh-alhuang deleted the alhuang-parquet-config branch October 29, 2024 22:44