go/adbc/driver/snowflake: add use_vectorized_scanner flag to bulk ingest #2005

Closed
sfc-gh-xhuang opened this issue Jul 11, 2024 · 1 comment · Fixed by #2025
Labels
Type: enhancement New feature or request

Comments

@sfc-gh-xhuang

sfc-gh-xhuang commented Jul 11, 2024

What feature or improvement would you like to see?

https://github.com/apache/arrow-adbc/blob/main/go/adbc/driver/snowflake/bulk_ingestion.go#L48C2-L48C164

Change to
createTemporaryStageStmt = "CREATE OR REPLACE TEMPORARY STAGE " + bindStageName + " FILE_FORMAT = (TYPE = PARQUET USE_LOGICAL_TYPE = TRUE BINARY_AS_TEXT = FALSE USE_VECTORIZED_SCANNER=TRUE REPLACE_INVALID_CHARACTERS = TRUE)"

See details about new option: https://medium.com/snowflake/faster-parquet-data-ingestion-with-snowflake-use-vectorized-scanner-28679bcff450

Temporary internal stages use SNOWFLAKE_FULL encryption by default, which USE_VECTORIZED_SCANNER does not yet support. Support is expected within the next month, at which point the vectorized scanner performance improvement will kick in.

The option can still be set now, but it won't take effect yet because ingestion will fall back to the old scanner. Once encryption support is added, it will work automatically.

@sfc-gh-xhuang sfc-gh-xhuang added the Type: enhancement New feature or request label Jul 11, 2024
@ianmcook
Member

@zeroshade

@lidavidm lidavidm added this to the ADBC Libraries 14 milestone Jul 12, 2024
@lidavidm lidavidm changed the title Snowflake create stage to include use_vectorized_scanner go/adbc/driver/snowflake: add use_vectorized_scanner flag to bulk ingest Jul 12, 2024