[ETL-677] Add bootstrap script for Snowflake Parquet tables #133

philerooski · 2024-07-23T19:56:25Z

This script will load data into each of our Parquet tables one by one. It's expected to take a few hours to run on an xsmall warehouse.

You can test this script yourself: create a testing branch from this branch, push it (which triggers the Snowflake deployment), copy the dev Parquet data from the main namespace to your own namespace in S3 (s3://recover-dev-processed-data/my_test_branch/parquet/), and then invoke the script with the Snowflake CLI:

snow sql -f snowflake/scripts/copy_into_each_parquet_table.sql -D "database_name=recover_my_test_branch" -D "schema_name=parquet"
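For reference, the test steps above can be sketched as a dry run that only prints the commands for a given branch name. This is a sketch under assumptions, not the project's documented workflow: the remote name `origin`, the source prefix `main/parquet/` for the main namespace, and the `recover_<branch>` database naming are inferred from the example above, and `print_test_steps` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the test steps for a branch without executing them.
set -eu

print_test_steps() {
  local branch="$1"
  # Bucket taken from the PR description.
  local bucket="s3://recover-dev-processed-data"

  # 1. Create the testing branch from this PR's branch (etl-677).
  echo "git checkout etl-677"
  echo "git checkout -b ${branch}"
  # 2. Pushing triggers the Snowflake deployment (remote name is an assumption).
  echo "git push --set-upstream origin ${branch}"
  # 3. Copy dev Parquet data into the branch namespace
  #    (the main/parquet/ source prefix is an assumption).
  echo "aws s3 sync ${bucket}/main/parquet/ ${bucket}/${branch}/parquet/"
  # 4. Invoke the bootstrap script with the Snowflake CLI.
  echo "snow sql -f snowflake/scripts/copy_into_each_parquet_table.sql -D \"database_name=recover_${branch}\" -D \"schema_name=parquet\""
}

print_test_steps "my_test_branch"
```

Printing the commands rather than running them makes it easy to review the S3 paths and database name before touching any data.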

Note that this script, and the stored procedure it invokes, currently serve only a one-time initial load of our Parquet data for demonstration/sandbox purposes. Because our Parquet datasets are overwritten in S3, running this script on different days would load duplicate records, and we would have no way to tell them apart (e.g., there is no field indicating the Glue workflow which produced them, nor any other indicator of their order).
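The duplication concern can be illustrated with a toy model. It uses a plain text file as a stand-in for a table and does not reproduce Snowflake's actual load behavior; the two-row snapshot and the `load_snapshot` helper are hypothetical. Appending the same snapshot twice yields rows that are byte-for-byte identical, so nothing in the data identifies which run produced them.

```shell
#!/usr/bin/env bash
# Toy model of an append-only load run twice against an unchanged snapshot.
set -eu

# Hypothetical two-row snapshot standing in for one Parquet dataset in S3.
snapshot='participant_1,2024-07-01,72
participant_2,2024-07-01,65'

# A temp file plays the role of the target table.
table_file="$(mktemp)"

# Each run appends the whole snapshot; no run identifier is recorded.
load_snapshot() {
  printf '%s\n' "$snapshot" >> "$table_file"
}

load_snapshot   # initial load
load_snapshot   # the same script run again on a later day

# Count all rows, and the distinct rows that occur more than once.
total_rows="$(wc -l < "$table_file" | tr -d ' ')"
duplicate_rows="$(sort "$table_file" | uniq -d | wc -l | tr -d ' ')"
echo "total rows: ${total_rows}, distinct rows duplicated: ${duplicate_rows}"

rm -f "$table_file"
```

In this model every row ends up duplicated, and only an extra column (such as a load timestamp or workflow identifier) would make the two loads distinguishable.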

sonarqubecloud · 2024-07-23T19:56:45Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

thomasyu888

🔥 LGTM!

Add bootstrap script for Snowflake Parquet tables

46fae46

philerooski requested a review from a team as a code owner July 23, 2024 19:56

philerooski temporarily deployed to develop July 23, 2024 20:00 — with GitHub Actions Inactive

thomasyu888 approved these changes Jul 24, 2024

BryanFauble approved these changes Jul 24, 2024

philerooski merged commit 281a569 into main Jul 25, 2024
17 checks passed

philerooski deleted the etl-677 branch July 25, 2024 17:59
