[CT-480] seed: Non-atomic update of seed tables #135

Closed
adamantike opened this issue Apr 12, 2022 · 2 comments
Labels
bug Something isn't working

Comments

adamantike commented Apr 12, 2022

Describe the bug

When running dbt seed, tables seem to be updated non-atomically, so there's a brief period of time when tables are empty. This could break queries or workflows that are expected to run at the same time as dbt seed.

Steps To Reproduce

  1. Copy the contents of this gist, which includes requirements.txt, dbt_project.yml, and profiles.yml. Configure Snowflake credentials in the profiles.yml file.
  2. Create a data folder and generate a seed file with many lines. (I have been able to reproduce this issue with small files, but bigger ones make it easier to reproduce because the table stays empty longer while the INSERTs are being executed.)
    mkdir -p data/ && \
    	echo "id" > data/test__ids.csv && \
    	for i in `seq 1 1000`; do uuidgen >>data/test__ids.csv; done
  3. In a separate shell, trigger the following command to monitor the content of the seed table. Alternatively, the Snowflake web interface can be used to run many SELECTs one after another.
    while :; do snowsql -o friendly=false -q 'SELECT * FROM "TEST__IDS" LIMIT 1'; done
  4. Run the dbt seed command in a separate shell:
    dbt seed --profiles-dir .

Expected behavior

The expected behavior would be for the truncate and insert on the seed tables to run atomically, so that queries running at the same time as the dbt seed command never receive an empty result set.
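
To make the expectation concrete, here is a minimal sketch of what an atomic reload could look like. It is not how dbt-snowflake implements seeding. The helper name `emit_atomic_reload_sql` and the staging table are hypothetical, and the use of DELETE in place of TRUNCATE is an assumption (TRUNCATE may not take part in an open transaction on Snowflake).

```shell
# Hypothetical helper: print SQL that replaces a table's rows inside one
# transaction, so concurrent readers see either the old rows or the new ones.
#   $1 = target table
#   $2 = staging table that already holds the new rows (an assumption)
emit_atomic_reload_sql() {
    target=$1
    staging=$2
    cat <<EOF
BEGIN;
DELETE FROM "${target}";
INSERT INTO "${target}" SELECT * FROM "${staging}";
COMMIT;
EOF
}

# Possible usage (not executed here):
#   emit_atomic_reload_sql TEST__IDS TEST__IDS_STAGING > reload.sql
#   snowsql -o friendly=false -f reload.sql
```

With statements grouped like this, the monitoring loop from step 3 would be expected to return a row on every iteration.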

Screenshots and log output

Running the SnowSQL command while dbt seed is in progress shows that the table is sometimes empty:

$ while :; do snowsql -o friendly=false -q 'SELECT * FROM "TEST__IDS" LIMIT 1'; done
+--------------------------------------+
| ID                                   |
|--------------------------------------|
| ef4fdfb6-38ee-4b48-b38b-20fbaedc45be |
+--------------------------------------+
1 Row(s) produced. Time Elapsed: 0.268s
+----+
| ID |
|----|
+----+
0 Row(s) produced. Time Elapsed: 0.402s
+--------------------------------------+
| ID                                   |
|--------------------------------------|
| ef4fdfb6-38ee-4b48-b38b-20fbaedc45be |
+--------------------------------------+
1 Row(s) produced. Time Elapsed: 0.382s
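
Spotting the empty result sets by eye is tedious when the loop runs for a while. A small sketch, assuming the output keeps the `N Row(s) produced` summary lines shown above, that counts them from captured output (the function names and the `monitor.log` file are hypothetical):

```shell
# Count result sets that came back empty in captured snowsql output (stdin).
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function's exit status clean while still printing the count.
count_empty_results() {
    grep -c '^0 Row(s) produced' || true
}

# Count all result sets in captured snowsql output (stdin).
count_all_results() {
    grep -c 'Row(s) produced' || true
}

# Possible usage (not executed here): capture the monitoring loop to a file
# while dbt seed runs, then summarize it afterwards.
#   while :; do snowsql -o friendly=false -q 'SELECT * FROM "TEST__IDS" LIMIT 1'; done > monitor.log
#   count_empty_results < monitor.log
#   count_all_results < monitor.log
```

Any non-zero count from `count_empty_results` means at least one query observed the table while it was empty.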

System information

The output of dbt --version:

installed version: 1.0.4
   latest version: 1.0.4

Up to date!

Plugins:
  - snowflake: 1.0.0 - Up to date!

The operating system you're using: Linux (Manjaro)

The output of python --version: Python 3.7.12 (virtualenv)

@adamantike adamantike added bug Something isn't working triage labels Apr 12, 2022
@github-actions github-actions bot changed the title seed: Non-atomic update of seed tables [CT-480] seed: Non-atomic update of seed tables Apr 12, 2022
Contributor

nathaniel-may commented Apr 19, 2022

Hi @adamantike, thanks for your very detailed report. It looks like there is another ticket outlining how to recreate the issue and observe duplicate data in #112. I'm going to close this ticket as a duplicate, so please feel free to join the discussion over there as well.

@nathaniel-may nathaniel-may added bug Something isn't working duplicate and removed bug Something isn't working triage labels Apr 19, 2022
@nathaniel-may
Contributor

duplicate of #112
