We read every piece of feedback, and take your input very seriously.
To see all available qualifiers, see our documentation.
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
When running the following lines
file_dict=$(python3 scripts/python/get_csv_filepaths.py path/to/vocab/csvs) dbt run-operation load_data_duckdb --args "{file_dict: $file_dict, vocab_tables: true}"
I get multiple errors with the run-operation and had to patch the macros/load_data_duckdb.sql as follows:
run-operation
macros/load_data_duckdb.sql
{% for n, p in file_dict.items() %} {% do run_query("DROP TABLE IF EXISTS " ~ target_schema ~ "." ~ n.lower() ~ ";") %} {% do run_query("CREATE TABLE IF NOT EXISTS " ~ target_schema ~ "." ~ n.lower() ~ " AS SELECT * FROM read_csv('" ~ p ~ "', quote = \'\"\', delim = '\t', ignore_errors = true);") %} {% endfor %}
Otherwise, I see errors like this:
(test) thecedarprince@thecedarledge:~/FOSS/dbt-synthea$ dbt run-operation load_data_duckdb --args "{file_dict: $file_dict, vocab_tables: true}" 01:41:34 Running with dbt=1.8.7 01:41:34 Registered adapter: duckdb=1.8.0 01:41:35 Found 85 models, 29 seeds, 425 data tests, 29 sources, 537 macros 01:41:39 Encountered an error while running operation: Runtime Error Invalid Input Error: CSV Error on Line: 245521 Original Line: 45883506 'Baby' or low-dose aspirin Meas Value LOINC Answer S LA14345-5 19700101 20991231 Value with unterminated quote found. Possible fixes: * Enable ignore errors (ignore_errors=true) to skip this row * Set quote do empty or to a different value (e.g., quote='') file=/home/thecedarprince/Programs/SYNTHEA/vocabulary/CONCEPT.csv delimiter = (Auto-Detected) quote = ' (Auto-Detected) escape = \ (Auto-Detected) new_line = \n (Auto-Detected) header = true (Auto-Detected) skip_rows = 0 (Auto-Detected) comment = \0 (Auto-Detected) date_format = (Auto-Detected) timestamp_format = (Auto-Detected) null_padding=0 sample_size=20480 ignore_errors=false all_varchar=0
Note: even with the patching, I still face errors later when loading vocabularies into DuckDB so this patching doesn't really work.
The text was updated successfully, but these errors were encountered:
I've never used this macro @TheCedarPrince - but I have used duckdb/postgres a lot to do this and spent AGES trying to figure out how to do this.
Turns out all you need to do is turn off quote parsing - i.e.
read_csv('" ~ p ~ "', quote = '')
and it should work neatly from memory
Sorry, something went wrong.
Yup! That fixed it on my side! I can make a PR later today unless if you beat me first @lawrenceadams !
You fire on! You caught it! Glad to hear
Closed thanks to #93
No branches or pull requests
When running the following lines
I get multiple errors with the
run-operation
and had to patch themacros/load_data_duckdb.sql
as follows:Otherwise, I see errors like this:
Note: even with the patching, I still face errors later when loading vocabularies into DuckDB so this patching doesn't really work.
The text was updated successfully, but these errors were encountered: