Quoting of table names #7666
There are definitely more WDLs in GVS that don't use backticks. Did you not run into issues while running those, or are those outside the scope of this PR?
"UNION DISTINCT " \
"SELECT i.sample_name FROM items i WHERE i.sample_id IN (SELECT sample_id FROM ~{dataset_name}.sample_load_status) " \
| sed -e '/sample_name/d' > duplicates
echo "WITH items as (SELECT s.sample_id, s.sample_name, s.is_loaded FROM \`${TEMP_TABLE}\` t left outer join \`${SAMPLE_INFO_TABLE}\` s on (s.sample_name = t.sample_name)) " >> query.sql
Had to adopt this approach of writing the SQL to a temp file and then running it, to get around the mix of single quotes, double quotes, the need for $ variable interpolation in bash, and the backticks required by BQ.
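A minimal sketch of the temp-file approach described above, for readers who want to try the quoting in isolation. The table names and the `QUERY_FILE` variable are hypothetical stand-ins for the WDL inputs, and the final `bq` invocation is left commented out because its exact arguments are an assumption, not taken from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical values; in the WDL these are derived from task inputs.
TEMP_TABLE="my-project.1kg_wgs.temp_sample_names"
SAMPLE_INFO_TABLE="my-project.1kg_wgs.sample_info"
QUERY_FILE="$(mktemp)"

# Inside double quotes bash still expands ${VARS}, so every backtick must be
# escaped (\`) to stop bash from treating it as command substitution.
echo "WITH items AS (SELECT s.sample_id, s.sample_name, s.is_loaded FROM \`${TEMP_TABLE}\` t LEFT OUTER JOIN \`${SAMPLE_INFO_TABLE}\` s ON (s.sample_name = t.sample_name)) " > "$QUERY_FILE"
echo "SELECT i.sample_name FROM items i" >> "$QUERY_FILE"

# Inspect the generated SQL before running it.
cat "$QUERY_FILE"

# Assumed invocation (not run here): feed the file to bq on stdin.
# bq --project_id="$PROJECT_ID" query --format=csv --use_legacy_sql=false < "$QUERY_FILE"
```

Building the query in a file keeps the shell quoting on the `echo` lines only, so the `bq` command line itself carries no nested quotes.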
@@ -433,7 +434,7 @@ task GetSampleIds {
 python3 -c "from math import ceil; print(ceil($min_sample_id/~{samples_per_table}))" > min_sample_id

 bq --project_id=~{project_id} query --format=csv --use_legacy_sql=false -n ~{num_samples} \
-"SELECT sample_id, samples.sample_name FROM ~{dataset_name}.~{table_name} AS samples JOIN ${TEMP_TABLE} AS temp ON samples.sample_name=temp.sample_name" > sample_map
+"SELECT sample_id, samples.sample_name FROM \`~{dataset_name}.~{table_name}\` AS samples JOIN \`${TEMP_TABLE}\` AS temp ON samples.sample_name=temp.sample_name" > sample_map
An example of where we need $ interpolation (so we can't use single quotes) but also have the backticks to deal with.
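The trade-off mentioned here can be seen in a small standalone sketch. The table name is a hypothetical placeholder; the two strings show what bash does with the same text under single and double quotes.

```shell
#!/usr/bin/env bash
TEMP_TABLE="my-project.1kg_wgs.temp_sample_names"   # hypothetical value

# Single quotes: backticks are literal, but ${TEMP_TABLE} is NOT expanded.
single='SELECT sample_name FROM `${TEMP_TABLE}`'

# Double quotes with escaped backticks: the variable is expanded and the
# backticks survive as literal characters.
double="SELECT sample_name FROM \`${TEMP_TABLE}\`"

echo "$single"
echo "$double"
```

Only the second form produces SQL that names the real table, which is why the diff escapes each backtick instead of switching to single quotes.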
thanks for going through all those WDLs!
The table names in GvsAssignId were not quoted with backticks, which is fine unless your dataset name starts with a number. Such a name is a totally valid identifier, but it requires quoting.
Recently a customer (AoU) supplied a dataset named 1kg_wgs, which exposed this problem.
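One way to avoid escaping backticks at every call site is a small quoting helper. This is only a sketch: `bq_quote` is a hypothetical function, not part of GVS or the `bq` CLI, and the table name `sample_info` is assumed for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wrap a table path in backticks for BigQuery standard SQL.
bq_quote() {
  # The format string is single-quoted, so the backticks are literal and
  # need no escaping; %s is replaced by the identifier.
  printf '`%s`' "$1"
}

DATASET="1kg_wgs"      # dataset name from the PR description
TABLE="sample_info"    # assumed table name, for illustration only

QUERY="SELECT sample_id, sample_name FROM $(bq_quote "${DATASET}.${TABLE}")"
echo "$QUERY"
# prints: SELECT sample_id, sample_name FROM `1kg_wgs.sample_info`
```

Quoting every table path unconditionally is harmless for names that start with a letter and covers names such as 1kg_wgs that start with a digit.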