-
Notifications
You must be signed in to change notification settings - Fork 196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cannot query views with FILTER and SELECT operators, throws invalid_argument exception #354
Comments
@davidrabinowitz could you please look into this, it is becoming seemingly difficult to query views using BigQuery Connector |
Hi Sandeep, Please try the following query: viewDf.filter($"date" === "2021-03-29 17:00:00").select("country", "date").show() |
Thanks for the workaround, is it a bug or limitation ? |
This is currently the limitation. In case of views the fields in the filter should also be part of select. |
Could we document this limitation, I don't find any referrence which says as a limitation |
Hi Sandeep, Sure, I will add this limitation in the document(readme). Also, the reason for this limitation is as follows: As mentioned here, the views have to be materialized before reading them. The materialized view is created with all the columns but if there is a select operator present then the materialized view is created with these columns only. Now, in this particular case the materialized view is created with a single column "country". As a result, when after materialization the filter operator is applied on column "date" it throws error saying "INVALID_ARGUMENT" as "date" column in not present in the materialized view. Hence, in case of views if we want to use both select and filter then the columns on which filter is applied those columns should also be present in select. |
…ew with different columns in select() and flter().
…ew with different columns in select() and flter().
Hi Sandeep, I have corrected this limitation and the PR has been merged. |
When I run
viewdf.filter.select
it throws INVALID_ARGUMENT: request failed: Row filter for table xxxx:tempdataset._sbc_79a9cc7eb80c419f94763550c7ead249 is invalid. Filter is '(date
= TIMESTAMP '2021-03-29 17:00:00.0' ANDdate
IS NOT NULL)'Code
I have created the view using the public dataset
bigquery-public-data.covid19_italy.data_by_province
Saved the output of below results as view in BigQuery
Logs
21/03/30 10:10:32 INFO DirectBigQueryRelation: Querying table XXXXX.tempdataset.covidconfirmview, parameters sent from Spark: requiredColumns=[country], filters=[isnotnull(date#515),(date#515 = 1617037200000000)]
21/03/30 10:10:34 INFO DirectBigQueryRelation: Going to read from XXXXX.tempdataset._sbc_79a9cc7eb80c419f94763550c7ead249 columns=[country], filter='(
date
= TIMESTAMP '2021-03-29 17:00:00.0' ANDdate
IS NOT NULL)'21/03/30 10:10:34 INFO DirectBigQueryRelation: unhandledFilters:
21/03/30 10:10:34 INFO DirectBigQueryRelation: Querying table XXXXX.tempdataset.covidconfirmview, parameters sent from Spark: requiredColumns=[country], filters=[isnotnull(date#515),(date#515 = 1617037200000000)]
21/03/30 10:10:34 INFO DirectBigQueryRelation: Going to read from XXXXX.tempdataset._sbc_79a9cc7eb80c419f94763550c7ead249 columns=[country], filter='(
date
= TIMESTAMP '2021-03-29 17:00:00.0' ANDdate
IS NOT NULL)'21/03/30 10:10:34 WARN SQLExecution: Error executing delta metering
com.google.api.gax.rpc.InvalidArgumentException: com.google.cloud.spark.bigquery.repackaged.io.grpc.StatusRuntimeException: INVALID_ARGUMENT: request failed: Row filter for table XXXXX:tempdataset._sbc_79a9cc7eb80c419f94763550c7ead249 is invalid. Filter is '(
date
= TIMESTAMP '2021-03-29 17:00:00.0' ANDdate
IS NOT NULL)'The text was updated successfully, but these errors were encountered: