run_grant_dataset_view_access fails if view already exists #35795

Closed
1 of 2 tasks
TobiasHammarstrom opened this issue Nov 22, 2023 · 4 comments
Labels
area:providers, good first issue, kind:bug, provider:google

@TobiasHammarstrom

Apache Airflow version

Other Airflow 2 version (please specify below)

What happened

The function run_grant_dataset_view_access does not correctly check whether the view already has access to the dataset, which contradicts what the docs specify:

If this view has already been granted access to the dataset, do nothing

What you think should happen instead

The function should skip granting access if the view already has access to the dataset.

How to reproduce

On GCP Composer, create a DAG that uses a BigQueryHook and calls run_grant_dataset_view_access to grant a view in dataset A access to dataset B when that view already has access to dataset B.

Operating System

composer-2.5.1-airflow-2.6.3

Versions of Apache Airflow Providers

Those included in the above installation (the package in question is google-cloud-bigquery==3.12.0)

Deployment

Google Cloud Composer

Deployment details

No response

Anything else

Investigation:

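The logs include "Granting table xxx authorized view access to xxx dataset.", which indicates that the if-statement checking whether the view is missing from the dataset's access entries evaluates to True, even though the view is already there.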

When running a Python script locally with the same version of google-cloud-bigquery, I found that the AccessEntry object fetched from the dataset does not match the newly created AccessEntry object. The mismatch is in the _properties dict: the fetched AccessEntry contains only one key, view, whereas the created one contains two, view and role. The following script shows the problem:

from copy import deepcopy

from google.cloud import bigquery
from google.cloud.bigquery.dataset import AccessEntry

# Placeholders: fill in your own values before running.
PROJECT = "<project>"
LOCATION = "<location>"
DATASET = "<dataset>"
TABLE = "<table>"
EXISTING_VIEW_INDEX = 0  # placeholder: index of the existing view in access_entries


def main():
    client = bigquery.Client(project=PROJECT, location=LOCATION)
    dataset = client.get_dataset(DATASET)
    access_entries = dataset.access_entries

    view_ref = {"projectId": PROJECT, "datasetId": DATASET, "tableId": TABLE}

    # Entry as created locally: _properties contains both "view" and "role".
    view_access = AccessEntry(role=None, entity_type="view", entity_id=view_ref)

    # Same entry with "role" removed, so _properties contains only "view".
    view_access_no_role = AccessEntry(role=None, entity_type="view", entity_id=dict(view_ref))
    del view_access_no_role._properties["role"]

    is_in_access_entries_v1 = view_access in access_entries  # False
    is_in_access_entries_v2 = view_access_no_role in access_entries  # True

    # Adding "role": None to a copy of the fetched entry makes the first check pass.
    existing_view_access_fixed = deepcopy(access_entries[EXISTING_VIEW_INDEX])
    existing_view_access_fixed._properties["role"] = None
    access_entries.append(existing_view_access_fixed)

    is_in_access_entries_v3 = view_access in access_entries  # True
    print(is_in_access_entries_v1, is_in_access_entries_v2, is_in_access_entries_v3)


if __name__ == "__main__":
    main()
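
As a possible workaround for affected versions, the membership check could compare normalized API representations instead of relying on AccessEntry equality. The sketch below is only an illustration: normalized and has_access_entry are hypothetical helper names, not part of Airflow or google-cloud-bigquery, and StubEntry is a stand-in that mimics only to_api_repr() so the example runs without GCP credentials. It assumes that to_api_repr() returns the same keys as _properties, as the script above suggests.

```python
def normalized(entry) -> dict:
    """Return the entry's API representation with None-valued keys dropped."""
    return {key: value for key, value in entry.to_api_repr().items() if value is not None}


def has_access_entry(access_entries, wanted) -> bool:
    """Return True if any existing entry matches `wanted` after normalization."""
    target = normalized(wanted)
    return any(normalized(existing) == target for existing in access_entries)


class StubEntry:
    """Stand-in for AccessEntry (assumption: only to_api_repr() is mimicked)."""

    def __init__(self, properties):
        self._properties = dict(properties)

    def to_api_repr(self):
        return dict(self._properties)


view = {"projectId": "p", "datasetId": "d", "tableId": "t"}
fetched = StubEntry({"view": view})                # shape of the fetched entry
created = StubEntry({"view": view, "role": None})  # shape of the locally created entry

print(has_access_entry([fetched], created))  # True
```

Dropping None values treats a missing role and role=None as equivalent, which is exactly the difference described in the investigation.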

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@TobiasHammarstrom added the area:core, kind:bug, and needs-triage labels on Nov 22, 2023

boring-cyborg bot commented Nov 22, 2023

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

@Taragolis added the area:providers, provider:cncf-kubernetes, and provider:google labels and removed the area:core and provider:cncf-kubernetes labels on Nov 22, 2023
@potiuk added the good first issue label and removed the needs-triage label on Nov 22, 2023
@kadai0308

Hi, I am Kadai. I am new here and looking forward to contributing. After going through the description and the source code, I think I can fix this issue. Could you please assign it to me? Thank you.

@kadai0308

After digging deeper, I found that the issue has already been fixed in google-cloud-bigquery version 3.13 (see MR).

@eladkal

eladkal commented Dec 15, 2023

After digging deeper, I found that the issue has already been fixed in google-cloud-bigquery version 3.13 (see googleapis/python-bigquery#1682).

I see that google-cloud-bigquery==3.13.0 is already present in
https://raw.githubusercontent.com/apache/airflow/constraints-2.7.3/constraints-3.8.txt

Thus closing this one as not an issue.
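
For anyone on an older environment, a quick way to tell whether the installed google-cloud-bigquery already includes the fix is to compare its version against 3.13.0. This is a minimal sketch, not an official check: release_tuple and has_access_entry_fix are hypothetical helper names, and the parsing ignores pre-release suffixes (so "3.13.0rc1" would be treated like "3.13.0").

```python
import re
from importlib import metadata

FIXED_IN = (3, 13, 0)  # first release containing the fix, per the comments above


def release_tuple(version: str) -> tuple:
    """Return the leading numeric components of a version, padded to three parts."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    return (parts + (0, 0, 0))[:3]


def has_access_entry_fix(version: str) -> bool:
    """Return True if `version` is at least the release that contains the fix."""
    return release_tuple(version) >= FIXED_IN


if __name__ == "__main__":
    try:
        installed = metadata.version("google-cloud-bigquery")
    except metadata.PackageNotFoundError:
        print("google-cloud-bigquery is not installed")
    else:
        status = "includes" if has_access_entry_fix(installed) else "predates"
        print(f"google-cloud-bigquery {installed} {status} the fix")
```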

@eladkal closed this as not planned on Dec 15, 2023