BigQuery Destination: Proposal to impersonate account on bigquery #15820

Closed

marcelopio
Contributor

@marcelopio marcelopio commented Aug 20, 2022

What

Proposal to solve #15726

How

Describe the solution

Recommended reading order

  1. x.java
  2. y.python

🚨 User Impact 🚨

Are there any breaking changes? What is the end result perceived by the user? If yes, please merge this PR with the 🚨🚨 emoji so changelog authors can further highlight this if needed.

Pre-merge Checklist

Expand the relevant checklist and delete the others.

New Connector

Community member or Airbyter

  • Community member? Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
    • docs/integrations/README.md
    • airbyte-integrations/builds.md
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the connector is published, connector added to connector index as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub and connector version bumped by running the /publish command described here

Connector Generator

  • Issue acceptance criteria met
  • PR name follows PR naming conventions
  • If adding a new generator, add it to the list of scaffold modules being tested
  • The generator test modules (all connectors with -scaffold in their name) have been updated with the latest scaffold by running ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates then checking in your changes
  • Documentation which references the generator is updated as needed

Tests

Unit

Put your unit tests output here.

Integration

Put your integration tests output here.

Acceptance

Put your acceptance tests output here.

@marcosmarxm
Member

@marcelopio do you want to move this to ready to review?

@marcelopio marcelopio marked this pull request as ready for review October 18, 2022 13:59
@sajarin sajarin added internal and removed bounty labels Oct 20, 2022
@marcosmarxm
Member

Hello 👋, first thank you for this amazing contribution.

We really appreciate the effort you've made to improve the project.
We ask for your patience with the code review. Last month our team was focused on the Hacktoberfest event, which probably left some PRs without proper feedback. This week, due to the US Thanksgiving holiday, most of our team is out of office with their families. Another important reason the code won't be merged this week: as a safety measure, the core team has decided to freeze merges to the main branch to keep the release stable. Next week we'll get back to you with a proper code review and update the status of your contribution.

If you have any questions feel free to send me a message in Slack!
Thanks!

@marcosmarxm
Member

Sorry for the delay here, @marcelopio. I'm starting the review process!

@marcosmarxm
Member

/test connector=connectors/destination-bigquery

@marcosmarxm
Member

marcosmarxm commented Dec 8, 2022

/test connector=connectors/destination-bigquery

🕑 connectors/destination-bigquery https://github.com/airbytehq/airbyte/actions/runs/3651478687
✅ connectors/destination-bigquery https://github.com/airbytehq/airbyte/actions/runs/3651478687
Python tests coverage:

Name                                                              Stmts   Miss  Cover
-------------------------------------------------------------------------------------
normalization/transform_config/__init__.py                            2      0   100%
normalization/transform_catalog/reserved_keywords.py                 14      0   100%
normalization/transform_catalog/__init__.py                           2      0   100%
normalization/destination_type.py                                    14      0   100%
normalization/__init__.py                                             4      0   100%
normalization/transform_catalog/destination_name_transformer.py     166      8    95%
normalization/transform_catalog/table_name_registry.py              174     34    80%
normalization/transform_config/transform.py                         189     48    75%
normalization/transform_catalog/utils.py                             51     14    73%
normalization/transform_catalog/dbt_macro.py                         22      7    68%
normalization/transform_catalog/catalog_processor.py                147     80    46%
normalization/transform_catalog/transform.py                         61     38    38%
normalization/transform_catalog/stream_processor.py                 595    400    33%
-------------------------------------------------------------------------------------
TOTAL                                                              1441    629    56%

Build Passed

Test summary info:

All Passed
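
As a quick sanity check on the coverage report above, the TOTAL row can be re-derived from the per-file rows. This is only a minimal sketch: the numbers are copied from the table, and the rounding rule (nearest whole percent) is an assumption about how the report formats its Cover column.

```python
# Rows copied from the coverage table: (file, statements, missed, reported cover %).
ROWS = [
    ("normalization/transform_config/__init__.py", 2, 0, 100),
    ("normalization/transform_catalog/reserved_keywords.py", 14, 0, 100),
    ("normalization/transform_catalog/__init__.py", 2, 0, 100),
    ("normalization/destination_type.py", 14, 0, 100),
    ("normalization/__init__.py", 4, 0, 100),
    ("normalization/transform_catalog/destination_name_transformer.py", 166, 8, 95),
    ("normalization/transform_catalog/table_name_registry.py", 174, 34, 80),
    ("normalization/transform_config/transform.py", 189, 48, 75),
    ("normalization/transform_catalog/utils.py", 51, 14, 73),
    ("normalization/transform_catalog/dbt_macro.py", 22, 7, 68),
    ("normalization/transform_catalog/catalog_processor.py", 147, 80, 46),
    ("normalization/transform_catalog/transform.py", 61, 38, 38),
    ("normalization/transform_catalog/stream_processor.py", 595, 400, 33),
]


def cover_percent(stmts: int, miss: int) -> int:
    """Covered statements as a whole-number percentage (assumed rounding rule)."""
    return round(100 * (stmts - miss) / stmts)


# Sum the per-file columns to rebuild the TOTAL row.
total_stmts = sum(stmts for _, stmts, _, _ in ROWS)
total_miss = sum(miss for _, _, miss, _ in ROWS)
total_cover = cover_percent(total_stmts, total_miss)

print(total_stmts, total_miss, f"{total_cover}%")
```

The totals agree with the report's TOTAL row (1441 statements, 629 missed, 56%), and most of the gap comes from stream_processor.py, which accounts for 400 of the 629 missed statements.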

@marcosmarxm marcosmarxm requested a review from a team December 15, 2022 17:07
@marcosmarxm
Member

@grishick can the database team review this contribution? CI is passing but no test was added.

@rodireich rodireich added the team/destinations Destinations team's backlog label Dec 15, 2022
@grishick
Contributor

Thank you for the contribution. I'll have to take some time to set up credentials and accounts to add an E2E integration test for this feature before I can merge this change.

@grishick
Contributor

@grishick can the database team review this contribution? CI is passing but no test was added.

This needs a new test case. To create that test case, we need application credentials and a test account that allows impersonation. We'll need to add these credentials to the secrets storage so that the test can run via CI.

@marcosmarxm
Member

Hello 👋🏻 and thank you for your contribution!

Airbyte has instituted a code freeze between 19 and 30 December, to make sure there are no disruptions during the holidays.
Because of this, reviewing and merging your contribution may take longer than usual.
We apologize for the delay, but we want everyone to have a quiet and happy holiday.

If you have any questions or need further clarification, please don't hesitate to ping us on Slack.

@grishick
Contributor

@marcelopio could you share with me how you tested this change?

@grishick grishick self-assigned this Dec 21, 2022
@marcelopio
Contributor Author

You can test it in two ways:

To test without the application default credentials, just set up a service account that has permission to impersonate another service account.

To test with the default credentials, you need to run in a Google Cloud environment: on GCE with a default service account attached to the VM, on GKE, or in Cloud Shell.
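
For readers unfamiliar with what "impersonate another service account" means at the API level, here is a minimal sketch of the request a caller would make to the IAM Credentials API (`generateAccessToken`) to obtain a short-lived token for the target account. It is not the code from this PR: the function name `build_impersonation_request` and the example email are hypothetical, and the sketch only builds the URL and JSON body without sending anything. The caller would still authenticate the request with its own credentials (a service account key, or the application default credentials of the VM) and would need permission to create tokens for the target account.

```python
import json

# Public endpoint of the IAM Credentials API.
IAM_CREDENTIALS_ENDPOINT = "https://iamcredentials.googleapis.com/v1"
# OAuth scope for BigQuery access.
BIGQUERY_SCOPE = "https://www.googleapis.com/auth/bigquery"


def build_impersonation_request(target_service_account, scopes=(BIGQUERY_SCOPE,), lifetime_seconds=3600):
    """Return (url, json_body) for a generateAccessToken call.

    Hypothetical helper for illustration only: nothing is sent over the network.
    """
    if lifetime_seconds <= 0:
        raise ValueError("lifetime_seconds must be positive")
    url = (
        f"{IAM_CREDENTIALS_ENDPOINT}/projects/-/serviceAccounts/"
        f"{target_service_account}:generateAccessToken"
    )
    # The API expects the lifetime as a duration string such as "3600s".
    body = {"scope": list(scopes), "lifetime": f"{lifetime_seconds}s"}
    return url, json.dumps(body)


# Example with a made-up target account.
url, body = build_impersonation_request("target@my-project.iam.gserviceaccount.com")
print(url)
print(body)
```

In practice a Google auth client library would wrap this call; the sketch only shows what is being requested in either test setup, with the difference lying in which credentials authorize the call.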

@grishick
Contributor

I've opened another PR where I added tests: #20788

@grishick
Contributor

grishick commented Dec 22, 2022

@marcelopio I have been testing these changes and, from what I can see, they do not apply to GCS Staging. So if a connector is configured to impersonate another account, it will not be able to use GCS Staging unless it is also provided with non-impersonated GCS credentials. Could you please confirm this?
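
The limitation described above can be expressed as a small configuration check. This is a hedged sketch only: the config keys (`impersonate_account`, `loading_method`, `hmac_key_access_id`, `hmac_key_secret`) and the function name `staging_problems` are hypothetical names chosen for illustration, not the connector's confirmed spec.

```python
def staging_problems(config: dict) -> list:
    """Return human-readable problems for an impersonation + GCS Staging config.

    All key names here are assumptions made for this sketch.
    """
    problems = []
    impersonating = bool(config.get("impersonate_account"))
    loading = config.get("loading_method") or {}
    uses_gcs_staging = loading.get("method") == "GCS Staging"
    has_hmac = bool(loading.get("hmac_key_access_id")) and bool(loading.get("hmac_key_secret"))

    # Impersonation covers BigQuery only, so GCS Staging still needs its own credentials.
    if impersonating and uses_gcs_staging and not has_hmac:
        problems.append(
            "GCS Staging is not covered by impersonation; "
            "separate (non-impersonated) GCS credentials are required."
        )
    return problems


# Example: impersonation configured, GCS Staging selected, no HMAC key supplied.
example = {
    "impersonate_account": "target@my-project.iam.gserviceaccount.com",
    "loading_method": {"method": "GCS Staging"},
}
print(staging_problems(example))
```

A check like this would only surface the gap to the user; it does not remove the need for separate GCS credentials.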

@marcelopio
Contributor Author

Oh, damn, that is right...

Even if the HMAC Key is generated for the impersonated account?

But relying on an HMAC key at all may defeat the purpose of this change for GCS, since the idea is to use only the machine's default credentials and not provide a key.

@marcelopio
Contributor Author

Would Airbyte accept a change adding native use of GCS, not passing through S3 compatibility layer?

@grishick
Contributor

Would Airbyte accept a change adding native use of GCS, not passing through S3 compatibility layer?

I cannot say we are too attached to the S3 compatibility layer; however, the change to use GCS natively may be pretty large. Would switching to native GCS remove the need for the HMAC key?

@marcelopio
Contributor Author

Yes, since we could use the application default credentials to connect to GCS also. I will see how much work is needed to do that then. It will be a fun project :)

@grishick
Contributor

Yes, since we could use the application default credentials to connect to GCS also. I will see how much work is needed to do that then. It will be a fun project :)

SGTM. FYI, I added test cases for impersonation in this PR: #20788. I will likely move all my test refactoring code out of that PR and close it. Once this PR is ready to impersonate both BigQuery and GCS accounts, I'll add those new test cases back here.

@subodh1810 subodh1810 requested review from a team and removed request for a team February 6, 2023 08:55
@sh4sh
Contributor

sh4sh commented Mar 7, 2023

@marcelopio @grishick checking in, is there a status update on this PR? It looks like #20851 has been merged, but it's not clear to me whether anything else needs to be done first.

@marcelopio
Contributor Author

Hey! Sorry, but life got in the way, haha. Things are getting better, but I don't expect to get back to this until next month at the earliest.

@evantahler evantahler assigned evantahler and sh4sh and unassigned grishick Mar 30, 2023
@evantahler
Contributor

@marcelopio I'm going to close this PR as it's been a little while since we last heard from you. Your solution looks good though, so please reopen this PR when you are ready!

@evantahler evantahler closed this Apr 1, 2023