🐛 Destination GCS: CMEK is supported for the connector #20351

@armsepehr armsepehr commented Dec 12, 2022

What

Describe what the change is solving
This branch may fix the following issue: the GCS destination does not support customer-managed encryption keys (CMEK), as noted in the destination's README file.

  1. Destination GCS: Support buckets using customer-managed encryption key #18195
  2. Destinations Bigquery and GCS: require google-managed encryption key #18315

The sample config file was missing a field, which misled me for several hours, so I have added the field along with some sample files that make it easier to check the destination commands and run the integration tests.

How

Describe the solution
As discussed in the issue, there are two main ways to solve these issues: first, rewrite the GCS destination without the Amazon S3 base classes; second, disable the integrity check for GCS only. This branch takes the second approach.

Note that in the current implementation, the connector talks to Google Cloud Storage through the Amazon S3 classes, and the two services differ in some respects.
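
To illustrate the second approach, here is a minimal sketch, not the code from this branch. It assumes the S3 base classes use the AWS SDK for Java v1, which, as far as I know, reads two JVM system properties to decide whether to skip its client-side MD5 check on uploads and downloads. Presumably the check fails on CMEK buckets because the ETag that GCS returns is not the MD5 of the object content. The class name GcsIntegrityCheckSketch and its method are hypothetical, and the property names should be verified against the SDK version in use.

```java
/**
 * Hypothetical sketch: turn off the AWS SDK v1 client-side MD5 checks
 * before any S3 client is created. Not the code from this branch.
 */
final class GcsIntegrityCheckSketch {

    // Assumed property names (AWS SDK for Java v1); verify against your SDK version.
    static final String DISABLE_PUT_MD5 =
        "com.amazonaws.services.s3.disablePutObjectMD5Validation";
    static final String DISABLE_GET_MD5 =
        "com.amazonaws.services.s3.disableGetObjectMD5Validation";

    private GcsIntegrityCheckSketch() {}

    /** Sets both JVM-wide flags; call once at startup, before creating the client. */
    static void disableMd5Validation() {
        System.setProperty(DISABLE_PUT_MD5, "true");
        System.setProperty(DISABLE_GET_MD5, "true");
    }
}
```

Because these flags are JVM-wide, this only limits the change to GCS if the GCS connector runs in its own process, so calling GcsIntegrityCheckSketch.disableMd5Validation() would belong in the GCS destination's startup path rather than in shared S3 code.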

Recommended reading order

  1. x.java
  2. y.python

🚨 User Impact 🚨

Are there any breaking changes? What is the end result perceived by the user? If yes, please merge this PR with the 🚨🚨 emoji so changelog authors can further highlight this if needed.

This change ensures that users can send their data to a GCS bucket that has the customer-managed encryption key feature enabled.

Pre-merge Checklist

Expand the relevant checklist and delete the others.

New Connector

Community member or Airbyter

  • Community member? Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
    • docs/integrations/README.md
    • airbyte-integrations/builds.md
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the connector is published, connector added to connector index as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub and connector version bumped by running the /publish command described here

Connector Generator
  • Issue acceptance criteria met
  • PR name follows PR naming conventions
  • If adding a new generator, add it to the list of scaffold modules being tested
  • The generator test modules (all connectors with -scaffold in their name) have been updated with the latest scaffold by running ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates then checking in your changes
  • Documentation which references the generator is updated as needed

Tests

Unit

Put your unit tests output here.

Integration

Put your integration tests output here.

Acceptance

Put your acceptance tests output here.

@armsepehr armsepehr requested a review from a team as a code owner December 12, 2022 01:54
@CLAassistant

CLAassistant commented Dec 12, 2022

CLA assistant check
All committers have signed the CLA.

@marcosmarxm
Member

@armsepehr please restore the checklist you removed from the template and check off the tasks you have completed.
@armsepehr please sign the CLA.

@armsepehr armsepehr force-pushed the 18195-CMEK-Support-for-GCS-Destination-Connector branch from 6de821c to 98b96b2 Compare December 13, 2022 03:07
@armsepehr armsepehr changed the title BUGS: CMEK is supported for the GCS destination connector 🐛 Destination GCS: CMEK is supported for the connector Dec 13, 2022
@armsepehr
Author

armsepehr commented Dec 15, 2022

@sajarin @octavia-squidington-iv @marcosmarxm

Should I do anything more?

Comment on lines 1 to 8
{
  "streams": [
    {
      "sync_mode": "full_refresh",
      "destination_sync_mode": "append",
      "stream": {
        "name": "ab-airbyte-testing",
        "supported_sync_modes": ["full_refresh"],
Member

Why are you adding this file?

Author

This file is needed to verify the "write" command locally. I saw the same file in other connectors, so I included it in the GCS connector too.

Comment on lines 9 to 10
},
"format": {
Member

This is not related to this change.

Author

This is not related to this change, but I think the original author forgot to include the "format" field in the sample "config.json" file. I wasted some time figuring out that the sample "config.json" lacks this field, which is required. If you omit the "format" field, you get a strange error message. In short, I think it is better to include the "format" field in "config.json".
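
To make the point about the error message concrete, here is a minimal sketch of a fail-fast check that would name the missing field directly. It is purely illustrative and not code from the connector: the class ConfigCheckSketch, the method requireField, and the use of a plain Map for the parsed config are all assumptions.

```java
import java.util.Map;

/** Hypothetical helper: report a missing config field by name instead of failing later. */
final class ConfigCheckSketch {

    private ConfigCheckSketch() {}

    /** Returns the value of the field, or throws with a message that names it. */
    static Object requireField(Map<String, Object> config, String field) {
        Object value = config.get(field);
        if (value == null) {
            throw new IllegalArgumentException(
                "Missing required config field: \"" + field + "\"");
        }
        return value;
    }
}
```

Calling ConfigCheckSketch.requireField(config, "format") on a config without that key would then fail with a message that points straight at "format".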

Comment on lines 1 to 2
{"type": "RECORD", "record": {"stream": "ab-airbyte-testing", "data": {"_ab_pk": "my_value", "column2": 221, "column3": "2021-01-01T20:10:22", "column4": 1.214, "column5": [1,2,3]}, "emitted_at": 1626172757000}}
{"type": "RECORD", "record": {"stream": "ab-airbyte-testing", "data": {"_ab_pk": "my_value2", "column2": 222, "column3": "2021-01-02T22:10:22", "column5": [1,2,null]}, "emitted_at": 1626172757000}}
Member

Same for this file.

Author

As I said before, this file is included in other connectors, and I think it is necessary for verifying the connector locally. In my opinion, it is better to include it in the project, as the other connectors do.

@marcosmarxm
Member

marcosmarxm commented Dec 15, 2022

/test connector=connectors/destination-bigquery

🕑 connectors/destination-bigquery https://github.com/airbytehq/airbyte/actions/runs/3706802785
✅ connectors/destination-bigquery https://github.com/airbytehq/airbyte/actions/runs/3706802785
Python tests coverage:

Name                                                              Stmts   Miss  Cover
-------------------------------------------------------------------------------------
normalization/transform_config/__init__.py                            2      0   100%
normalization/transform_catalog/reserved_keywords.py                 14      0   100%
normalization/transform_catalog/__init__.py                           2      0   100%
normalization/destination_type.py                                    14      0   100%
normalization/__init__.py                                             4      0   100%
normalization/transform_catalog/destination_name_transformer.py     166      8    95%
normalization/transform_catalog/table_name_registry.py              174     34    80%
normalization/transform_config/transform.py                         189     48    75%
normalization/transform_catalog/utils.py                             51     14    73%
normalization/transform_catalog/dbt_macro.py                         22      7    68%
normalization/transform_catalog/catalog_processor.py                147     80    46%
normalization/transform_catalog/transform.py                         61     38    38%
normalization/transform_catalog/stream_processor.py                 595    400    33%
-------------------------------------------------------------------------------------
TOTAL                                                              1441    629    56%

Build Passed

Test summary info:

All Passed

@sajarin sajarin added internal and removed bounty labels Dec 19, 2022
@marcosmarxm
Member

Hello 👋 and thank you for your contribution!

Airbyte has instituted a code freeze between 19 and 30 December, to make sure there are no disruptions during the holidays.
Because of this, reviewing and merging your contribution may take longer than usual.
We apologize for the delay, but we want everyone to have a quiet and happy holiday.

If you have any questions or need further clarification, please don't hesitate to ping via Slack.

@grishick
Contributor

grishick commented Jan 20, 2023

/test connector=connectors/destination-gcs

🕑 connectors/destination-gcs https://github.com/airbytehq/airbyte/actions/runs/3969626147
❌ connectors/destination-gcs https://github.com/airbytehq/airbyte/actions/runs/3969626147
🐛 https://gradle.com/s/os26adizlklgg

Build Failed

Test summary info:

Could not find result summary

@grishick grishick force-pushed the 18195-CMEK-Support-for-GCS-Destination-Connector branch from 710500a to 215380c Compare January 20, 2023 18:42
@grishick
Contributor

Rebased, bumped version, and running tests here: #21682

grishick added a commit that referenced this pull request Jan 24, 2023
Community PR #20351: Support CMEK or the GCS destination connector (#21682)
@grishick grishick closed this Jan 24, 2023
@grishick grishick force-pushed the 18195-CMEK-Support-for-GCS-Destination-Connector branch from 215380c to b7be7aa Compare January 24, 2023 20:50
@octavia-squidington-iv octavia-squidington-iv removed the area/connectors Connector related issues label Jan 24, 2023
@airbyteio airbyteio temporarily deployed to more-secrets January 24, 2023 21:33 — with GitHub Actions Inactive
@airbyteio airbyteio temporarily deployed to more-secrets January 24, 2023 22:18 — with GitHub Actions Inactive
@airbyteio airbyteio temporarily deployed to more-secrets January 24, 2023 22:39 — with GitHub Actions Inactive
@airbyteio airbyteio temporarily deployed to more-secrets January 24, 2023 22:51 — with GitHub Actions Inactive
@airbyteio airbyteio temporarily deployed to more-secrets January 24, 2023 23:04 — with GitHub Actions Inactive
@airbyteio airbyteio temporarily deployed to more-secrets January 24, 2023 23:08 — with GitHub Actions Inactive
@airbyteio airbyteio temporarily deployed to more-secrets January 24, 2023 23:20 — with GitHub Actions Inactive
@airbyteio airbyteio temporarily deployed to more-secrets January 24, 2023 23:22 — with GitHub Actions Inactive
@airbyteio airbyteio temporarily deployed to more-secrets January 24, 2023 23:23 — with GitHub Actions Inactive
@airbyteio airbyteio temporarily deployed to more-secrets January 24, 2023 23:35 — with GitHub Actions Inactive
@airbyteio airbyteio temporarily deployed to more-secrets January 24, 2023 23:41 — with GitHub Actions Inactive
@airbyteio airbyteio temporarily deployed to more-secrets January 25, 2023 00:18 — with GitHub Actions Inactive
@armsepehr armsepehr deleted the 18195-CMEK-Support-for-GCS-Destination-Connector branch January 25, 2023 05:49
@armsepehr armsepehr restored the 18195-CMEK-Support-for-GCS-Destination-Connector branch January 25, 2023 05:50
@armsepehr armsepehr deleted the 18195-CMEK-Support-for-GCS-Destination-Connector branch January 30, 2023 09:02