Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connector CI: Update nightlies to write to GCS #26929

Merged
merged 15 commits into from
Jun 6, 2023

Conversation

bnchrch
Copy link
Contributor

@bnchrch bnchrch commented Jun 1, 2023

What

We need our nightly builds to output test results to a different path

How

  1. We added a new bucket airbyte-connector-integration-reports
  2. We updated all connector commands to output to GCS
  3. We updated the report path to be constructed from the CI context

Example output file from a local run
https://storage.googleapis.com/airbyte-connector-integration-reports/test/manual/bnchrch_dagger_update-nightlies/1685688402/f63bd2708013d1cec9d9051b4c9b4328b9bcafff/source-marketo/1.1.0/output.json

closes: #26397

@bnchrch bnchrch requested a review from alafanechere June 1, 2023 23:59
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Octavia Squidington III seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@@ -357,6 +360,18 @@ def metadata(self) -> dict:
def docker_image_from_metadata(self) -> str:
return f"{self.metadata['dockerRepository']}:{self.metadata['dockerImageTag']}"

@property
def ci_gcs_credentials_secret(self) -> Secret:
# TODO (ben): Update this to be in use ANYWHERE we use a service account.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alafanechere I didnt want to add yet another service account env var. It felt like we were going down a bad way there.

Im thinking that we should enforce one service account for the entire pipeline. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alafanechere Heres a draft of that change: #26931

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it'd be much better to have a SA for airbyte-ci connectors that can do the thing it has to do for metadata upload, report upload etc...

workflow_dispatch:
inputs:
runs-on:
type: string
default: dev-large-runner
required: true
test-connectors-options:
default: --concurrency=5 --release-stage=generally_available --release-stage=beta
default: --concurrency=5 --release-stage=generally_available
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alafanechere I was noticing some flakyness even when only running GA connectors.

Im thinking we start the nightlies off with GA, then if they match the current nightlies we expand to beta

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to successfully run a couple of nightlies on both GA and beta so I think we should keep it like it is now.

@bnchrch bnchrch force-pushed the bnchrch/dagger/update-nightlies branch from 46e4598 to b8f7b55 Compare June 2, 2023 21:33
workflow_dispatch:
inputs:
runs-on:
type: string
default: dev-large-runner
required: true
test-connectors-options:
default: --concurrency=5 --release-stage=generally_available --release-stage=beta
default: --concurrency=5 --release-stage=generally_available
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to successfully run a couple of nightlies on both GA and beta so I think we should keep it like it is now.

@@ -31,7 +28,7 @@ jobs:
METADATA_SERVICE_BUCKET_NAME: prod-airbyte-cloud-connector-metadata-service
SPEC_CACHE_BUCKET_NAME: io-airbyte-cloud-spec-cache
SPEC_CACHE_GCS_CREDENTIALS: ${{ secrets.SPEC_CACHE_SERVICE_ACCOUNT_KEY_PUBLISH }}
TEST_REPORTS_BUCKET_NAME: "airbyte-connector-build-status"
TEST_REPORTS_BUCKET_NAME: "airbyte-connector-integration-reports"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might reuse this bucket for non connector use case?

Suggested change
TEST_REPORTS_BUCKET_NAME: "airbyte-connector-integration-reports"
TEST_REPORTS_BUCKET_NAME: "airbyte-ci-reports"

cmd = ctx.invoked_subcommand

path_values = [
cmd,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

AWS_SECRET_ACCESS_KEY: ${{ secrets.STATUS_API_AWS_SECRET_ACCESS_KEY }}
AWS_DEFAULT_REGION: "us-east-2"
TEST_REPORTS_BUCKET_NAME: "airbyte-connector-build-status"
TEST_REPORTS_BUCKET_NAME: "airbyte-connector-integration-reports"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
TEST_REPORTS_BUCKET_NAME: "airbyte-connector-integration-reports"
TEST_REPORTS_BUCKET_NAME: "airbyte-ci-reports"

I believe we could use this bucket for other use cases. I'd suggest to use a common root path like connectors to group all connectors related reports;

@@ -55,11 +55,46 @@ def validate_environment(is_local: bool, use_remote_secrets: bool):
)


def render_report_output_prefix(ctx: click.Context) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this function can be a method of the Report class, given you'd pass the invoked_command to our PipelineContext class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left this in as I think its a good idea for the airbyte-ci to set the top level path for all airbyte-ci related outputs.

This lets us standardize the path in both the pipelines and any utilities we might make.

@@ -357,6 +360,18 @@ def metadata(self) -> dict:
def docker_image_from_metadata(self) -> str:
return f"{self.metadata['dockerRepository']}:{self.metadata['dockerImageTag']}"

@property
def ci_gcs_credentials_secret(self) -> Secret:
# TODO (ben): Update this to be in use ANYWHERE we use a service account.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it'd be much better to have a SA for airbyte-ci connectors that can do the thing it has to do for metadata upload, report upload etc...


@property
def test_report_bucket(self) -> str:
bucket = os.environ.get("TEST_REPORTS_BUCKET_NAME") # TODO (ben): move to config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we move it to the context instance? (I believe its what you mean by config)

git_branch = self.report.pipeline_context.git_branch.replace("/", "_")
suffix = f"{connector_name}/{git_branch}/{connector_version}/{git_revision}.json"
local_report_path = Path(local_reports_path_root + suffix)
connector_name = self.connector.technical_name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to get the proper reference I made sure to use the context attached to the report to retrieve these metadata.

Comment on lines 418 to 425
connector_name = self.connector.technical_name
connector_version = self.connector.version

suffix = f"{connector_name}/{connector_version}/output.json"
file_path_key = f"{self.report_output_prefix}/{suffix}"

local_report_path = Path(local_reports_path_root + file_path_key)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: shall we create a dedicated method that would generate GCS and local path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im thinking here that we mandate that the relative path stays the same remote vs local

Then the things are configurable are

  1. What bucket to write to if remote
  2. What root folder to write to if tmp

Keeps the path to report consistent between both

if self.report.should_be_saved:
s3_key = self.s3_report_key + suffix
report_upload_exit_code = await remote_storage.upload_to_s3(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a soft roll out could we keep uploading reports to both S3 and GCS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alafanechere as I understand this is due to not wanting to break the dashboard correct?

I think no matter what those get broken with the new paths.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So perhaps we just go over to gcs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to keep the "old" path for S3 to not break the dashboard.

@bnchrch bnchrch enabled auto-merge (squash) June 6, 2023 16:30
@bnchrch bnchrch merged commit 2df72cc into master Jun 6, 2023
@bnchrch bnchrch deleted the bnchrch/dagger/update-nightlies branch June 6, 2023 16:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update nightlies to use dagger test pipeline
3 participants