Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci_connector_ops: POC of CI connector pipelines in python #23487

Merged
merged 104 commits into from
Mar 10, 2023

Conversation

alafanechere
Copy link
Contributor

@alafanechere alafanechere commented Feb 27, 2023

What

Closes #23504
This is an attempt at rewriting airbyte-python.gradle and airbyte-connector-acceptance-test.gradle in Python (with the use of Dagger).
The motivation behind this work can be found here

I apologize for the size of the PR but I'd like to use it as a common playground for Dagger team and us to iterate on best practices and optimization they could suggest.

How

  • Declare a new pipelines subpackage in ci_connector_ops
  • Expose a connectors-ci command that can run CI pipelines on one our multiple connectors and according to code change:
connectors-ci test-connectors --name=source-pokeapi --name=source-openweather
connectors-ci test-connectors --modified # Test only connectors modified in the current branch

Recommended reading order

  1. The README !!!
  2. The entrypoints with DAG logic
  3. A context object to customize execution and hold pipeline configuration
  4. The actual tests we run
  5. The reusable environments

All the other change (commits before 0431cf6) are made to improve/patch our existing tooling to work in a CI context with Dagger.

The target DAG

flowchart LR;
    AB_GIT_REPO[Airbyte Git Repo] --> MOUNT_AB[Mount Airbyte repo to container];
    AB_GIT_REPO[Airbyte Git Repo] --> MOUNT_CONN[Mount connector source code to container];
    DOWN_SECRETS[Download secrets from GSM]-->CAT
    MOUNT_AB-->QA[Run QA checks];
    MOUNT_CONN-->FORMAT[Code format checks];
    MOUNT_CONN-->INSTALL_CONN[Install connector package in container];
    INSTALL_CONN-->UNIT_TESTS[Run unit tests];
    UNIT_TESTS-->INTEGRATION_TESTS[Run integration tests];
    UNIT_TESTS-->DOCKER_BUILD[Docker build connector dev image];
    DOCKER_BUILD-->CAT[Run acceptance tests];
    CAT-->UPLOAD_SECRETS[Upload updated secrets to GSM];
    CAT-->REPORT[Build test report];
    UNIT_TESTS-->REPORT;
    INTEGRATION_TESTS-->REPORT;
    QA-->REPORT;
    FORMAT-->REPORT;
    REPORT--if in CI-->UPLOAD[Upload to S3];
Loading

🚨 User Impact 🚨

Before assessing a user impact I'd like to first get my question to Dagger teams answered:
https://github.com/airbytehq/airbyte/blob/augustin/poc/ci-in-python/tools/ci_connector_ops/ci_connector_ops/pipelines/README.md

@alafanechere alafanechere changed the title ci_connector_ops: POC of CI pipelines in python ci_connector_ops: POC of CI connector pipelines in python Feb 27, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Feb 27, 2023

Affected Connector Report

The latest commit has removed all connector-related changes. There are no more dependent connectors for this PR.

@alafanechere alafanechere force-pushed the augustin/poc/ci-in-python branch from 7d7e48b to 593a81b Compare February 27, 2023 16:37
@alafanechere alafanechere force-pushed the augustin/poc/ci-in-python branch from 593a81b to e388dbf Compare February 27, 2023 16:45
@alafanechere alafanechere marked this pull request as ready for review February 27, 2023 16:53
@@ -44,37 +44,38 @@ def output_folder(self) -> Path:
def input_folder(self) -> Path:
return self._volume_base / f"run_{self._runs}" / "input"

def _prepare_volumes(self, config: Optional[Mapping], state: Optional[Mapping], catalog: Optional[ConfiguredAirbyteCatalog]):
def _prepare_image(self, config: Optional[Mapping], state: Optional[Mapping], catalog: Optional[ConfiguredAirbyteCatalog]):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a patch whose reason is detailed here and that I'd prefer to avoid.

@@ -42,7 +42,7 @@ extend-ignore = [
[tool.isort]
profile = "black"
color_output = false
skip_gitignore = true
# skip_gitignore = true
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to allow black to run in a non-git context.

Copy link
Contributor

@evantahler evantahler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool POC!

### Install
```bash
cd tools/ci_connector_ops
python -m venv .venv (please use Python 3.10)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we enforce or change our python version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The package declares that it requires Python 3.10 here:

python_requires=">=3.10",

I'm using pyenv to switch my python version to 3.10 when I'm working on this project. Our developers should have pyenv installed as I think it's part of our core python development tooling.

This file says to pyenv to switch to 3.10 when you're working on this code directory:

Copy link
Contributor

@helderco helderco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work 👍 🙂

I didn't review everything because I need to stop for now and don't want to leave it pending.

tools/ci_connector_ops/setup.py Outdated Show resolved Hide resolved
tools/ci_connector_ops/ci_connector_ops/pipelines/utils.py Outdated Show resolved Hide resolved
tools/ci_connector_ops/ci_connector_ops/pipelines/utils.py Outdated Show resolved Hide resolved
@alafanechere
Copy link
Contributor Author

Thank you so much @helderco for your review and suggestions!!!

@octavia-squidington-iii octavia-squidington-iii added the area/connectors Connector related issues label Mar 10, 2023
@alafanechere alafanechere force-pushed the augustin/poc/ci-in-python branch from 5329bd4 to 7e514e8 Compare March 10, 2023 15:15
@alafanechere alafanechere force-pushed the augustin/poc/ci-in-python branch from 7e514e8 to 36e7508 Compare March 10, 2023 15:21
@alafanechere alafanechere force-pushed the augustin/poc/ci-in-python branch from c79568a to 5e9d05e Compare March 10, 2023 17:50
@alafanechere alafanechere force-pushed the augustin/poc/ci-in-python branch from 5e9d05e to bcff84c Compare March 10, 2023 17:54
@octavia-squidington-iii octavia-squidington-iii removed the area/connectors Connector related issues label Mar 10, 2023
@alafanechere
Copy link
Contributor Author

alafanechere commented Mar 10, 2023

I'll merge now. There are still great improvements to make but I feel like they should be done incrementally. IMO the current state of this PR accomplished the goal of reimplementing airbyte-python.gradle and airbyte-connector-acceptance.gradle in Python with Dagger.
We'll observe how it behaves on connector PRs after the merge and probably find interesting learnings.
Here are the main improvements to make according to the review from @helderco and my personal opinion:

  • Rework the concurrency implementation. We shall rely only on aniyo and not call asyncio native code or asyncer.asyncify.
  • The DAG logic is not perfect: the integration tests should get secrets too.
  • The ci_connector_ops python package and dependencies are becoming difficult to manage because we stuffed it with different projects that have very diverse needs in terms of python version and libraries.
  • We need to focus a bit more on making sure each step re-execution is legit and leverage caching as much as possible. (e.g. writing secrets on every run leads to re-trigger of the acceptance tests on every run because the secrets file timestamp changed even if their content is different).
  • We might consider relying more on OOP to benefit from the abstraction that can be useful to run pipelines for Python and Java connectors with common logic and reusable code.

cc @evantahler @cpdeethree @bnchrch @jpadams @kpenfound

@alafanechere alafanechere enabled auto-merge (squash) March 10, 2023 18:24
@alafanechere alafanechere merged commit e4f96d4 into master Mar 10, 2023
@alafanechere alafanechere deleted the augustin/poc/ci-in-python branch March 10, 2023 18:45
@bazarnov
Copy link
Collaborator

@alafanechere the https://github.com/airbytehq/airbyte/actions/runs/4391568285 test is failing but says success, not sure what's going on here, need more context around the work you've done, but I think I can safely merge in the related PR:
#23822

Am I right? or should I wait until it's fixed on your side?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ci_connector_ops: POC rewriting airbyte-python.gradle and airbyte-connector-acceptance.gradle in Python
9 participants