Fixup dead links (airbytehq#24167)
evantahler authored and adriennevermorel committed Mar 17, 2023
1 parent 40ab566 commit 6c02fb4
Showing 2 changed files with 43 additions and 30 deletions.
20 changes: 12 additions & 8 deletions airbyte-integrations/bases/standard-source-test/readme.md
@@ -3,31 +3,35 @@
## Overview

### These tests are designed to do the following:

1. Test basic functionality for any Airbyte source. Think of it as an "It works!" test.
1. Each test should test functionality that is expected of all Airbyte sources.
1. Require minimum effort from the author of the integration.
1. Think of these as "free" tests that Airbyte provides to make sure the integration works.
1. To be run for ALL Airbyte sources.

### These tests are _not_ designed to do the following:

1. Test any integration-specific cases.
1. Test corner cases for specific integrations.
1. Replace good unit testing and integration testing for the integration.

We will _not_ merge sources that cannot pass these tests unless the author can provide a very good reason™.

## What tests do the standard tests include?

Check out each function in [SourceAcceptanceTest](src/main/java/io/airbyte/integrations/standardtest/source/SourceAcceptanceTest.java). Each function annotated with `@Test` is a single test. Each of these functions is preceded by comments that document what they are testing.

## How to run them from your integration?

- If writing a source in Python, **don't use this.**

## What do I need to provide as input to these tests?

- The name of the image of your integration (with the `:dev` tag at the end).
- A handful of JSON inputs, e.g. a valid configuration file and a catalog that will be used to try to run the `read` operation on your integration. These are fully documented in [SourceAcceptanceTest](src/main/java/io/airbyte/integrations/standardtest/source/SourceAcceptanceTest.java). Each method marked as `abstract` must be implemented by the user to provide the necessary information; each is preceded by comments explaining what it needs to return.
- Optionally, you can define methods to run before and after each test.
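As an illustration of the JSON inputs, here is a minimal sketch in Python. All field names here (`api_key`, `start_date`, the `users` stream) are hypothetical placeholders; the exact schema your connector expects is documented in `SourceAcceptanceTest`'s abstract methods.

```python
import json

# Hypothetical connector configuration -- field names are placeholders.
config = {"api_key": "dummy-key", "start_date": "2023-01-01"}

# Hypothetical configured catalog describing one stream to read.
configured_catalog = {
    "streams": [
        {
            "stream": {
                "name": "users",
                "json_schema": {"type": "object", "properties": {"id": {"type": "string"}}},
            },
            "sync_mode": "full_refresh",
            "destination_sync_mode": "overwrite",
        }
    ]
}

# The test suite consumes these as JSON documents.
config_json = json.dumps(config)
catalog_json = json.dumps(configured_catalog)
```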

## Do I have to write Java to use these tests?

No! Our goal is to allow you to write your integration _entirely_ in your language of choice. If you are writing an integration in Python, for instance, you should be able to interact with this test suite in Python without needing to write Java. _Right now, we only have a Python interface to reduce friction for interacting with these tests_; over time, we intend to make more language-specific helpers available. In the meantime, you can still use your language of choice and leverage this standard test suite.
53 changes: 31 additions & 22 deletions tools/ci_connector_ops/ci_connector_ops/pipelines/README.md
@@ -1,14 +1,17 @@
# POC of CI connector pipelines in Python

This Python subpackage of `ci-connector-ops` gathers the POC code we're working on to:

- Rewrite [airbyte-python.gradle](https://github.com/airbytehq/airbyte/blob/7d7e48b2a342a328fa74c6fd11a9268e1dcdcd64/buildSrc/src/main/groovy/airbyte-python.gradle) and [airbyte-connector-acceptance-test.gradle](https://github.com/airbytehq/airbyte/blob/master/buildSrc/src/main/groovy/airbyte-connector-acceptance-test.gradle) in Python.
- Centralize the CI logic for connector testing
- Try out Dagger.io as a promising tool that can provide parallelism and caching out of the box for CI

## Install and use

From `airbyte` repo root:

### Install

```bash
cd tools/ci_connector_ops
python -m venv .venv  # please use at least Python 3.10
# …
cd ../..
```
### Use

### Use remote secrets

If you want the pipeline to pull connector secrets from Google Secrets Manager, you have to set the `GCP_GSM_CREDENTIALS` env variable.
If you don't set this variable, the local secrets files under the `secrets` directory of a connector will be used for the acceptance test run.
More details [here](https://github.com/airbytehq/airbyte/blob/master/tools/ci_credentials/README.md#L20).
```bash
export GCP_GSM_CREDENTIALS=`cat <path to service account json file>`
```

If you don't want to use the remote secrets, call `connectors-ci` with the following flag:

```bash
connectors-ci --use-remote-secrets=False
```
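A minimal sketch of the secrets-resolution rule described above, assuming (not verified against the actual `ci_connector_ops` code) that the presence of `GCP_GSM_CREDENTIALS` is what triggers the remote path:

```python
import os
from pathlib import Path

def resolve_secrets(connector_dir: str, use_remote_secrets: bool = True, env=os.environ):
    """Sketch of the fallback rule: use GSM when credentials are available,
    otherwise the connector's local secrets/ directory. Assumed logic, not
    the actual ci_connector_ops implementation."""
    if use_remote_secrets and env.get("GCP_GSM_CREDENTIALS"):
        return "gsm"  # pull secrets from Google Secrets Manager
    # Fall back to local files under the connector's secrets/ directory.
    return [str(p) for p in Path(connector_dir, "secrets").glob("*.json")]
```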

### Environment variables required for a CI run

- `GCP_GSM_CREDENTIALS`: the credentials to connect to GSM
- `TEST_REPORTS_BUCKET_NAME`: the name of the bucket where the test report will be uploaded
- `AWS_ACCESS_KEY_ID`: the access key ID of a service account allowed to write to `TEST_REPORTS_BUCKET_NAME`
- `AWS_SECRET_ACCESS_KEY`: the secret access key of a service account allowed to write to `TEST_REPORTS_BUCKET_NAME`
- `AWS_REGION`: the AWS region of the `TEST_REPORTS_BUCKET_NAME` bucket
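A small helper of the kind one might use to fail fast when these variables are missing. This is a hedged sketch, not part of the actual CLI:

```python
import os

# The five variables listed above.
REQUIRED_CI_ENV_VARS = [
    "GCP_GSM_CREDENTIALS",
    "TEST_REPORTS_BUCKET_NAME",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
]

def missing_ci_env_vars(env=os.environ):
    """Return the required CI variables that are not set (or set but empty)."""
    return [name for name in REQUIRED_CI_ENV_VARS if not env.get(name)]
```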

### **Run the pipeline for a specific connector**

(source-pokeapi does not require GSM access)

```bash
connectors-ci test-connectors --name=source-pokeapi
```

You can pass multiple `--name` options to run the pipeline for several connectors:

```bash
connectors-ci test-connectors --name=source-pokeapi --name=source-openweather
```

### **Run the pipeline for generally available connectors**

```bash
connectors-ci test-connectors --release-stage=generally_available
```


### **Run the pipeline for the connectors you changed on the branch**

```bash
connectors-ci test-connectors --modified  # the source-pokeapi pipeline should run
```

### Local vs. CI

The default behavior of the CLI is to run in a local context.
You can tell the CLI that it is running in a CI context with the following flag:

```bash
connectors-ci --is-ci
```

The main differences are that:

- The pipeline will pull the branch under test from Airbyte's GitHub repo
- The pipeline will upload per-connector test reports to S3

## What does a connector pipeline run?

Please ignore the `Set your environment variables` step and focus on running `connectors-ci`.

1. ~~How to handle exit codes: if exit_code != 0 an exception is thrown and stops the other pipeline execution. Code context [here](https://github.com/airbytehq/airbyte/blob/7d7e48b2a342a328fa74c6fd11a9268e1dcdcd64/tools/ci_connector_ops/ci_connector_ops/pipelines/actions/tests.py#L25)~~ A stop-gap solution was implemented waiting for this [issue](https://github.com/dagger/dagger/issues/3192) to be fixed.
2. Can we make `with_mounted_directory` writable so that the container can write to the host FS? Code context [here](https://github.com/airbytehq/airbyte/blob/7d7e48b2a342a328fa74c6fd11a9268e1dcdcd64/tools/ci_connector_ops/ci_connector_ops/pipelines/actions/tests.py#L119)
Dagger team answer: We'll implement a flag to run privileged `with_exec` that will allow containers to write on the host FS.
3. How to get access to visualizations: We'd love to have dynamic status checks on our GitHub PRs, with links to pipeline visualization [like](https://propeller.fly.dev/runs/da68273e-48d8-4354-8d8b-efaccf2792b9).
Dagger team answer: coming soon.
4. Can we build and tag an image in Dagger?
Dagger team answer: Run a local docker registry and publish images to this registry during the pipeline execution.
5. What are the best practices to report success/failure details?
I built a custom model (`ConnectorTestReport`) to store test results; I archive test results to S3. Do you have other suggestions?
6. What are the best practices to write tests for pipelines?
Dagger team answer: Get inspiration from their own repo [here](https://github.com/dagger/dagger/tree/main/sdk/python/tests).
I'll write tests once the code architecture is ratified.

7. What would be your suggestion to run `docker scan` from dagger to spot vulnerabilities on our connectors?
Dagger team answer: A scanning tool should be wrapped in a container to scan images from Dagger.
8. Do you have a tool to re-order log lines by pipeline after execution?
A log grouping tool is under construction.
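The exit-code stop-gap discussed in question 1 can be illustrated with a plain-`subprocess` analogy (this is not Dagger SDK code): record the step's exit code instead of letting a non-zero status raise and halt sibling pipelines.

```python
import subprocess
import sys

def run_step(cmd):
    """Run one pipeline step and return (exit_code, output) instead of raising,
    mirroring the stop-gap described in question 1. Plain-subprocess analogy,
    not actual Dagger API code."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr

# A failing step is recorded in the report rather than aborting the
# rest of the pipeline.
code, output = run_step([sys.executable, "-c", "import sys; sys.exit(3)"])
```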

**Second batch of questions**:

1. I used the stop-gap to handle exit codes != 0, but since Dagger considers the execution as failed, I think this step is always re-run (cf. `run_integration_tests`). Am I understanding this problem correctly, and is there a workaround?
2. `run_acceptance_tests` runs at every execution even if the artifacts passed to it did not change: the connector image ID does not change on rebuild of the same code, and the secrets I download from GSM do not change (but are re-downloaded every time because they might change). Is it because the directory instance I mount is different on each run? (cf. `run_acceptance_tests`)
3. I tried to leverage concurrency as much as possible with `asyncio.create_task` and `asyncio.gather`. Is this something you expect developers to implement, or should I not bother because Dagger takes care of concurrency at a lower level? (I noted a performance improvement while using `asyncio.create_task` etc.)
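The `asyncio.create_task`/`asyncio.gather` pattern mentioned in question 3 can be sketched as follows, with a dummy coroutine standing in for a real connector pipeline:

```python
import asyncio

async def run_connector_pipeline(name: str) -> str:
    # Placeholder for a real per-connector pipeline (build, unit tests,
    # acceptance tests); here it just yields control briefly.
    await asyncio.sleep(0.01)
    return f"{name}: success"

async def run_all(names):
    # create_task schedules each pipeline immediately; gather awaits them all,
    # so independent connector pipelines overlap instead of running serially.
    tasks = [asyncio.create_task(run_connector_pipeline(n)) for n in names]
    return await asyncio.gather(*tasks)

results = asyncio.run(run_all(["source-pokeapi", "source-openweather"]))
```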

### Airbyte-specific context that could help you understand our workflow

- We always use a `:dev` tag to tag images of connectors we build locally. We try to never publish these images to a public repository.
- We run a container called connector-acceptance-test, which is a global test suite for all our connectors. This test suite is run against a connector-under-test container (usually using its `:dev` image).
- connector-acceptance-test is a customizable test suite (built with pytest) configured with per-connector `acceptance-test-config.yml` files ([e.g.](https://github.com/airbytehq/airbyte/blob/b0c5f14db6a905899d0f9c043954abcc5ec296f0/airbyte-integrations/connectors/source-pokeapi/acceptance-test-config.yml#L1))
- As connector-acceptance-test runs connector containers, it triggers actual HTTP requests to the public API of our source/destination connectors. This is why we need to load secrets configuration with our test account credentials into these connector containers. connector-acceptance-test also generates dynamic connector configurations to check the behavior of a connector under test when it is given different structures of configuration files.

### Performance benchmarks

| Connector | Run integration test GHA duration | Dagger POC duration (CI no cache) |
| -------------- | ---------------------------------------------------------------------- | ---------------------------------------------------------------------- |
| source-pokeapi | [7mn22s](https://github.com/airbytehq/airbyte/actions/runs/4395453220) | [5mn26s](https://github.com/airbytehq/airbyte/actions/runs/4403595746) |
