Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove release stage references in favor of support level or ab_internal #29232

Merged
merged 20 commits into from
Aug 15, 2023
Merged
Show file tree
Hide file tree
Changes from 10 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 4 additions & 4 deletions .github/workflows/connectors_nightly_build.yml
Original file line number Diff line number Diff line change
Expand Up @@ -11,14 +11,14 @@ on:
default: conn-nightly-xlarge-runner
required: true
test-connectors-options:
default: --concurrency=5 --release-stage=generally_available --release-stage=beta
default: --concurrency=5 --support-level=certified
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the metadata are up to date to match the same subset of connector nightly build originally targets?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They dont match the same subset unfortunately. There will likely be a bit of a mix up

required: true

run-name: "Test connectors: ${{ inputs.test-connectors-options || 'nightly build for GA and Beta connectors' }} - on ${{ inputs.runs-on || 'conn-nightly-xlarge-runner' }}"
run-name: "Test connectors: ${{ inputs.test-connectors-options || 'nightly build for Certified connectors' }} - on ${{ inputs.runs-on || 'conn-nightly-xlarge-runner' }}"

jobs:
test_connectors:
name: "Test connectors: ${{ inputs.test-connectors-options || 'nightly build for GA and Beta connectors' }} - on ${{ inputs.runs-on || 'conn-nightly-xlarge-runner' }}"
name: "Test connectors: ${{ inputs.test-connectors-options || 'nightly build for Certified connectors' }} - on ${{ inputs.runs-on || 'conn-nightly-xlarge-runner' }}"
timeout-minutes: 720 # 12 hours
runs-on: ${{ inputs.runs-on || 'conn-nightly-xlarge-runner' }}
steps:
Expand All @@ -41,4 +41,4 @@ jobs:
sentry_dsn: ${{ secrets.SENTRY_AIRBYTE_CI_DSN }}
git_branch: ${{ steps.extract_branch.outputs.branch }}
github_token: ${{ secrets.GH_PAT_MAINTENANCE_OCTAVIA }}
subcommand: "connectors ${{ inputs.test-connectors-options || '--concurrency=8 --release-stage=generally_available --release-stage=beta' }} test"
subcommand: "connectors ${{ inputs.test-connectors-options || '--concurrency=8 --support-level=certified' }} test"
8 changes: 4 additions & 4 deletions .github/workflows/connectors_weekly_build.yml
Original file line number Diff line number Diff line change
Expand Up @@ -11,14 +11,14 @@ on:
default: conn-nightly-xlarge-runner
required: true
test-connectors-options:
default: --concurrency=3 --release-stage=alpha
default: --concurrency=3 --support-level=community
required: true

run-name: "Test connectors: ${{ inputs.test-connectors-options || 'weekly build for Alpha connectors' }} - on ${{ inputs.runs-on || 'conn-nightly-xlarge-runner' }}"
run-name: "Test connectors: ${{ inputs.test-connectors-options || 'weekly build for Community connectors' }} - on ${{ inputs.runs-on || 'conn-nightly-xlarge-runner' }}"

jobs:
test_connectors:
name: "Test connectors: ${{ inputs.test-connectors-options || 'weekly build for Alpha connectors' }} - on ${{ inputs.runs-on || 'conn-nightly-xlarge-runner' }}"
name: "Test connectors: ${{ inputs.test-connectors-options || 'weekly build for Community connectors' }} - on ${{ inputs.runs-on || 'conn-nightly-xlarge-runner' }}"
timeout-minutes: 8640 # 6 days
runs-on: ${{ inputs.runs-on || 'conn-nightly-xlarge-runner' }}
steps:
Expand All @@ -41,4 +41,4 @@ jobs:
gcp_gsm_credentials: ${{ secrets.GCP_GSM_CREDENTIALS }}
git_branch: ${{ steps.extract_branch.outputs.branch }}
github_token: ${{ secrets.GH_PAT_MAINTENANCE_OCTAVIA }}
subcommand: "connectors ${{ inputs.test-connectors-options || '--concurrency=3 --release-stage=alpha' }} test"
subcommand: "connectors ${{ inputs.test-connectors-options || '--concurrency=3 --support-level=community' }} test"
Original file line number Diff line number Diff line change
Expand Up @@ -9,13 +9,33 @@

from connector_ops import utils

RELEASE_STAGE_TO_STRICTNESS_LEVEL_MAPPING = {"generally_available": "high"}
HIGH_TEST_STRICTNESS_LEVEL_THRESHOLD = {
"sl": 300,
"ql": 400,
}
BACKWARD_COMPATIBILITY_REVIEWERS = {"connector-operations", "connector-extensibility"}
TEST_STRICTNESS_LEVEL_REVIEWERS = {"connector-operations"}
GA_BYPASS_REASON_REVIEWERS = {"connector-operations"}
GA_CONNECTOR_REVIEWERS = {"gl-python"}
REVIEW_REQUIREMENTS_FILE_PATH = ".github/connector_org_review_requirements.yaml"

def is_important_connector(connector: utils.Connector) -> bool:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opted to use the idea of important connector which is either a high sl and/or high ql.

This is because we may have a connector that is deemed qualitatively important (SL) and we should enable high strictness in advance of a high QL.

defn: https://docs.google.com/document/d/161FF__wOjwS6j-1fVyHexDD1cJCQ5RJ2LMC8FI2N6yI/edit

The alternative is to use just QL.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would love to see this exposed as a Connector property as it might be useful in pipelines too.

"""Check if a connector requires higher scrutiny.

Args:
connector (utils.Connector): The connector to check.

Returns:
bool: True if the connector requires high test strictness level, False otherwise.
"""
if connector.ab_internal_sl >= HIGH_TEST_STRICTNESS_LEVEL_THRESHOLD["sl"]:
return True

if connector.ab_internal_ql >= HIGH_TEST_STRICTNESS_LEVEL_THRESHOLD["ql"]:
return True

return False


def find_connectors_with_bad_strictness_level() -> List[utils.Connector]:
"""Check if changed connectors have the expected connector acceptance test strictness level according to their release stage.
Expand All @@ -30,51 +50,49 @@ def find_connectors_with_bad_strictness_level() -> List[utils.Connector]:
connectors_with_bad_strictness_level = []
changed_connector = utils.get_changed_connectors(destination=False, third_party=False)
for connector in changed_connector:
expected_test_strictness_level = RELEASE_STAGE_TO_STRICTNESS_LEVEL_MAPPING.get(connector.release_stage)
can_check_strictness_level = all(
[item is not None for item in [connector.release_stage, expected_test_strictness_level, connector.acceptance_test_config]]
)
if can_check_strictness_level:
requires_high_test_strictness_level = is_important_connector(connector)
check_for_high_strictness = connector.acceptance_test_config is not None and requires_high_test_strictness_level
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if requires_high_test_strictness_level could be a Connector property too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

absolutely!

if check_for_high_strictness:
try:
assert connector.acceptance_test_config.get("test_strictness_level") == expected_test_strictness_level
assert connector.acceptance_test_config.get("test_strictness_level") == "high"
except AssertionError:
connectors_with_bad_strictness_level.append(connector)
return connectors_with_bad_strictness_level


def find_changed_ga_connectors() -> Set[utils.Connector]:
"""Find GA connectors modified on the current branch.
def find_changed_important_connectors() -> Set[utils.Connector]:
"""Find important connectors modified on the current branch.

Returns:
Set[utils.Connector]: The set of GA connectors that were modified on the current branch.
"""
changed_connectors = utils.get_changed_connectors(destination=False, third_party=False)
return {connector for connector in changed_connectors if connector.release_stage == "generally_available"}
return {connector for connector in changed_connectors if is_important_connector(connector)}


def get_ga_bypass_reason_changes() -> Set[utils.Connector]:
"""Find GA connectors that have modified bypass_reasons.
def get_bypass_reason_changes() -> Set[utils.Connector]:
"""Find connectors that have modified bypass_reasons.

Returns:
Set[str]: Set of connector names e.g {"source-github"}: The set of GA connectors that have changed bypass_reasons.
"""
bypass_reason_changes = utils.get_changed_acceptance_test_config(diff_regex="bypass_reason")
return bypass_reason_changes.intersection(find_changed_ga_connectors())
return bypass_reason_changes.intersection(find_changed_important_connectors())


def find_mandatory_reviewers() -> List[Union[str, Dict[str, List]]]:
ga_connector_changes = find_changed_ga_connectors()
important_connector_changes = find_changed_important_connectors()
backward_compatibility_changes = utils.get_changed_acceptance_test_config(diff_regex="disable_for_version")
test_strictness_level_changes = utils.get_changed_acceptance_test_config(diff_regex="test_strictness_level")
ga_bypass_reason_changes = get_ga_bypass_reason_changes()
ga_bypass_reason_changes = get_bypass_reason_changes()

if backward_compatibility_changes:
return [{"any-of": list(BACKWARD_COMPATIBILITY_REVIEWERS)}]
if test_strictness_level_changes:
return [{"any-of": list(TEST_STRICTNESS_LEVEL_REVIEWERS)}]
if ga_bypass_reason_changes:
return [{"any-of": list(GA_BYPASS_REASON_REVIEWERS)}]
if ga_connector_changes:
if important_connector_changes:
return list(GA_CONNECTOR_REVIEWERS)
return []

Expand All @@ -83,7 +101,7 @@ def check_test_strictness_level():
connectors_with_bad_strictness_level = find_connectors_with_bad_strictness_level()
if connectors_with_bad_strictness_level:
logging.error(
f"The following GA connectors must enable high test strictness level: {connectors_with_bad_strictness_level}. Please check this documentation for details: https://docs.airbyte.com/connector-development/testing-connectors/connector-acceptance-tests-reference/#strictness-level"
f"The following connectors must enable high test strictness level: {connectors_with_bad_strictness_level}. Please check this documentation for details: https://docs.airbyte.com/connector-development/testing-connectors/connector-acceptance-tests-reference/#strictness-level"
)
sys.exit(1)
else:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -8,15 +8,35 @@

from connector_ops import utils

RELEASE_STAGES_TO_CHECK = ["generally_available", "beta"]
ALLOWED_HOST_THRESHOLD = {
"sl": 200,
"ql": 300,
}

def _requires_allowed_hosts(connector: utils.Connector) -> bool:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"""Check if a connector requires allowed hosts.

Args:
connector (utils.Connector): The connector to check.

Returns:
bool: True if the connector requires allowed hosts, False otherwise.
"""
if connector.ab_internal_sl >= ALLOWED_HOST_THRESHOLD["sl"]:
return True

if connector.ab_internal_ql >= ALLOWED_HOST_THRESHOLD["ql"]:
return True

return False


def get_connectors_missing_allowed_hosts() -> List[utils.Connector]:
connectors_missing_allowed_hosts: List[utils.Connector] = []
changed_connectors = utils.get_changed_connectors(destination=False, third_party=False)

for connector in changed_connectors:
if connector.release_stage in RELEASE_STAGES_TO_CHECK:
if _requires_allowed_hosts(connector):
missing = not connector_has_allowed_hosts(connector)
if missing:
connectors_missing_allowed_hosts.append(connector)
Expand All @@ -31,7 +51,7 @@ def connector_has_allowed_hosts(connector: utils.Connector) -> bool:
def check_allowed_hosts():
connectors_missing_allowed_hosts = get_connectors_missing_allowed_hosts()
if connectors_missing_allowed_hosts:
logging.error(f"The following {RELEASE_STAGES_TO_CHECK} connectors must include allowedHosts: {connectors_missing_allowed_hosts}")
logging.error(f"The following connectors must include allowedHosts: {connectors_missing_allowed_hosts}")
sys.exit(1)
else:
sys.exit(0)
28 changes: 26 additions & 2 deletions airbyte-ci/connectors/connector_ops/connector_ops/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
from glob import glob
from pathlib import Path
from typing import List, Optional, Set, Tuple, Union
from pydash.objects import get

import git
import requests
Expand Down Expand Up @@ -272,9 +273,32 @@ def version_in_dockerfile_label(self) -> str:
def name_from_metadata(self) -> Optional[str]:
return self.metadata.get("name") if self.metadata else None


@property
def support_level(self) -> Optional[str]:
return self.metadata.get("supportLevel") if self.metadata else None

@property
def release_stage(self) -> Optional[str]:
return self.metadata.get("releaseStage") if self.metadata else None
def ab_internal_sl(self) -> int:
default_value = 100
sl_value = get(self.metadata, "ab_internal.sl")

if sl_value is None:
logging.warning(f"Connector {self.technical_name} does not have a support level defined in metadata.yaml. Defaulting to {default_value}")
return default_value

return sl_value

@property
def ab_internal_ql(self) -> int:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should already be defined but putting defaults here in case.

These should be converted to hard errors on the introduction of metadata v2 where ab_internal is required

default_value = 100
ql_value = get(self.metadata, "ab_internal.ql")

if ql_value is None:
logging.warning(f"Connector {self.technical_name} does not have a support level defined in metadata.yaml. Defaulting to {default_value}")
return default_value

return ql_value

@property
def allowed_hosts(self) -> Optional[List[str]]:
Expand Down
2 changes: 1 addition & 1 deletion airbyte-ci/connectors/connector_ops/tests/conftest.py
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,7 @@ def dummy_qa_report() -> pd.DataFrame:
"connector_technical_name": "source-test",
"connector_definition_id": "foobar",
"connector_version": "0.0.0",
"release_stage": "alpha",
"support_level": "community",
"is_on_cloud": False,
"is_appropriate_for_cloud_use": True,
"latest_build_is_successful": True,
Expand Down
12 changes: 6 additions & 6 deletions airbyte-ci/connectors/connector_ops/tests/test_qa_checks.py
Original file line number Diff line number Diff line change
Expand Up @@ -146,14 +146,14 @@ def test_check_connector_https_url_only_all_connectors():
if not qa_checks.check_connector_https_url_only(connector):
failing_connectors.append(connector)
if failing_connectors:
by_release_stage = {}
by_ab_internal_sl = {}
for failing_connector in failing_connectors:
by_release_stage.setdefault(failing_connector.release_stage, [])
by_release_stage[failing_connector.release_stage].append(failing_connector)
by_ab_internal_sl.setdefault(failing_connector.ab_internal_sl, [])
by_ab_internal_sl[failing_connector.ab_internal_sl].append(failing_connector)
failure_message = ""
for release_stage in by_release_stage.keys():
failure_message += f"\nFailing {release_stage} connectors:\n"
for connector in by_release_stage[release_stage]:
for ab_internal_sl in by_ab_internal_sl.keys():
failure_message += f"\nFailing SL {ab_internal_sl} connectors:\n"
for connector in by_ab_internal_sl[ab_internal_sl]:
failure_message += f"\t- {connector.technical_name}\n"
pytest.fail(failure_message)

Expand Down
4 changes: 2 additions & 2 deletions airbyte-ci/connectors/connector_ops/tests/test_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,13 +42,13 @@ def test_init(self, connector, exists, mocker, tmp_path):

if exists:
assert isinstance(connector.metadata, dict)
assert isinstance(connector.release_stage, str)
assert isinstance(connector.support_level, str)
assert isinstance(connector.acceptance_test_config, dict)
assert connector.icon_path == Path(f"./airbyte-config-oss/init-oss/src/main/resources/icons/{connector.metadata['icon']}")
assert len(connector.version.split(".")) == 3
else:
assert connector.metadata is None
assert connector.release_stage is None
assert connector.support_level is None
assert connector.acceptance_test_config is None
assert connector.icon_path == Path(f"./airbyte-config-oss/init-oss/src/main/resources/icons/{connector.name}.svg")
with pytest.raises(FileNotFoundError):
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -100,79 +100,6 @@ def compute_registry_overrides(merged_df):
return registries


def merge_into_metadata_definitions(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it going away?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an artifact from when we transformed the old registry into the metadata files at the very begining.

Because we no longer do that, this function is unused, it just happened to be missed by the grim reaper last time

id_field: str, connector_type: str, oss_connector_df: pd.DataFrame, cloud_connector_df: pd.DataFrame
) -> List[PartialMetadataDefinition]:
"""Merges the OSS and Cloud connector metadata into a single metadata definition.

Args:
id_field (str): The field that uniquely identifies a connector.
connector_type (str): The type of connector (source or destination).
oss_connector_df (pd.DataFrame): The dataframe containing the related OSS connector in the registry.
cloud_connector_df (pd.DataFrame): The dataframe containing the related Cloud connector in the registry.

Returns:
pd.Series: The merged metadata definition.
"""
merged_connectors = pd.merge(
oss_connector_df, cloud_connector_df, on=id_field, how="outer", suffixes=(OSS_SUFFIX, CLOUD_SUFFIX), indicator=True
)

# Replace numpy nan values with None
sanitized_connectors = merged_connectors.replace([np.nan], [None])

def build_metadata(connector_registry_entry: dict) -> PartialMetadataDefinition:
"""Builds the metadata definition for a single connector.

Args:
connector_registry_entry (dict): The merged connector metadata from the existing json registries.

Returns:
PartialMetadataDefinition: The final metadata definition.
"""
raw_data = {
"name": get_field_with_fallback(connector_registry_entry, "name"),
"definitionId": connector_registry_entry[id_field],
"connectorType": connector_type,
"dockerRepository": get_field_with_fallback(connector_registry_entry, "dockerRepository"),
"githubIssueLabel": get_field_with_fallback(connector_registry_entry, "dockerRepository").replace("airbyte/", ""),
"dockerImageTag": get_field_with_fallback(connector_registry_entry, "dockerImageTag"),
"icon": get_field_with_fallback(connector_registry_entry, "icon"),
"supportUrl": get_field_with_fallback(connector_registry_entry, "documentationUrl"),
"connectorSubtype": get_field_with_fallback(connector_registry_entry, "sourceType"),
"releaseStage": get_field_with_fallback(connector_registry_entry, "releaseStage"),
"license": "MIT",
"supportsDbt": get_field_with_fallback(connector_registry_entry, "supportsDbt"),
"supportsNormalization": get_field_with_fallback(connector_registry_entry, "supportsNormalization"),
"allowedHosts": get_field_with_fallback(connector_registry_entry, "allowedHosts"),
"normalizationConfig": get_field_with_fallback(connector_registry_entry, "normalizationConfig"),
"suggestedStreams": get_field_with_fallback(connector_registry_entry, "suggestedStreams"),
"resourceRequirements": get_field_with_fallback(connector_registry_entry, "resourceRequirements"),
}

# remove none values
data = {k: v for k, v in raw_data.items() if v is not None}

metadata = {"metadataSpecVersion": "1.0", "data": data}

registries = compute_registry_overrides(connector_registry_entry)
metadata["data"]["registries"] = registries

return PartialMetadataDefinition.construct(**metadata)

metadata_list = [build_metadata(connector_registry_entry) for _, connector_registry_entry in sanitized_connectors.iterrows()]

return metadata_list


def validate_metadata(metadata: PartialMetadataDefinition) -> tuple[bool, str]:
try:
ConnectorMetadataDefinitionV0.parse_obj(metadata)
return True, None
except Exception as e:
return False, str(e)


# ASSETS


Expand Down
Loading