metadata-service[orchestrator]: generate connector registry with release candidates #44588
Conversation
```diff
@@ -59,7 +59,7 @@ def _convert_json_to_metrics_dict(jsonl_string: str) -> dict:

 @asset(required_resource_keys={"latest_metrics_gcs_blob"}, group_name=GROUP_NAME)
 @sentry.instrument_asset_op
-def latest_connnector_metrics(context: OpExecutionContext) -> dict:
+def latest_connector_metrics(context: OpExecutionContext) -> dict:
```
fixing typo
```diff
@@ -102,27 +102,36 @@ def github_metadata_definitions(context):
     return Output(metadata_definitions, metadata={"preview": [md.json() for md in metadata_definitions]})


+def entry_should_be_on_gcs(metadata_entry: LatestMetadataEntry) -> bool:
```
Readability helper. We might eventually want to change the stale metadata detection logic to get alerts when an RC does not end up in the registry.
@alafanechere I agree, we definitely want to know if a release didn't start rolling out.
Is this function specifically referring to whether the entry should be in the "latest" bucket?
Yes. I'll file a different issue to update the stale metadata detection to consider release candidates.
```diff
@@ -330,29 +331,49 @@ def persist_registry_entry_to_json(
     return file_handle


+def generate_registry_entry(
```
I took this out of `generate_and_persist_registry_entry` to create nested registry entries for release candidates.
```python
raw_entry_dict = metadata_to_registry_entry(metadata_entry, registry_name)
registry_entry_with_spec = apply_spec_to_registry_entry(raw_entry_dict, spec_cache, registry_name)

_, ConnectorModel = get_connector_type_from_registry_entry(registry_entry_with_spec)

registry_model = ConnectorModel.parse_obj(registry_entry_with_spec)
```
Moved to `generate_registry_entry`.
```python
metadata_entry: Optional[LatestMetadataEntry],
release_candidate_metadata_entries: Optional[List[LatestMetadataEntry]],
```
Here, `metadata_entry` corresponds to a metadata entry partition; `release_candidate_metadata_entries` is an asset holding all RC metadata. Matching a metadata entry to its release candidate metadata file is done in `find_release_candidate_for_metadata_entry`.

@bnchrch I'm not 💯 sure this is the optimal approach if `release_candidate_metadata_entries` keeps growing. Would there be a mechanism, leveraging partitions, which could make this `registry_entry` function take a `metadata_entry` and a `release_candidate_metadata_entry` partition? In other words, the matching would happen elsewhere, at partition generation. I'm not sure what happens if an asset takes inputs from two different partition definitions.
Hmm yeah, so this is awkward because (I think) if:

- an RC metadata file gets removed or added,
- then all registry entries are reprocessed.

(Can you confirm I'm right about that?)

If I am, then we have three options to fix it:

1. We create multiple partition keys, so that the registry entry partition key is a composite of the latest/version etag and the RC etag. This will be messy and complex; I don't recommend it.
2. On new RC metadata, after processing, we delete the partition key for its corresponding latest entry to force a reprocess. This also feels wrong.
3. We hoist the release candidate processing to the registry level, since we want that regenerated on each new RC anyway.

I think #3 makes the most sense.
Oh yes, you are right. The AutoMaterialization leads to re-generation of all entries when `release_candidate_metadata_entries` is re-materialized... This is definitely not what we want.

> We hoist the release candidate processing to the registry level,

I'm not sure I get what you mean there. Aren't we already at the registry level, as this asset generates registry entries? You might mean to do it when we aggregate per-connector registry entries into the global registry. That does sound possible. Will do it.
I'm breaking the idempotency guarantee Dagster provides here by creating unique run keys. The default behavior of Dagster is that a RunRequest with an already-seen run key is skipped. I had to work around that because I want `refresh_release_candidate_metadata_entries` to be rematerialized on metadata file deletion. Metadata deletion will lead to a different cursor value, but that value will very likely be one which has been seen in the past: the run key would match an existing one and the RunRequest would not happen.

@bnchrch let me know if you have a different approach to suggest. This one feels slightly hacky.
> but will always likely be a cursor value which has been seen in the past

Can you elaborate on this? I'm sure you're right but I'm not seeing why this would be.
I believe you've got it right.

It does feel hacky, but it seems to be exactly what Dagster does for scheduled runs (e.g. appending a timestamp to ensure it always runs), and I don't see a way around this with sensors. And I feel protected since we skip when they are the same.

But because it feels hacky, let's:

- use a better name for our flag to explain the intent (e.g. `allow_duplicate_runs`, `allow_previous_runs`, etc.)
- add a comment explaining what / why we do this.
> Can you elaborate on this? I'm sure you're right but I'm not seeing why this would be.

@clnoll

- Our sensor checking whether a GCS path has changed compares a cursor value (a hash of a list of blob etags, stored in the Dagster context) to a hash of the list of blobs currently on GCS.
- If the hash value changes, the cursor changes and the sensor returns a RunRequest.
- A `run_key` is passed to the RunRequest; this `run_key` is set to the cursor value.
- Dagster does not propagate a RunRequest if it has seen the same `run_key` value previously.
- Let's say the sensor is computed for a list of blobs before an upload and the cursor value is `A` → `run_key="A"`.
- Then we upload a new blob: the cursor value is `B` → `run_key="B"`. The sensor returns the RunRequest as the cursor changed, and Dagster schedules it as `run_key="B"` is a new value.
- If we then delete the previously uploaded blob, the cursor value will be `A` again (as the hash of the list of blobs is the same as before the upload) → `run_key="A"`. The sensor returns the RunRequest as the cursor changed from `B` to `A`, but as Dagster has already scheduled a run with `run_key="A"` it skips it.
A few comments & questions @alafanechere. I think I could use a synchronous conversation about this one, too, to make sure I get the gist.
```yaml
patternProperties:
  "^\\d+\\.\\d+\\.\\d+$":
    $ref: "#/definitions/VersionBreakingChange"
VersionBreakingChange:
```
This corresponds to an existing breaking changes object, right? If so I think we need to be able to share the code instead of duplicating it. It feels too critical to risk letting them get out of sync.
@clnoll Oh yes, that's true. Thanks for spotting it. I'll create a single shared definition.
```python
resources_def=METADATA_RESOURCE_TREE,
gcs_blobs_resource_key="release_candidate_metadata_file_blobs",
interval=60,
unique_run_key=True,
```
Just curious - why do we want a unique run key here when we don't need it for the other sensors?
Because we want asset rematerialization to happen on deletion of a blob. Cf. #44588 (comment)
```diff
 ]

 SCHEDULES = [
     ScheduleDefinition(job=add_new_metadata_partitions, cron_schedule="*/2 * * * *", tags={"dagster/priority": HIGH_QUEUE_PRIORITY}),
     ScheduleDefinition(
-        cron_schedule="0 1 * * *",  # Daily at 1am US/Pacific
+        cron_schedule="*/2 * * * *",  # Every 2 minutes
```
What's the reason for this change?
```python
if metadata_entry.metadata_definition.data.supportLevel == "archived":
    return False
if getattr(metadata_entry.metadata_definition.releases, "isReleaseCandidate", False):
    return False
```
Is this accurate? I think the source of truth for whether an entry is still being rolled out (and therefore shouldn't be in the latest bucket) is whether there is an RC in the RC registry.
@clnoll the stale metadata detection compares metadata on `master` to metadata in the `latest` directory. If we have a release candidate on `master`, it will not be in `latest` until rollout finalization. This is why I ignored them. It's not great, but it should not lead to false positives until we rework this stale metadata detection logic.
Okay cool, would you mind adding a comment here?
```diff
@@ -23,6 +23,13 @@
     partitions_def=registry_entry.metadata_partitions_def,
 )

+release_candidate_metadata_entries_inclusive = AssetSelection.keys("release_candidate_metadata_entries").upstream()
```
What is the "inclusive" referring to here?
@alafanechere I'm worried about case 5. If I'm reading it right, the DX is that after merge my connector publish will fail if a release candidate already exists. Perhaps it's better if we just overwrite and have the platform reset the rollout process?
```python
raw_entry_dict = metadata_to_registry_entry(metadata_entry, registry_name)
registry_entry_with_spec = apply_spec_to_registry_entry(raw_entry_dict, spec_cache, registry_name)

if release_candidate_metadata_entries:
```
💅 Since we're in the spirit of helper functions for readability, I'd recommend we do the same here, e.g. `def apply_release_candidates`.
Whoops, old comment.
@bnchrch I'll discuss this with @clnoll. It's out of the scope of this PR, as this logic is on the …
```diff
     group_name=GROUP_NAME,
 )
 @sentry.instrument_asset_op
-def persisted_oss_registry(context: OpExecutionContext, latest_connnector_metrics: dict) -> Output[ConnectorRegistryV0]:
+def persisted_oss_registry(
```
@bnchrch I reworked this function signature to declare input assets separately instead of fetching per-registry-entry blobs inside the `persisted_oss_registry` asset. `generate_and_persist_registry` matches latest registry entries with RC entries + metadata. So the `releaseCandidates` key is added only on the global registry, not on the per-connector registry entry. This implements the earlier suggestion:

> We hoist the release candidate processing to the registry level, since we want that regenerated on each new RC anyway.

So now the logic is:

- The sensor on per-connector registry entry blob changes triggers the `generate_cloud_registry` asset job.
- If I'm not mistaken, this job's asset selection means that all the assets upstream of `persisted_oss_registry` will get rematerialized by the job.
- As `persisted_oss_registry` depends on the new assets I declared, it should get freshly rematerialized assets listing the existing release candidates.
```diff
-def persisted_oss_registry(context: OpExecutionContext, latest_connnector_metrics: dict) -> Output[ConnectorRegistryV0]:
+def persisted_oss_registry(
+    context: OpExecutionContext,
+    latest_connector_metrics: dict,
```
What do we expect to be in `latest_connector_metrics`?
These are metrics fetched from the data warehouse, which end up feeding fields related to usage metrics and sync success.
```diff
@@ -578,14 +600,55 @@ def metadata_entry(context: OpExecutionContext) -> Output[Optional[LatestMetadataEntry]]:
     return Output(value=metadata_entry, metadata=dagster_metadata)


+def find_release_candidates_for_metadata_entry(
```
Looks like this isn't used?
Thanks for catching this. It's not used since my latest refactoring.
Just a few questions and small stylistic changes! I imagine this will be ready on the next review!

Though I still do have a few minor-to-medium concerns.

The DX of releasing subsequent release candidates:
I think the system should plan to allow automatic rollouts of minor and patch RC versions without manual intervention by a dev. E.g. if I'm trying to release `4.1.1-rc` over `4.1.0-rc`, our system should resolve that without the dev having to run any special command.

We are no longer a DAG:
I'm still moderately concerned about our release DAG no longer being directed or acyclic, as the platform now has to call back to GitHub/Dagster to remove a metadata file. I think this structure will hurt more than we realize, and I'm not certain that gaining a more stable OSS connector release is worth it.
```python
if getattr(metadata_entry.metadata_definition.releases, "isReleaseCandidate", False):
    return False
if (
    datetime.datetime.strptime(metadata_entry.last_modified, "%a, %d %b %Y %H:%M:%S %Z").replace(tzinfo=datetime.timezone.utc)
```
💅 Our grace period calculation should live in its own function with some docs.
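A possible extraction, reusing the repo's `LatestMetadataEntry` type; the threshold is an assumption since it isn't visible in this hunk:

```python
import datetime

# Assumed value: the hunk above doesn't show the actual threshold.
PUBLISH_GRACE_PERIOD = datetime.timedelta(days=1)


def is_within_publish_grace_period(metadata_entry: "LatestMetadataEntry") -> bool:
    """Whether the metadata entry was modified so recently that its absence
    from the latest GCS directory is expected (its publish pipeline may still
    be running), so stale-metadata detection should not flag it."""
    last_modified = datetime.datetime.strptime(
        metadata_entry.last_modified, "%a, %d %b %Y %H:%M:%S %Z"
    ).replace(tzinfo=datetime.timezone.utc)
    return datetime.datetime.now(datetime.timezone.utc) - last_modified < PUBLISH_GRACE_PERIOD
```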
```diff
@@ -102,27 +102,41 @@ def github_metadata_definitions(context):
     return Output(metadata_definitions, metadata={"preview": [md.json() for md in metadata_definitions]})


+def entry_should_be_on_gcs(metadata_entry: LatestMetadataEntry) -> bool:
```
📚 A docstring explaining the why would be useful here!
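For instance (wording assumed, combining the checks visible in this PR):

```python
def entry_should_be_on_gcs(metadata_entry: LatestMetadataEntry) -> bool:
    """Whether a metadata entry found on master is expected to exist in the
    GCS `latest` directory.

    Archived connectors and release candidates are excluded: an RC only
    reaches `latest` once its rollout is finalized, so flagging it as stale
    would be a false positive. Recently modified entries are also skipped,
    as their publish pipeline may still be running.
    """
    if metadata_entry.metadata_definition.data.supportLevel == "archived":
        return False
    if getattr(metadata_entry.metadata_definition.releases, "isReleaseCandidate", False):
        return False
    if is_within_publish_grace_period(metadata_entry):  # sketched above
        return False
    return True
```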
```python
release_candidate_registry_entries: List,
release_candidate_metadata_entries: List,
```
❓ Why do we have both RC registry entries and metadata?
The RC metadata entry contains the `rolloutConfiguration`, which is not defined on the main version's registry entry. RC registry entries do not have the `rolloutConfiguration`, so we read it from the metadata entry.
Ah ok. Why don't we instead add `rolloutConfiguration` to the registry entry?
```python
enriched_registry_entry_dict = apply_metrics_to_registry_entry(registry_entry_dict, connector_type, latest_connector_metrics)
if (
    latest_registry_entry.dockerRepository in docker_repository_to_rc_metadata_entry
    and latest_registry_entry.dockerRepository in docker_repository_to_rc_registry_entry
```
💅 📚 I think we'd benefit from an `apply_release_candidate_entries` function to hold this logic.
```python
registry_entry_dict = to_json_sanitized_dict(latest_registry_entry)
enriched_registry_entry_dict = apply_metrics_to_registry_entry(registry_entry_dict, connector_type, latest_connector_metrics)
if (
    latest_registry_entry.dockerRepository in docker_repository_to_rc_metadata_entry
```
❓ If I'm reading this right, we're trying to catch whether the metadata has been deleted? Because the registry entry persists? I think we should instead make sure that when metadata files are deleted, so are registry entry files.
No, this logic is not meant to handle deletion.

- We create a mapping between connectors and their existing RC registry and metadata entries.
- We iterate over the `latest` registry entries and find connectors which have an RC.
- When an RC is found, we modify the `latest` registry entry to add a `releases.releaseCandidates.<rc-version>` field. The value of this field is a `VersionReleaseCandidate` object with two properties: `rolloutConfiguration` (coming from the RC metadata entry) and `registryEntry` (coming from the RC registry entry). The idea is to nest the RC entry under the `latest` registry entry, as sketched below.
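A sketch of the helper suggested above, under the mapping just described (the exact attribute/key access is assumed; `dockerImageTag` is the registry entry's version field):

```python
from typing import Dict


def apply_release_candidate_entries(
    latest_registry_entry_dict: Dict,
    rc_metadata_entry: Dict,  # RC metadata: provides releases.rolloutConfiguration
    rc_registry_entry: Dict,  # the RC's own registry entry
) -> Dict:
    """Nest an RC under the latest registry entry as
    releases.releaseCandidates.<rc-version> (a VersionReleaseCandidate)."""
    rc_version = rc_registry_entry["dockerImageTag"]
    release_candidates = latest_registry_entry_dict.setdefault("releases", {}).setdefault(
        "releaseCandidates", {}
    )
    release_candidates[rc_version] = {
        # rolloutConfiguration comes from the RC metadata entry...
        "rolloutConfiguration": rc_metadata_entry["releases"]["rolloutConfiguration"],
        # ...registryEntry from the RC registry entry.
        "registryEntry": rc_registry_entry,
    }
    return latest_registry_entry_dict
```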
Ah ok! Sorry, definitely misread.
```python
@asset(required_resource_keys={"latest_cloud_registry_entries_file_blobs"}, group_name=GROUP_NAME)
@sentry.instrument_asset_op
def latest_cloud_registry_entries(context: OpExecutionContext) -> Output[List]:
```
💎 Well done
The current manual intervention a dev would take would be to set the …

Can you elaborate on this intuition ("I think this structure will hurt more than we realize")? Here are some principles we decided to follow: …

Finalization is implemented in this PR stack: #44876
@bnchrch I reworked the models and registry generation logic to remove the dependency of the global registry generation on RC metadata.
```diff
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "orchestrator"
 version = "0.4.1"
```
💅 We should run `poetry update metadata-service` to update the version in the lock file. IIRC there's an issue with Dagster deploy and caching if we don't do this...
```python
breaking_change, documentation_url, version
)
final_registry_releases["breakingChanges"] = breaking_changes
if metadata.get("releases", {}).get("rolloutConfiguration"):
```
💅 White space, good!
What

Closes https://github.com/airbytehq/airbyte-internal-issues/issues/9253

On `metadata.yaml` file upload to the GCS bucket under the path `<connector-name>/release_candidate/metadata.yaml`, we want the registry generation to pick the release candidate up: on the `latest` version of a connector we want to add a `releases.releaseCandidates` nested field which will list all active release candidates with their corresponding registry entry and rollout configuration.

How

Each release candidate is nested under the latest registry entry at `releases.releaseCandidates.<rc-version>.rolloutConfiguration` and `releases.releaseCandidates.<rc-version>.registryEntry`.

Review guide

`assets` …

User Impact

There should be no impact until we release the first RCs.

Registry generation update demo

📺 LOOM DEMO
Case 1: Normal release

Connector state
A new connector version is merged. Its metadata does not have a `releases.isReleaseCandidate` field.

`airbyte-ci connector publish` behavior
The connector image is published with the `4.2.0` and `latest` tags. Metadata files are uploaded to `airbyte-cloud-connector-metadata-service` at the following paths:
- /metadata/airbyte/source-airtable/4.2.0/metadata.yaml
- /metadata/airbyte/source-airtable/latest/metadata.yaml

Expected registry state
The registry entry for the connector does not have a `releases.releaseCandidates` field. The registry entry points to the `4.2.0` version.
version.Case 2: Release candidate
Connector state
A new connector version is merged. It's metadata have a
releases.isReleaseCandidate
field set totrue
.airbyte-ci connector publish
behavior4.3.0
. Nolatest
tag is pushed.airbyte-cloud-connector-metadata-service
to the following paths:/metadata/airbyte/source-airtable/4.2.0/metadata.yaml
/metadata/airbyte/source-airtable/release_candidate/metadata.yaml
Expected registry state
The registry entry points to the
4.2.0
version.The registry entry has
releases.releaseCandidate.4.3.0
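Roughly, the resulting `source-airtable` entry would look like this (field values and rollout configuration keys are illustrative):

```python
expected_registry_entry = {
    "dockerRepository": "airbyte/source-airtable",
    "dockerImageTag": "4.2.0",  # the registry still points to the main release
    "releases": {
        "releaseCandidates": {
            "4.3.0": {
                "rolloutConfiguration": {"enableProgressiveRollout": True},  # assumed keys
                "registryEntry": {"dockerImageTag": "4.3.0", "...": "..."},
            }
        }
    },
}
```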
Case 3: Release candidate promotion to main release

Connector state
The connector `4.3.0` is a release candidate. The rollout orchestrator determined that the release candidate is stable and should be promoted to the main release. The rollout orchestrator triggers a workflow to promote the release candidate to the main release.

`airbyte-ci connector publish` behavior
Not yet implemented. We could expose a `--promote` flag to the `airbyte-ci connector publish` command to promote a release candidate to the main release. Example: `airbyte-ci connector --name=source-airtable publish --promote`

If the connector is a release candidate it would:
- push the `latest` tag to DockerHub
- copy /metadata/airbyte/source-airtable/release_candidate/metadata.yaml to /metadata/airbyte/source-airtable/latest/metadata.yaml

Expected registry state
The registry entry points to the `4.3.0` version. The registry entry has no `releases.releaseCandidates` field.

Case 4: Rolling back a release candidate
Connector state
The connector `4.3.0` is a release candidate. The rollout orchestrator determined that the release candidate is not stable and should be rolled back.

`airbyte-ci connector publish` behavior
Not yet implemented. We could expose a `--rollback` flag to the `airbyte-ci connector publish` command to roll back a release candidate. Example: `airbyte-ci connector --name=source-airtable publish --rollback`

If the connector is a release candidate it would:
- remove the `4.3.0` tag from DockerHub
- delete /metadata/airbyte/source-airtable/release_candidate/metadata.yaml and /metadata/airbyte/source-airtable/4.3.0/metadata.yaml from the metadata service

Expected registry state
The registry entry points to the `4.2.0` version. The registry entry has no `releases.releaseCandidates` field.

[TO BE IMPLEMENTED] Case 5: Connector publish attempt when a release candidate is already published
Connector state
The connector `4.3.0` is a release candidate. A developer attempts to publish a new version `4.4.0`.

`airbyte-ci connector publish` behavior
The command fails with an error message. This is managed via metadata validation: the connector registry is fetched and, if a release candidate is already published, the command fails.

Expected registry state
No registry change, as the publish command failed.
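A minimal sketch of that validation, assuming the registry entry shape from this PR (the function name and error wording are hypothetical):

```python
def validate_no_active_release_candidate(latest_registry_entry: dict) -> None:
    """Fail a publish attempt when the fetched registry already lists an
    active release candidate for this connector (Case 5)."""
    active_rcs = latest_registry_entry.get("releases", {}).get("releaseCandidates") or {}
    if active_rcs:
        raise ValueError(
            f"A release candidate is already published: {', '.join(sorted(active_rcs))}. "
            "Finalize or roll back the current release candidate before publishing a new version."
        )
```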