[Low-Code CDK] Write the component schema and use it during manifest validation #20422

Merged
merged 52 commits into from
Dec 19, 2022

Conversation

brianjlai
Contributor

@brianjlai brianjlai commented Dec 13, 2022

Closes #19759

What

Catherine, Maxime, and I handwrote a JSON schema that documents all the available components in the low-code language and their relationships to one another. This handwritten schema will then serve as the source of truth for the generated Pydantic models, as a reference document for low-code developers, and hopefully for the UI.

It is also incorporated into the manifest validation flow.

How

A majority of the work was writing the schema by hand. We went through this checklist to avoid duplicates and ensure that we had consistent formatting and styling for the YAML.

The other aspect was incorporating this into the validation flow. We read in the new schema YAML file and perform the same validator check against the incoming manifest. One nuance: because we want to deprecate the factory once the refactor is finished, the new validation flow doesn't use any part of the existing factory.py to modify the incoming manifest. That means we lose some of the behind-the-scenes preprocessing the manifest needs in order to adhere to the component schema.

This is handled by manifest_component_transformer.py, which adds certain fields to components so that they can be validated against our ideal schema:

  • Applies default types to components that are missing type and have a defined default value
  • Propagates $options dictionary to all subcomponents
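
For readers who want a feel for the two steps above, here is a minimal sketch. It is not the code in manifest_component_transformer.py: the `transform` function, the `DEFAULT_TYPES` table and its entries, and the rule that a child's own `$options` win over inherited ones are all assumptions made for illustration.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any, Dict, Optional

# Hypothetical table: "<ParentType>.<field>" -> type to assume when the
# subcomponent omits "type". The real defaults live in the CDK, not here.
DEFAULT_TYPES: Dict[str, str] = {
    "DeclarativeStream.retriever": "SimpleRetriever",
    "SimpleRetriever.requester": "HttpRequester",
}


def transform(
    component: Mapping[str, Any],
    parent_field: str = "",
    parent_options: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Return a copy of `component` with default types and $options filled in."""
    result = dict(component)

    # Step 1: apply a default type when "type" is missing and one is known.
    if "type" not in result:
        default_type = DEFAULT_TYPES.get(parent_field)
        if default_type is None:
            # Plain mapping (e.g. a request body), not a component: leave as-is.
            return result
        result["type"] = default_type

    # Step 2: merge inherited $options with the component's own.
    # Assumption: values set on the child override inherited ones.
    options = {**(parent_options or {}), **result.get("$options", {})}
    if options:
        result["$options"] = options

    # Recurse into subcomponents, including those held in lists.
    for field, value in component.items():
        if field == "$options":
            continue
        key = f"{result['type']}.{field}"
        if isinstance(value, Mapping):
            result[field] = transform(value, key, options)
        elif isinstance(value, list):
            result[field] = [
                transform(item, key, options) if isinstance(item, Mapping) else item
                for item in value
            ]
    return result


# Example input: only the top-level component declares a type and $options.
manifest_stream = {
    "type": "DeclarativeStream",
    "$options": {"name": "graphql", "path": "/"},
    "retriever": {
        "requester": {
            "url_base": "https://example.com",
            "authenticator": {"type": "BearerAuthenticator", "api_token": "token"},
        },
    },
}
transformed = transform(manifest_stream)
```

In this sketch the retriever and requester pick up the assumed default types, and every nested component receives the stream's `$options`, so each one can then be checked against a schema that requires `type`.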

Recommended reading order

  1. low_code_component_schema.yaml
  2. manifest_component_transformer.py
  3. manifest_declarative_source.py

brianjlai and others added 30 commits December 6, 2022 23:42
- ApiKeyAuthenticator
- BasicHttpAuthenticator
- BearerAuthenticator
- DeclarativeOauth2Authenticator
- NoAuth
…rm validation against the handwritten schema
@brianjlai
Contributor Author

brianjlai commented Dec 19, 2022

/publish-cdk dry-run=true

https://github.com/airbytehq/airbyte/actions/runs/3729823953

@brianjlai
Contributor Author

brianjlai commented Dec 19, 2022

/publish-cdk dry-run=true

🕑 https://github.com/airbytehq/airbyte/actions/runs/3729893236

@brianjlai
Contributor Author

brianjlai commented Dec 19, 2022

/publish-cdk dry-run=false

🕑 https://github.com/airbytehq/airbyte/actions/runs/3729962091

@brianjlai
Contributor Author

brianjlai commented Dec 19, 2022

/publish-cdk dry-run=true

🕑 https://github.com/airbytehq/airbyte/actions/runs/3734712600

@brianjlai
Contributor Author

brianjlai commented Dec 19, 2022

/publish-cdk dry-run=false

🕑 https://github.com/airbytehq/airbyte/actions/runs/3734807069

@brianjlai
Contributor Author

/approve-and-merge reason="This version of the airbyte-cdk has already been released (w/ a patch fix also done this morning) so merging this is a formality more than a release"

@octavia-approvington
Contributor

This is really good
simply the best

@octavia-approvington octavia-approvington merged commit b7113a2 into master Dec 19, 2022
@octavia-approvington octavia-approvington deleted the low_code_handwritten_component_schema branch December 19, 2022 20:42
@franzwilhelm

@octavia-approvington @brianjlai I started getting issues in my local low-code development workflow after this push. I'm currently running airbyte-cdk 0.16.1 (https://pypi.org/project/airbyte-cdk/0.16.1/).
With this authenticator config in my low-code YAML:

authenticator:
  type: "OAuthAuthenticator"
  token_refresh_endpoint: "{{ config['token_refresh_endpoint'] }}"
  client_id: "{{ config['client_id'] }}"
  client_secret: "{{ config['client_secret'] }}"
  refresh_token: ""
  refresh_request_body:
    audience: "{{ config['graphql_endpoint'] }}"
  grant_type: "client_credentials"

This is the output from python main.py spec:

Traceback (most recent call last):
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/manifest_declarative_source.py", line 155, in _validate_source
    validate(propagated_manifest, declarative_component_schema)
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/jsonschema/validators.py", line 934, in validate
    raise error
jsonschema.exceptions.ValidationError: 'OAuthAuthenticator' is not one of ['ApiKeyAuthenticator']

Failed validating 'enum' in schema[0]['properties']['type']:
    {'enum': ['ApiKeyAuthenticator'], 'type': 'string'}

On instance['type']:
    'OAuthAuthenticator'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/main.py", line 12, in <module>
    source = SourceGraphql()
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/source_graphql/source.py", line 18, in __init__
    super().__init__(**{"path_to_yaml": "graphql.yaml"})
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/yaml_declarative_source.py", line 21, in __init__
    super().__init__(source_config, debug)
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/manifest_declarative_source.py", line 62, in __init__
    self._validate_source()
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/manifest_declarative_source.py", line 157, in _validate_source
    raise ValidationError("Validation against json schema defined in declarative_component_schema.yaml schema failed") from e
jsonschema.exceptions.ValidationError: Validation against json schema defined in declarative_component_schema.yaml schema failed

But as soon as I remove the refresh_token key from my config, I get this:

Traceback (most recent call last):
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/main.py", line 12, in <module>
    source = SourceGraphql()
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/source_graphql/source.py", line 18, in __init__
    super().__init__(**{"path_to_yaml": "graphql.yaml"})
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/yaml_declarative_source.py", line 21, in __init__
    super().__init__(source_config, debug)
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/manifest_declarative_source.py", line 62, in __init__
    self._validate_source()
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/manifest_declarative_source.py", line 131, in _validate_source
    streams = [self._factory.create_component(stream_config, {}, False)() for stream_config in self._stream_configs()]
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/manifest_declarative_source.py", line 131, in <listcomp>
    streams = [self._factory.create_component(stream_config, {}, False)() for stream_config in self._stream_configs()]
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/parsers/factory.py", line 129, in create_component
    return self.build(
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/parsers/factory.py", line 146, in build
    updated_kwargs = {k: self._create_subcomponent(k, v, kwargs, config, class_, instantiate) for k, v in kwargs.items()}
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/parsers/factory.py", line 146, in <dictcomp>
    updated_kwargs = {k: self._create_subcomponent(k, v, kwargs, config, class_, instantiate) for k, v in kwargs.items()}
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/parsers/factory.py", line 209, in _create_subcomponent
    return self.create_component(definition, config, instantiate)()
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/parsers/factory.py", line 129, in create_component
    return self.build(
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/parsers/factory.py", line 146, in build
    updated_kwargs = {k: self._create_subcomponent(k, v, kwargs, config, class_, instantiate) for k, v in kwargs.items()}
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/parsers/factory.py", line 146, in <dictcomp>
    updated_kwargs = {k: self._create_subcomponent(k, v, kwargs, config, class_, instantiate) for k, v in kwargs.items()}
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/parsers/factory.py", line 209, in _create_subcomponent
    return self.create_component(definition, config, instantiate)()
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/parsers/factory.py", line 129, in create_component
    return self.build(
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/parsers/factory.py", line 146, in build
    updated_kwargs = {k: self._create_subcomponent(k, v, kwargs, config, class_, instantiate) for k, v in kwargs.items()}
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/parsers/factory.py", line 146, in <dictcomp>
    updated_kwargs = {k: self._create_subcomponent(k, v, kwargs, config, class_, instantiate) for k, v in kwargs.items()}
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/parsers/factory.py", line 200, in _create_subcomponent
    return self.create_component(definition, config, instantiate)()
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/parsers/factory.py", line 129, in create_component
    return self.build(
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/declarative/parsers/factory.py", line 167, in build
    validate(component_definition, schema)
  File "/Users/franz/Programming/aco.airbyte-cdk/airbyte-integrations/connectors/source-graphql/.venv/lib/python3.9/site-packages/jsonschema/validators.py", line 934, in validate
    raise error
jsonschema.exceptions.ValidationError: 'refresh_token' is a required property

Failed validating 'required' in schema['allOf'][1]:
    {'properties': {'_token_expiry_date': {},
                    'access_token_name': {'anyOf': [{'$ref': '#/definitions/InterpolatedString'},
                                                    {'type': 'string'}],
                                          'default': 'access_token'},
                    'client_id': {'anyOf': [{'$ref': '#/definitions/InterpolatedString'},
                                            {'type': 'string'}]},
                    'client_secret': {'anyOf': [{'$ref': '#/definitions/InterpolatedString'},
                                                {'type': 'string'}]},
                    'config': {'type': 'object'},
                    'expires_in_name': {'anyOf': [{'$ref': '#/definitions/InterpolatedString'},
                                                  {'type': 'string'}],
                                        'default': 'expires_in'},
                    'grant_type': {'anyOf': [{'$ref': '#/definitions/InterpolatedString'},
                                             {'type': 'string'}],
                                   'default': 'refresh_token'},
                    'refresh_request_body': {'type': 'object'},
                    'refresh_token': {'anyOf': [{'$ref': '#/definitions/InterpolatedString'},
                                                {'type': 'string'}]},
                    'scopes': {'items': {'type': 'string'},
                               'type': 'array'},
                    'token_expiry_date': {'anyOf': [{'$ref': '#/definitions/InterpolatedString'},
                                                    {'type': 'string'}]},
                    'token_expiry_date_format': {'type': 'string'},
                    'token_refresh_endpoint': {'anyOf': [{'$ref': '#/definitions/InterpolatedString'},
                                                         {'type': 'string'}]}},
     'required': ['token_refresh_endpoint',
                  'client_id',
                  'client_secret',
                  'refresh_token',
                  'config'],
     'type': 'object'}

On instance:
    {'$options': {'name': 'graphql', 'path': '/'},
     'client_id': "{{ config['client_id'] }}",
     'client_secret': "{{ config['client_secret'] }}",
     'config': {},
     'grant_type': 'client_credentials',
     'name': 'graphql',
     'path': '/',
     'refresh_request_body': {'audience': "{{ config['graphql_endpoint'] "
                                          '}}'},
     'token_refresh_endpoint': "{{ config['token_refresh_endpoint'] }}"}

Could you help me out here, or should I submit this as an issue?

@brianjlai
Contributor Author

hey @franzwilhelm thanks for pointing this out! This is an issue on our end and I see what the problem is. I'll start working on a fix for this

@franzwilhelm

My workaround for now is to pin airbyte-cdk to the 0.15.x series. Thanks!

MAIN_REQUIREMENTS = [
    "airbyte-cdk~=0.15.0",
]

@brianjlai
Contributor Author

hey @franzwilhelm I just pushed an update. Can you try 0.16.2 and see if that gets rid of your issue in the authenticator?

Labels
area/documentation Improvements or additions to documentation CDK Connector Development Kit
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Low-code CDK] Write the manifest component YAML schema
8 participants