Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support well-defined ComponentDefinition return values for catalog connectors #2492

Merged
merged 11 commits into from
Mar 1, 2022

Conversation

kiersten-stokes
Copy link
Member

@kiersten-stokes kiersten-stokes commented Feb 18, 2022

Fixes #2460
Supports #2488

What changes were proposed in this pull request?

Defines a new EntryData object class that includes the definition string as well as any runtime-specific attributes (e.g. package_name for AirflowEntryData). A new function, get_entry_data is defined, which takes the place of read_catalog_entry, and returns an EntryData object. If no get_entry_data function is present for a given CCC, the return value of read_catalog_entry is manually coerced into a EntryData object.

A new CatalogEntry object is also defined, which holds in its bag of properties the EntryData object above, a unique id, information on the source of the component, and some catalog information (such as the relevant categories). This object is constructed after the EntryData object is returned by the CCC author and is passed directly to the parse methods, eliminating the need to define a SimpleNamespace object as a middleman.

There have also been some minor changes to the Component object, namely the inclusion of an optional package_name attribute and changing the same of the source_identifier attribute to component_reference.

During processing of Airflow components, the package_name for that component is now accessed directly (using the import_statement property) if it is available. This eliminates the need to manually change the available_airflow_operators configurable trait to include import statements for all custom components.

Changes are backwards compatible, so any implementations of CCC's using the previous version that doesn't return the well-defined object will still work properly (e.g. the url catalog). Built-in CCC's will have to be updated (in a separate PR) to use the get_entry_data.

For now, here is what you will need to do (@ptitzler) to implement the new metadata return values:

  • Change the name of the read_catalog_entry function to get_entry_data (parameters are the same)
  • The new return value is an EntryData object, and more specifically for the Airflow package connector use-case, an AirflowEntryData object, so:
    • Add the import statement (duh) from elyra.pipeline.catalog_connector import AirflowEntryData and from elyra.pipeline.catalog_connector import EntryData
    • Change type hint return value to Optional[EntryData]
    • The package name string should not include the class name (which should be the natural behavior)
    • Finally, instead of returning a string value, you'll now want to initialize an AirflowEntryData and return that instead:
return AirflowEntryData(definition=<definition string>, package_name=<package name string>)
  • Convert get_hash_keys to a classmethod by adding the decorator and changing the parameter from self to cls

Initial manual tests following the above structure are working.

How was this pull request tested?

Tests have been updated.
Successful and error scenarios have been manually tested.

Developer's Certificate of Origin 1.1

   By making a contribution to this project, I certify that:

   (a) The contribution was created in whole or in part by me and I
       have the right to submit it under the Apache License 2.0; or

   (b) The contribution is based upon previous work that, to the best
       of my knowledge, is covered under an appropriate open source
       license and I have the right under that license to submit that
       work with modifications, whether created in whole or in part
       by me, under the same open source license (unless I am
       permitted to submit under a different license), as indicated
       in the file; or

   (c) The contribution was provided directly to me by some other
       person who certified (a), (b) or (c) and I have not modified
       it.

   (d) I understand and agree that this project and the contribution
       are public and that a record of the contribution (including all
       personal information I submit with it, including my sign-off) is
       maintained indefinitely and may be redistributed consistent with
       this project or the open source license(s) involved.

@kiersten-stokes kiersten-stokes added the status:Work in Progress Development in progress. A PR tagged with this label is not review ready unless stated otherwise. label Feb 18, 2022
@kiersten-stokes kiersten-stokes added this to the 3.7.0 milestone Feb 18, 2022
@elyra-bot
Copy link

elyra-bot bot commented Feb 18, 2022

Thanks for making a pull request to Elyra!

To try out this branch on binder, follow this link: Binder

@kevin-bates
Copy link
Member

I think get_component_definition() should Return a ComponentDefinition instance, assuming there isn't such a class defined already.

We should also note the read_catalog_entry() is deprecated and will be removed in 4.0 and existing component connectors will continue to work until then.

@ptitzler
Copy link
Member

I've confirmed that this is working for the airflow package connector. #2488 contains test details.

Change type hint return value to CatalogEntry; note that it is not optional, so even if you aren't able to retrieve a definition, you will still return a CatalogEntry object (it will just have definition=None)

Could we make it optional? Seems more natural ...

@kevin-bates
Copy link
Member

Could we make it optional? Seems more natural ...

We can make the return value optional. However, it does introduce some ambiguity for backward-compatibility handling since we'll need to call the old method if the new method returns None because we're unable to determine if the new method is the user's implementation or the base class's (because we can't use an abstract method approach). And we'll need to test the existence of the old method prior to calling it as new implementations won't have implemented it. This is necessary if we want to release these changes within a major release.

@kiersten-stokes kiersten-stokes changed the title Support well-defined ComponentEntry return values for catalog connectors Support well-defined ComponentDefinition return values for catalog connectors Feb 21, 2022
@kiersten-stokes kiersten-stokes marked this pull request as ready for review February 21, 2022 22:33
@kiersten-stokes kiersten-stokes removed the status:Work in Progress Development in progress. A PR tagged with this label is not review ready unless stated otherwise. label Feb 21, 2022
Copy link
Member

@kevin-bates kevin-bates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry - got a few comments mostly around naming but I'm hoping we can make things easier for connector authors as well. I'd be happy to meet via Webex to go over things if that is easier.

elyra/pipeline/catalog_connector.py Outdated Show resolved Hide resolved
elyra/pipeline/catalog_connector.py Outdated Show resolved Hide resolved
elyra/pipeline/catalog_connector.py Outdated Show resolved Hide resolved
elyra/pipeline/catalog_connector.py Outdated Show resolved Hide resolved
elyra/pipeline/catalog_connector.py Outdated Show resolved Hide resolved
elyra/pipeline/catalog_connector.py Outdated Show resolved Hide resolved
elyra/pipeline/component.py Show resolved Hide resolved
elyra/pipeline/airflow/component_parser_airflow.py Outdated Show resolved Hide resolved
@kiersten-stokes
Copy link
Member Author

Resolving several review comments above after offline design discussion. Will update PR description to reflect new design once changes are pushed

Copy link
Member

@kevin-bates kevin-bates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really great @kiersten-stokes - thank you.

@ptitzler
Copy link
Member

The connectors in elyra-ai/examples#94 work as expected with this PR.

Copy link
Member

@ptitzler ptitzler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update the catalog connector's 'read_catalog_entry' signature to accept optional metadata
4 participants