Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for 'file' URI scheme to URL-based connectors #2873

Merged
merged 7 commits into from
Aug 11, 2022

Conversation

ptitzler
Copy link
Member

@ptitzler ptitzler commented Aug 9, 2022

Closes #2872

What changes were proposed in this pull request?

The built-in URL catalog connector, Apache Airflow package connector, and the Apache Airflow provider package connectors now support input of a URL that uses the file URI scheme in addition to the already supported HTTP and HTTPS schemes. No changes to the UI or validation are required.

The behavior is as follows:

  • no changes if the user enters an HTTP-based URL
  • user enters a URL that identifies an existing local file name (e.g. file://path/to/file) -> HTTP status code 200
  • user enters a URL that identifies an existing local directory name (e.g. file://i/am/a/dir) -> HTTP status code 400 (invalid input)
  • user enters a URL that identifies a non-existing local file name (e.g. file://no/such/file) -> HTTP status code 404 (not found)
  • URL connector: If a file-scheme URL is entered, the pipeline file is not portable because the complete URL is stored (file://path/to/file) just like it is done for the filesystem connector when bo base directory is specified.
  • Airflow package connector: If a file-scheme URL is entered, the pipeline file is portable because the file path is not stored. (The stored metadata is identical to the metadata that is stored if an HTTP-based URL is entered. )
  • Airflow provider package connector: If a file-scheme URL is entered, the pipeline file is portable because the file path is not stored. (The stored metadata is identical to the metadata that is stored if an HTTP-based URL is entered.)

How was this pull request tested?

  • Reviewed the output of make docs (user_guide/pipeline-components.html)
  • Added/updated server tests for new utility function/connectors
  • Tested the following scenarios
    • URL connector
      • configure connector using valid file://absolute/path/to/file.whl URL
      • configure connector using invalid file://absolute/path/to/no-such-file.whl URL (specifying a file that does not exist)
      • configure connector using invalid file://absolute/path/to/dir URL (specifying an existing directory)
    • Apache Airflow package catalog connector
      • configure connector using valid file://absolute/path/to/file.whl URL
      • configure connector using invalid file://absolute/path/to/no-such-file.whl URL (specifying a file that does not exist)
      • configure connector using invalid file://absolute/path/to/dir URL (specifying an existing directory)
    • Apache Airflow provider package catalog connector
      • configure connector using valid file://absolute/path/to/file.whl URL
      • configure connector using invalid file://absolute/path/to/no-such-file.whl URL (specifying a file that does not exist)
      • configure connector using invalid file://absolute/path/to/dir URL (specifying an existing directory)

Developer's Certificate of Origin 1.1

   By making a contribution to this project, I certify that:

   (a) The contribution was created in whole or in part by me and I
       have the right to submit it under the Apache License 2.0; or

   (b) The contribution is based upon previous work that, to the best
       of my knowledge, is covered under an appropriate open source
       license and I have the right under that license to submit that
       work with modifications, whether created in whole or in part
       by me, under the same open source license (unless I am
       permitted to submit under a different license), as indicated
       in the file; or

   (c) The contribution was provided directly to me by some other
       person who certified (a), (b) or (c) and I have not modified
       it.

   (d) I understand and agree that this project and the contribution
       are public and that a record of the contribution (including all
       personal information I submit with it, including my sign-off) is
       maintained indefinitely and may be redistributed consistent with
       this project or the open source license(s) involved.

@elyra-bot
Copy link

elyra-bot bot commented Aug 9, 2022

Thanks for making a pull request to Elyra!

To try out this branch on binder, follow this link: Binder

@ptitzler ptitzler added kind:enhancement New feature or request component:catalog connectors Access to component catalogs labels Aug 9, 2022
@ptitzler ptitzler added this to the 3.11.0 milestone Aug 9, 2022
Copy link
Member

@kevin-bates kevin-bates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. Is it possible to add a test or two for this? Are there any doc adjustments necessary?

@akchinSTC akchinSTC linked an issue Aug 9, 2022 that may be closed by this pull request
@ptitzler
Copy link
Member Author

This looks great. Is it possible to add a test or two for this? Are there any doc adjustments necessary?

Done! I've added server tests for the connectors and the new utility library. I've also updated the documentation.

@ptitzler ptitzler requested a review from kevin-bates August 10, 2022 09:07
@ptitzler ptitzler changed the title Add support for 'file' scheme to URL-based connectors Add support for 'file' URI scheme to URL-based connectors Aug 10, 2022
Copy link
Member

@kiersten-stokes kiersten-stokes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Super valuable for the Airflow packages - I noticed a big difference in load time

docs/source/user_guide/pipeline-components.md Outdated Show resolved Hide resolved
docs/source/user_guide/pipeline-components.md Outdated Show resolved Hide resolved
docs/source/user_guide/pipeline-components.md Outdated Show resolved Hide resolved
Copy link
Member

@kevin-bates kevin-bates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from the minor doc updates/suggestions, this looks great. Transport adapters are very cool!

ptitzler and others added 3 commits August 11, 2022 06:47
Co-authored-by: Kiersten Stokes <kierstenstokes@gmail.com>
Co-authored-by: Kiersten Stokes <kierstenstokes@gmail.com>
@ptitzler ptitzler merged commit a83c3aa into elyra-ai:main Aug 11, 2022
@ptitzler ptitzler deleted the file-url-scheme branch August 16, 2022 08:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:catalog connectors Access to component catalogs kind:enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Component catalog connectors: consider supporting the 'file' scheme
3 participants