Add pypi section to connector metadata #33529
Can we nest this under something like remotePackageIndex, which would have fields for pypi / maven / npm or whatever?
Thanks for the review @alafanechere, could you take another look?
I'd appreciate an additional review from @bnchrch .
Can we also add a validator to make sure that Pypi is only used for python connectors? (A metadata validation for a java connector would fail if your new fields are declared in there.)
        extra = Extra.forbid

    enabled: bool
What do you think about adding the url to the pypi package?
Seems in line with the "documentation url" and so on, added!
Since our goal here is to publish, and tooling generally expects a package name, could we call this the PyPi package_name? I think that would be my vote. (E.g. airbyte-source-apify instead of https://pypi.org/project/airbyte-source-apify/)
PyPi URL just isn't very helpful because (to my knowledge at least) the full URL can't be passed to pip install or to the publish operation.
Fwiw, I do see "pip url" used frequently, but that is generally an alias for a PyPi package name or a Git ref, if the package isn't on PyPi.
    pypi: Optional[Pypi] = None
Shall we go "polymorphic" here?
- pypi: Optional[Pypi] = None
+ index_name: # Would be an enum like Pypi | Maven etc
+ url: str # Would be the url to the package
As we will need different implementations for each of the package indices and maybe different options a while down the road, an explicit separate object seems more future proof, wdyt?
True, what do you think about using RemoteRegistries instead of RemotePackageIndexes:
remoteRegistries:
  pypi:
    url: <url to pypi package>
  DockerHub:
    url: <url to dockerHub images>
    imageAdress: dockerio.
  maven:
    ...
Dockerhub, maven etc. can come later of course.
No strong opinion on the wording, changing to remoteRegistries
Good point, added that. It's checking for the language tag.
from pydantic import BaseModel, Extra, Field


class Pypi(BaseModel):
- class Pypi(BaseModel):
+ class PyPi(BaseModel):
    class Config:
        extra = Extra.forbid

    pypi: Optional[Pypi] = None
- pypi: Optional[Pypi] = None
+ pypi: Optional[PyPi] = None
Makes sense to me @aaronsteers - adjusted.
    enabled: bool
    packageName: str = Field(..., description="The name of the package on PyPi.")
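Pieced together from the diff snippets in this thread, the resulting models might look roughly like the sketch below. The `RemoteRegistries` class name and the string form `extra = "forbid"` are assumptions made here for illustration (the diff uses `Extra.forbid`, and the wrapper class is not shown in the visible hunks), so treat this as a minimal sketch rather than the merged code.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class PyPi(BaseModel):
    class Config:
        # Same intent as `Extra.forbid` in the diff: reject unknown keys.
        extra = "forbid"

    enabled: bool
    packageName: str = Field(..., description="The name of the package on PyPi.")


class RemoteRegistries(BaseModel):
    # Hypothetical wrapper name; only the `pypi` field appears in the diff.
    class Config:
        extra = "forbid"

    pypi: Optional[PyPi] = None


# A connector that opts in to publishing.
registries = RemoteRegistries(
    pypi={"enabled": True, "packageName": "airbyte-source-apify"}
)
print(registries.pypi.packageName)

# Unknown keys such as `url` are rejected because extra fields are forbidden.
try:
    PyPi(enabled=True, packageName="airbyte-source-apify", url="https://pypi.org")
except ValidationError as error:
    print("rejected:", type(error).__name__)
```

Keeping one explicit object per registry, as discussed above, leaves room for registry-specific options later without changing the shape of the `pypi` section.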
For a followup PR once the publish to PyPi step is implemented:
- We should add a validator that makes sure that if enabled is True, the current connector version is available on PyPi.
It's basically what we do for docker images: before uploading the metadata file to GCS we validate that the docker image is available on DockerHub.
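One way the follow-up check could be sketched is with the public PyPI JSON API, which answers 200 for `https://pypi.org/pypi/<project>/<version>/json` when that release exists and 404 otherwise. The function name, its parameters, and the injectable `get_status` callable are hypothetical and chosen here so the check can be exercised without network access; the `(bool, error message)` return shape mirrors the `return True, None` seen in the existing validators.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional, Tuple


def _http_status(url: str) -> int:
    """Return the HTTP status code of a GET request to `url`."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 404 and other HTTP errors are reported as status codes, not raised.
        return error.code


def validate_version_on_pypi(
    package_name: str,
    version: str,
    enabled: bool,
    get_status: Callable[[str], int] = _http_status,  # hypothetical test seam
) -> Tuple[bool, Optional[str]]:
    """Pass when publishing is disabled or the release exists on PyPI."""
    if not enabled:
        return True, None
    url = f"https://pypi.org/pypi/{package_name}/{version}/json"
    if get_status(url) == 200:
        return True, None
    return False, f"{package_name}=={version} was not found on PyPi."
```

Network failures (for example `urllib.error.URLError`) are deliberately left to propagate in this sketch; a real validator would need to decide whether those count as a validation failure.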
You mean as a post-upload validator? If yes, that makes a ton of sense to me, thanks for mentioning.
Yes, when we run metadata_service validate <config file path>
Approving to not be blocking - 👍 Feel free to add validation that PyPi is only set for python connectors in a follow-up PR or this one.
Just got caught up on all this.
Great and practical discussion.
I've got nothing more to add besides LGTM!
@@ -151,3 +151,19 @@ The supported scope types are listed below.
| Scope Type | Value Type | Value Description |
|------------|------------|------------------|
| stream | `list[str]` | List of stream names |

#### `remoteRegistries`
💎
@@ -171,12 +171,29 @@ def validate_metadata_base_images_in_dockerhub(
    return True, None


def validate_pypi_only_for_python(
💎
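The body of `validate_pypi_only_for_python` is not visible in this hunk, so the following is only a guess at its general shape. It assumes the metadata is available as a plain dict with `tags` and `remoteRegistries` keys, that the language tag looks like `language:python`, and that merely declaring a `pypi` section on a non-Python connector counts as a failure; the real validator's arguments and messages may well differ.

```python
from typing import Any, Dict, Optional, Tuple

# Assumption: connectors declare their language with a tag of this form.
PYTHON_LANGUAGE_TAG = "language:python"


def validate_pypi_only_for_python(
    metadata: Dict[str, Any],  # hypothetical: plain dict instead of the real model
) -> Tuple[bool, Optional[str]]:
    """Return (True, None) when valid, otherwise (False, error message)."""
    registries = metadata.get("remoteRegistries") or {}
    if not registries.get("pypi"):
        # No pypi section declared: nothing to check.
        return True, None
    if PYTHON_LANGUAGE_TAG in metadata.get("tags", []):
        return True, None
    return False, "remoteRegistries.pypi can only be set for Python connectors."


pypi_section = {"pypi": {"enabled": True, "packageName": "airbyte-source-apify"}}
print(validate_pypi_only_for_python(
    {"tags": ["language:python"], "remoteRegistries": pypi_section}
))
print(validate_pypi_only_for_python(
    {"tags": ["language:java"], "remoteRegistries": pypi_section}
))
```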
To control the publishing of python connectors to pypi, this PR introduces a new flag to opt into this publishing.
The actual publishing logic as part of airbyte-ci will be implemented in a separate PR.