Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stubs package prototype #472

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft

Stubs package prototype #472

wants to merge 1 commit into from

Conversation

honno
Copy link
Member

@honno honno commented Aug 15, 2022

Ok, the idea of this PR is to maybe resolve #411. A few things have come up that might make folk think ultimately a first-party package is a bad idea.

The ideal scenario would be uploading ./spec/API_specification/array_api/ as-is to PyPI, given it's a importable package, but as it is right now, the draft and 2012.12 packages/namespaces would look quite differently:

  • Package name has changed (signatures -> array_api)
  • Whether method signatures have the self param or not
  • __init__.py changes means importing looks different

Practically this is annoying for tooling built to work with all spec versions (i.e. array-api-tests), as those tools will need to write compatibility layers. Even if it'd be possible to update the spec-at-tags retroactively, that seems precarious given how we might have to play around with Sphinx too.

IMO then, having a package where we essentially write the compatibility layers is the best solution if we want consistency in how we access stubs across spec versions. Not only that, we can make accessing stubs quite ergonomic for end users by providing say a well-defined SimpleNamespace of the stubs for all spec versions—see the introduced array_api_stubs.py for what I envision it could look like... currently only supports draft, latest/2012.12 is TODO.

Note in terms of packaging, I'm not religious to how things are structured now, like I'd probably move array_api_stubs.py to its own folder containing a setup.py, and need to experiment with package data so this package is actually portable. Generally haven't played around with a release workflow yet, what I'd envision is a GitHub action that would push to PyPI automatically on PR merges, which looks fairly do-able (thank god for test PyPI 😅). Haven't properly tried this yet given y'all might not want a stubs package now...? And of-course please let me know if I'm missing something.

@asmeurer
Copy link
Member

Personally what I would do is:

  • Move the signatures to the top-level of this repo. Or to a new directory at the top-level called signatures.
  • Make the current signatures a submodule of the array_api package.
  • Copy the 2021 signatures as a separate submodule.

Something like

array_api/
  draft/
    __init__.py
    linalg.py
    _functions.py
    ...
  2021.12/
    __init__.py
    linalg.py
    _functions.py

If we want to include the helper categories, we need to put them somewhere that it is clear that they are not part of the array_api namespace itself.

@leofang
Copy link
Contributor

leofang commented Aug 15, 2022

The v2021 branch needs to be updated by backporting #430. If @kgryte does not have time I can give it a shot, but not this week.

@honno
Copy link
Member Author

honno commented Aug 16, 2022

Well we currently git tag releases, as opposed to have branches (no idea how git works here, but that's an important distinction I'd gather).

As stated above, I'm not sure we want to go through the hassle of meta-standardising every little thing about the spec? Which is why I propose we write compatibility layers instead, where we can metaphorically and literally codify what the changes are. I am also keen on making the package ergonomic to use, which just uploading the spec as-is is not.

@leofang
Copy link
Contributor

leofang commented Aug 16, 2022

Sorry for digression, apparently I misunderstood... But if we only rely on git tags, how do we generate docs for each version, especially when a fix needs to be applied to all versions?

@asmeurer
Copy link
Member

Well we currently git tag releases, as opposed to have branches (no idea how git works here, but that's an important distinction I'd gather).

That's why we have no choice but to copy over the spec.

As stated above, I'm not sure we want to go through the hassle of meta-standardising every little thing about the spec?

What do you mean "meta-standardising"?

Which is why I propose we write compatibility layers instead, where we can metaphorically and literally codify what the changes are. I am also keen on making the package ergonomic to use, which just uploading the spec as-is is not.

How is uploading the spec as-is not ergonomic?

Sorry for digression, apparently I misunderstood... But if we only rely on git tags, how do we generate docs for each version, especially when a fix needs to be applied to all versions?

That's a good point. I thought we discussed this, but I can't find where. I'm also surprised that we are just using tags instead of branches, for this same reason.

@honno
Copy link
Member Author

honno commented Aug 17, 2022

What do you mean "meta-standardising"?

Ahah, like standardising how we write the Array API standard 😅 The effort wouldn't just be in fixing the signatures, but making sure the respective Sphinx docs work correctly too.

How is uploading the spec as-is not ergonomic?

I like there being one source of truth, the actual signatures folder, and putting effort to "meta-standardise" that is reasonable... I just think it doesn't give us any flexibility to make something usable for any tooling. Any tooling would need to write what's in this PR anyway (similar to stubs.py) to actually get utility by organising the stubs in a sane way.

@asmeurer
Copy link
Member

asmeurer commented Aug 17, 2022

Maybe I'm missing something here, but how are you actually planning to package the stubs into a package? What you have now is just a file that references the actual stubs package, but it will need to be packaged up somehow.

My suggestion (moving the stubs to the top-level so that they can be package more easily) is a way to do that. I agree that having some extra variables in the package with different categories and so on could be useful, but I think it's less important than the actual packaging of the stubs themselves, so I would focus on that first. Again, I think my suggestion at #472 (comment) would do it. Then something like this file could be included, like array_api/metadata.py, for people who want to access the namespace in a more structured way.

I like there being one source of truth, the actual signatures folder, and putting effort to "meta-standardise" that is reasonable... I just think it doesn't give us any flexibility to make something usable for any tooling. Any tooling would need to write what's in this PR anyway (similar to stubs.py) to actually get utility by organising the stubs in a sane way.

There will still only be one source of truth. Basically what I'm suggesting is to run

mkdir -p stubs/array_api
touch stubs/setup.py
git mv spec/API_specification/signatures array_api/latest
ln -s array_api/latest spec/API_specification/signatures

(or alternatively replace the last command with updating the path in conf.py). This does also require copying the signatures from the 2021.12 tag to array_api/2021.12/, but I don't see any way around that since we really want everything to be together in one package (splitting the stubs across different versions of the package would be a bad idea IMO).

It really would have been cleaner if the spec versions were split by different subdirectories in this repo (and that's still something we can do).

@honno
Copy link
Member Author

honno commented Aug 18, 2022

but how are you actually planning to package the stubs into a package?

I hadn't worked it out yet, but I'm sure it's possible with pulling repos-at-tags as package data.

There will still only be one source of truth. Basically what I'm suggesting is to run

To clarify, I'm saying I don't think one source of truth in the repo is worth the effort, and not necessary. Any downstream tooling would need to write what's in this PR anyway (similar to stubs.py) to actually get utility by organising the stubs in a sane way.

@asmeurer
Copy link
Member

I still maintain that we should do the refactor I've suggested. You seem to be working off the assumption that the current status quo (signatures inside of the Sphinx directory for the current draft version only) is a good one, and that we should try to work around it. But I think don't think this is actually the case. The structure I described where the signatures are a first class package separate from the Sphinx directory is much simpler.

I hadn't worked it out yet, but I'm sure it's possible with pulling repos-at-tags as package data.

This sounds unnecessarily complicated. Why not just make the current signatures a package? They already basically are, just not in a location that makes them easily accessible. Any additional tooling on top of that can be further files within the same package.

Any downstream tooling would need to write what's in this PR anyway (similar to stubs.py) to actually get utility by organising the stubs in a sane way.

Just to be clear, I'm fine with the separate file that organizes things. My suggested structure is

stubs/
  setup.py
  array_api/ # This is the top-level package that gets shipped to pypi
    __init__.py # This would contain references to the different versions of the spec. 
    metadata.py # This the categories and stuff that are in your array_api_stubs.py
    draft/ # This is the current spec/API_specification/array_api
       __init__.py
       _functions.py
       _array.py
       ...
    2021.12/ # This is the spec/API_specification/signatures in the 2021.12 branch, except updated with some of the cleanups that have been made since then (like `self` param)
spec/
  API_specification/
    conf.py # Updated to point to ../../stubs/array_api/draft

@honno
Copy link
Member Author

honno commented Aug 19, 2022

You seem to be working off the assumption that the current status quo (signatures inside of the Sphinx directory for the current draft version only) is a good one, and that we should try to work around it. But I think don't think this is actually the case. The structure I described where the signatures are a first class package separate from the Sphinx directory is much simpler.

Any additional tooling on top of that can be further files within the same package.

Well rather my thinking was there's friction in changing the current way, given small changes require finnicky changes to the Sphinx docs. But yea thanks now I see we can provide a "public API" to access the stubs in a standardised way, in the top-level namespace.

My suggested structure is...

Happy to experiment with this and see changing our Sphinx config and docs isn't too annoying (mind I'm off for the next two weeks).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Plans to publish signatures to pypi?
3 participants