Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PROPOSAL] Core <> Spec/Protobuf CI/CD #653

Open
Tracked by #16787
amberzsy opened this issue Nov 4, 2024 · 5 comments
Open
Tracked by #16787

[PROPOSAL] Core <> Spec/Protobuf CI/CD #653

amberzsy opened this issue Nov 4, 2024 · 5 comments
Labels
enhancement New feature or request question Further information is requested

Comments

@amberzsy
Copy link
Contributor

amberzsy commented Nov 4, 2024

What/Why

What are you proposing?

There are bunch gaps for having E2E protobuf and API spec release:

  1. Open api spec not in sync with core API and lack of automation and enforcement
  2. Lack of validation API spec
  3. Unclear on Source of truth

Proposal

Source of truth

  • Source of truth is the spec, not protobuf
  • All OS clients (java, go, python, etc) will continue to be generated from the API Specs. In the future the OS clients will support both HTTP + GRPC transparently

Development flow

  1. The protos will reside in the same repository as the current API specs: opensearch-api-specification.

Snip20241114_2
2. The script to convert from specs to protobufs will also live in the opensearch-api-specification repository, but it will make calls to other repositories (for example the fork of opensearch-project/openapi-generator).
3. The opensearch-api-specification will be imported as a git submodule to the opensearch repo, where the Protobufs will need to be used.
4. If adding a new GRPC endpoint that is already implemented in the HTTP protocol:

  • Checkout the spec repo locally on your own branch/fork
  • In api-spec repo: Run spec to protobuf conversion script locally, to generate protobuf definition.
  • In core repo: 1) Update to point to your own branch 2) 'git submodule pull'
  • update server side code change
    Snip20241114_3
  1. On a PR to the opensearch repo, developer is required to provide the branchname that their spec changes are on (in the opensearch-api-specification repo).
  2. Then there would new CI jobs running to:
  • Workflow 1: Validation for this PR: CI runs the job on that branch to generate the protos, and use it for this PR (run unit tests/ integration tests).
  • Scenarios 1) If PR is only for GRPC, validate that the HTTP endpoint also implemented OR add annotation that this is GRPC-only and never HTTP .
  • Scenarios 2) If PR is only for HTTP, validate that the protos are generated by also have “not supported” comment.
  • Scenarios 3) if for both grpc and http (which should be the most cases), validate and run integration/ut for both http and grpc.
  1. To merge the PR, instead of clicking merge on Github, instead a maintainer would comment the command “bot merge”.
  • Workflow 2: Merging this PR -> "bot merge", which will trigger a workflow/job
    -- “Merge spec to main”
    -- Update the submodule in this PR with the latest spec main and protos
    -- Git merge
    (Problem: “accidentally click the merge button” -> maintainer will have to be aware )

Snip20241114_5

@dblock
Copy link
Member

dblock commented Nov 5, 2024

Thanks for opening this.

I think we should resolve what the source of truth for OpenSearch would be with this proposal. My understanding was (still is :)) that the spec in this repo is the OpenSearch API truth. While today it's reverse engineering the existing OpenSearch API because of the legacy, we see a future where the OpenSearch server-side API is generated from spec as well. If this is true, then protobufs are just another output from the source of truth, aka a build of API spec in this repo that outputs .proto files.

WDYT?

I opened #655 to express this.

@dblock dblock added enhancement New feature or request question Further information is requested and removed untriaged labels Nov 5, 2024
@amberzsy
Copy link
Contributor Author

amberzsy commented Nov 6, 2024

Thanks for opening this.

I think we should resolve what the source of truth for OpenSearch would be with this proposal. My understanding was (still is :)) that the spec in this repo is the OpenSearch API truth. While today it's reverse engineering the existing OpenSearch API because of the legacy, we see a future where the OpenSearch server-side API is generated from spec as well. If this is true, then protobufs are just another output from the source of truth, aka a build of API spec in this repo that outputs .proto files.

WDYT?

I opened #655 to express this.

open-api-spec in general is for REST/JSON, that's why propose have separate repo for proto. but not strong opinionated on this. i think we can start with keeping in same repo. But we still need to close the gap that, for protobuf, server side development has to define the protobuf first, which is unlike json, which we can do reverse engineering. That's why introduce the development flow above.

reg if spec the source of truth. imo, partially yes. for API schema, most likely, but for rpc etc, it would be different. another example would be, grpc response is very different from http. so most of response we cannot adopt or directly convert from spec. Or, .... we can make spec can support both protobuf and json with certain limitation/restriction etc.

Also, actually, let me break into separate proposals.

  1. proposal on how we can keep core <> spec in sync. (how and plan to fix the existing and stop gap).
  2. proposal on the protobuf conversion. (include tooling and the rules we build).
  3. proposal on protobuf release/development cycle. (maybe can merge into 1. )

@dblock
Copy link
Member

dblock commented Nov 6, 2024

I suppose I am still stuck in the idea that the spec is the only API, but it sounds like you want the Protobuf spec to be potentially very different? Let's think from a developer POV (related to 1), if I am adding a new API, say "_refresh", how do you see one do it in a world where it's both a REST API and a protobuf API?

@guptashubham
Copy link

i agree with DB that spec is the only source of truth. However, spec needs to evolve as it will require versioning support (with the binary) among other things. I'd also expect spec to be the source of truth for API documentation and we can build it over time.

@amberzsy amberzsy changed the title [PROPOSAL] Protobuf Automation [PROPOSAL] Core <> Spec/Protobuf Automation Nov 15, 2024
@amberzsy
Copy link
Contributor Author

amberzsy commented Nov 15, 2024

updated the proposal.

I suppose I am still stuck in the idea that the spec is the only API, but it sounds like you want the Protobuf spec to be potentially very different?

per our offline discussion. the Spec would be the SoT. All clients and Probobuf would be generated from Specs. for grpc response, we would do some loose mapping based on https://grpc.io/docs/guides/status-codes/. (so far looks doable. )

Let's think from a developer POV (related to 1), if I am adding a new API, say "_refresh", how do you see one do it in a world where it's both a REST API and a protobuf API?

mentioned above. basically, develop would first update the spec with new API definition, certain scrip would auto generate the protobuf definition. In the PR process, if prefer have http server impl first, it can configure with HTTP only. so it would still generate the proto but have annotation with "grpc unsupported". same process for grpc impl only.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants