@nastra
Contributor

@nastra nastra commented Mar 12, 2024

This introduces a capabilities field to the response of the /config endpoint, signaling what is being supported by the server.
The current capabilities that come to mind are:

  • pagination
  • scan-planning
  • views
  • vended-credentials
  • remote-signing
  • oauth2
  • sigv4

Capabilities like scan-planning / views / remote-signing / oauth2 would indicate that certain endpoints are implemented by the server and can be safely called by clients. Other capabilities would indicate that a client may use a certain query param or send a particular header to the server.
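
As a rough illustration of the idea above, a client could read the declared capabilities from the /config response like this. This is only a sketch: the `capabilities` field name comes from this proposal, while the sample payload, the helper name `parse_capabilities`, and the choice to treat a missing field as "no capabilities" are assumptions made for the example.

```python
import json

# Hypothetical /config response body. Only the "capabilities" field name
# comes from this proposal; the rest of the payload is illustrative.
SAMPLE_CONFIG = (
    '{"defaults": {}, "overrides": {}, '
    '"capabilities": ["views", "remote-signing"]}'
)


def parse_capabilities(config_body: str) -> frozenset:
    """Return the set of capabilities a server declared, or an empty set."""
    config = json.loads(config_body)
    # Assumption: a server that predates the field simply omits it.
    return frozenset(config.get("capabilities", []))


capabilities = parse_capabilities(SAMPLE_CONFIG)
print("views" in capabilities)          # True
print("scan-planning" in capabilities)  # False
```

Keeping the result as a set of plain strings means a later membership check is all a client needs before calling an optional endpoint.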

Using tags allows endpoints to be grouped in the Swagger UI:

[Screenshot: endpoints grouped by tag in the Swagger UI]

For scan-planning it's not clear yet whether we'd like to only have scan-planning or whether there should also be scan-pre-planning, so it's best to introduce this capability as part of #9695 (cc @rahil-c)


tags:
  - name: remote-signing
    description: Requires server to support remote signing
Contributor

@danielcweeks, @nastra, @jackye1995, if we add some form of capabilities to express what a REST service supports, then maybe we should merge the signer spec into the main REST catalog spec. Then everything is in one place and options are clear.

Contributor Author

I agree, I think it's probably good to merge those two specs together.

Contributor

I commented elsewhere. Merging them makes sense to me, but let's not do it in this PR. We can take care of it in a follow-up.

@rdblue
Contributor

rdblue commented Mar 12, 2024

@nastra, this is a great start. I think we also want to add a section in the overall API description that covers how tags are used when there is a matching capability. That is, capability tags represent optional functionality that must be declared by the service from the config route. To support a capability, all tagged routes must be implemented.
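
The rule "to support a capability, all tagged routes must be implemented" could be checked mechanically. The sketch below assumes a made-up mapping from capability tags to routes; the route strings and the helper name `missing_routes` are placeholders for illustration and are not taken from the spec.

```python
# Hypothetical mapping from capability tag to the routes carrying that tag.
# The route strings are placeholders, not taken from the actual spec.
TAGGED_ROUTES = {
    "views": {"GET /views", "POST /views", "DELETE /views/{view}"},
    "remote-signing": {"POST /sign"},
}


def missing_routes(declared, implemented):
    """Map each declared capability to the tagged routes the server lacks."""
    gaps = {}
    for capability in declared:
        missing = TAGGED_ROUTES.get(capability, set()) - set(implemented)
        if missing:
            gaps[capability] = missing
    return gaps


# A server that declares "views" but forgot one of the tagged routes.
implemented = {"GET /views", "POST /views", "POST /sign"}
gaps = missing_routes({"views", "remote-signing"}, implemented)
print(gaps)  # {'views': {'DELETE /views/{view}'}}
```

An empty result would mean every declared capability is fully backed by its tagged routes.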

@jackye1995, for context, we are exploring ways to express optional parts of the API so that we can evolve it by adding new functionality and letting callers know about it.

@nastra nastra force-pushed the rest-capabilities branch 3 times, most recently from 81dc9d2 to c9492b3 Compare March 20, 2024 07:16
@nastra
Contributor Author

nastra commented Mar 20, 2024

After some discussions, the capabilities that we'd currently like to support are:

  • views
  • vended-credentials
  • remote-signing
  • multi-table-commit

For scan-planning it's not clear yet whether we'd like to only have scan-planning or whether there should also be scan-pre-planning, so it's best to introduce this capability as part of #9695 (cc @rahil-c)

@nastra nastra requested review from danielcweeks and rdblue March 20, 2024 07:23
@rahil-c
Contributor

rahil-c commented Mar 20, 2024

> capabilities field to the response of the /config endpoint, signaling what is being supported by the server.
> For scan-planning it's not clear yet whether we'd like to only have scan-planning or whether there should also be scan-pre-planning, so it's best to introduce this capability as part of #9695 (cc

I would assume that, since the server is doing work on behalf of the client for both scan preplan and plan table, we would want to have both of these reflected in the capabilities field?
cc @rdblue @danielcweeks @jackye1995

> so it's best to introduce this capability as part of #9695 (cc @rahil-c)

@nastra, I was also wondering what you meant by this. Did you want me to pick up these changes and include them in the preplan/plan PR?

@nastra
Contributor Author

nastra commented Mar 20, 2024

> @nastra, I was also wondering what you meant by this. Did you want me to pick up these changes and include them in the preplan/plan PR?

#9940 should go in before #9695, and as part of #9695 you would have to add the respective capabilities for scan planning.

Member

@snazy snazy left a comment

I've got strong concerns about using an enum here. It requires special handling in various places, which I think complicates things for adopters of any OpenAPI spec.

@geruh
Contributor

geruh commented Mar 27, 2024

If I understand correctly, this approach provides a way to group catalog functionality and lets the server explicitly declare whether each group is supported. This is similar to what both the append and scan APIs were trying to do with the feature flag configs.
However, there was one potential issue with using the config: the flags could be set incorrectly at catalog initialization, and the client might still make a request to an unsupported endpoint if the server doesn't explicitly set the flag to false in the getConfig response.

On that note, if the server specifies no support for a feature, are we failing early on the client side?

@nastra
Contributor Author

nastra commented Mar 28, 2024

> If I understand correctly, this approach provides a way to group catalog functionality and lets the server explicitly declare whether each group is supported. This is similar to what both the append and scan APIs were trying to do with the feature flag configs. However, there was one potential issue with using the config: the flags could be set incorrectly at catalog initialization, and the client might still make a request to an unsupported endpoint if the server doesn't explicitly set the flag to false in the getConfig response.
>
> On that note, if the server specifies no support for a feature, are we failing early on the client side?

In this case we don't set feature flags to true/false. The presence of a capability in capabilities means that the server supports it. In the absence of a capability, a client would either not call the respective endpoint or throw an error.

For views there are cases where a client wouldn't call the endpoint (e.g. when renaming a table, it wouldn't also check if a view with the same name exists). In other cases the client would fail early, indicating that the server doesn't support X.
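
The two client behaviors described here (silently skipping a call versus failing early) might look roughly like the following. Everything in this sketch is hypothetical: the class, the method names, and the returned placeholder values are invented for illustration and do not correspond to actual Iceberg client APIs.

```python
class UnsupportedCapabilityError(Exception):
    """Raised when the server did not declare a capability the call needs."""


class CapabilityAwareClient:
    # Hypothetical client wrapper; names are illustrative, not Iceberg APIs.
    def __init__(self, capabilities):
        self._capabilities = frozenset(capabilities)

    def supports(self, capability):
        # Presence in the declared set is the only signal; there is no
        # true/false flag per capability.
        return capability in self._capabilities

    def load_view(self, name):
        # Fail early: the caller explicitly asked for a view operation.
        if not self.supports("views"):
            raise UnsupportedCapabilityError("server does not support views")
        return f"loaded view {name}"  # placeholder for the real request

    def rename_table(self, source, target):
        # Returns the list of checks that would be performed (placeholder).
        checks = ["table-exists"]
        # Skip quietly: only check for a colliding view name when the
        # server has declared the views capability.
        if self.supports("views"):
            checks.append("view-exists")
        return checks


client = CapabilityAwareClient(capabilities=[])
print(client.rename_table("a", "b"))  # ['table-exists']
```

With an empty capability set, `rename_table` proceeds without the view check, while `load_view` raises immediately instead of sending a request the server cannot handle.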

@nastra
Contributor Author

nastra commented Apr 2, 2024

> I've got strong concerns about using an enum here. It requires special handling in various places, which I think complicates things for adopters of any OpenAPI spec.

@snazy we use enum in the OpenAPI spec to list possible values of a string, which is also what's being documented in https://swagger.io/docs/specification/data-models/enums/. The underlying type of a capability is still a string. Do you have an alternative in mind for the issue you're seeing? Also I believe when code is generated from the OpenAPI spec, there should be an option to generate enums as literals

@snazy
Member

snazy commented Apr 2, 2024

> > I've got strong concerns about using an enum here. It requires special handling in various places, which I think complicates things for adopters of any OpenAPI spec.
>
> @snazy we use enum in the OpenAPI spec to list possible values of a string, which is also what's being documented in https://swagger.io/docs/specification/data-models/enums/. The underlying type of a capability is still a string. Do you have an alternative in mind for the issue you're seeing? Also I believe when code is generated from the OpenAPI spec, there should be an option to generate enums as literals

The issue at hand is that an enum cannot (by default) handle unknown values, which is generally fine. But here we have to expect endpoints to return values that are not known to a client. Clients (not only generated ones) that do not handle this rather special case will fail to parse the result, which is a problem. I'm still in favor of just using a string.
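
To make the failure mode concrete, here is a small sketch using Python's standard `enum` module as a stand-in for whatever a generated client would use. The `Capability` enum, the two parsing helpers, and the value `some-future-capability` are all invented for this example.

```python
from enum import Enum


class Capability(Enum):
    # The values a hypothetical older client knows about.
    VIEWS = "views"
    REMOTE_SIGNING = "remote-signing"


def parse_strict(values):
    """Enum-style parsing: raises ValueError on any unknown capability."""
    return {Capability(value) for value in values}


def parse_lenient(values):
    """String-style parsing: keeps known capabilities, ignores the rest."""
    known = {member.value for member in Capability}
    return {value for value in values if value in known}


# A newer server returns a value this client has never heard of.
response = ["views", "some-future-capability"]  # made-up future value

print(parse_lenient(response))  # {'views'}
try:
    parse_strict(response)
except ValueError as error:
    print(f"strict parsing failed: {error}")
```

The lenient version keeps working when the server adds a capability, which is the forward-compatibility property being asked for.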

@nastra nastra force-pushed the rest-capabilities branch 8 times, most recently from 3f70b55 to cfa7918 Compare June 20, 2024 09:16
@github-actions github-actions bot added the AWS label Jun 20, 2024
Member

@snazy snazy left a comment

Using strings instead of an enum looks much better.
However, I've got concerns about the backwards compatibility stated in this specification.
It would also be good to have something on the website explaining how capabilities are meant to work, including backwards compatibility. At the very least, we should start with written documentation about Iceberg-REST (there is none on the website at all yet).

@nastra nastra force-pushed the rest-capabilities branch from cfa7918 to 15c769a Compare June 20, 2024 10:42
@findepi
Member

findepi commented Jun 20, 2024

cc @nineinchnick

@nastra nastra force-pushed the rest-capabilities branch 2 times, most recently from 73cc54f to 898ae59 Compare June 27, 2024 14:37
@nastra nastra force-pushed the rest-capabilities branch 2 times, most recently from decf5ca to eaef2c6 Compare July 3, 2024 08:34
tags:
- Catalog API
- tables
- views
Contributor

@jackye1995 jackye1995 Jul 3, 2024

hmm, this seems like a problem to me. I want to avoid overlapping tags if possible, because when this happens, you need to know "if I support tables, not views, do I support this API?". The answer here might be a simple yes, but for complex capabilities, it might become a case by case analysis and things become really complex. What do you think?

Contributor

To be clearer, I am okay with, for example, LoadTable having both a tables tag and a credentials-vending tag, since the second tag is dedicated to a specific feature in the API. But here, tables and views require exactly the same thing, which is that this API exists. That is the problem.

Contributor Author

Having tables / views here simply means that if a catalog supports tables or views, then it needs to support this endpoint, as you won't be able to do anything with tables if namespaces aren't supported. See also #9940 (comment).

Also, I probably wouldn't tag any endpoint with credentials-vending, since that capability doesn't require implementing any particular endpoint.
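
One way to read the overlapping-tag rule from this reply is "an endpoint is required as soon as any one of its tags is supported". The sketch below encodes that reading. The endpoint names and the `ENDPOINT_TAGS` table are assumptions made for the example; only the tag names tables and views come from the diff under review.

```python
# Hypothetical endpoint-to-tags table; only the tag names "tables" and
# "views" come from the diff under review.
ENDPOINT_TAGS = {
    "listNamespaces": {"tables", "views"},
    "loadTable": {"tables"},
    "loadView": {"views"},
}


def required_endpoints(supported):
    """Endpoints a server must implement: any one shared tag is enough."""
    supported = set(supported)
    return {
        endpoint
        for endpoint, tags in ENDPOINT_TAGS.items()
        if tags & supported
    }


tables_only = required_endpoints({"tables"})
print(sorted(tables_only))  # ['listNamespaces', 'loadTable']
```

Under this reading, a catalog that supports tables but not views still has to implement the namespace endpoint, and the question "do I support this API?" reduces to a set intersection.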

@nastra nastra force-pushed the rest-capabilities branch 3 times, most recently from 4c76fa6 to abb7ed5 Compare July 9, 2024 13:08
- views
- remote-signing
- vended-credentials
- multi-table-commit
Contributor

We discussed in the last sync meeting that the capability is useful when it impacts the client's fallback behavior. In that case, it seems to me that only views is a necessary capability. What other capabilities here would impact fallback behavior?

@nastra nastra force-pushed the rest-capabilities branch 2 times, most recently from a03195d to 582aab0 Compare July 12, 2024 15:16
@github-actions github-actions bot removed the AWS label Jul 12, 2024
@nastra nastra force-pushed the rest-capabilities branch from 582aab0 to c77066b Compare July 17, 2024 08:43
@nastra
Contributor Author

nastra commented Sep 11, 2024

Closing this in favor of #10928.

@nastra nastra closed this Sep 11, 2024