Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

openapi: mark object as actually base64 #2091

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

woodruffw
Copy link
Member

@woodruffw woodruffw commented Apr 18, 2024

This is a psuedo-fix: format: byte is not actually valid for type: object per Swagger 2.0, but other fields in the OpenAPI schema for Rekor use this convention (e.g. LogEntry.attestation) and our downstream tooling (in sigstore-apis) understands it and knows how to work around it.

This corrects body from type: object to type: string, and removes format: byte where misleading (i.e. from type: object in LogEntry.attestation).

Edit: downstream hotfix: trailofbits/sigstore-apis#5

Signed-off-by: William Woodruff <william@trailofbits.com>
@woodruffw woodruffw requested a review from a team as a code owner April 18, 2024 23:25
Copy link

codecov bot commented Apr 18, 2024

Codecov Report

Attention: Patch coverage is 50.00000% with 5 lines in your changes are missing coverage. Please review.

Project coverage is 48.92%. Comparing base (488eb97) to head (2377d73).
Report is 89 commits behind head on main.

Files Patch % Lines
pkg/api/entries.go 0.00% 4 Missing ⚠️
cmd/rekor-cli/app/get.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #2091       +/-   ##
===========================================
- Coverage   66.46%   48.92%   -17.54%     
===========================================
  Files          92       80       -12     
  Lines        9258     6635     -2623     
===========================================
- Hits         6153     3246     -2907     
- Misses       2359     2985      +626     
+ Partials      746      404      -342     
Flag Coverage Δ
e2etests ?
unittests 48.92% <50.00%> (+1.23%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: William Woodruff <william@trailofbits.com>
@haydentherapper
Copy link
Contributor

Would type:string and format:base64 be a breaking change? I assume this PR isn't breaking because objects are effectively format:byte?

@woodruffw
Copy link
Member Author

Would type:string and format:base64 be a breaking change? I assume this PR isn't breaking because objects are effectively format:byte?

type: object always means JSON object with no encapsulation in OpenAPI, so technically this isn't breaking since the schema definition here was incorrect in the first place (format: byte is just invalid here, but gets ignored).

I think changing it to type: string is technically a breaking change in the sense that it fixes a thing that's currently broken, but I suspect it wouldn't actually break anything (since other languages using the current spec seemingly haven't hit this condition before). So I could change this PR to the more correct type: string if you prefer 🙂

@haydentherapper
Copy link
Contributor

If I understand correctly, there's two things we should do:

  • Removing format from LogEntry.attestation, because a format for the object is meaningless (and its underlying data field correctly has format:byte)
  • Change body to a base64 string since it's not an object.

Is that accurate? Did you see any other non-compliances?

The other thing I don't know is why we set additionalProperties:true in attestation. Removing this would be "breaking", but a) Rekor doesn't set extra properties in the body field, and b) I don't think any client would handle this correctly anyways.

I'm good with making these correctness changes, though would you be able to double check if any sigstore clients are leveraging the openapi specs? I don't believe so.

Signed-off-by: William Woodruff <william@trailofbits.com>
@woodruffw
Copy link
Member Author

  • Removing format from LogEntry.attestation, because a format for the object is meaningless (and its underlying data field correctly has format:byte)
  • Change body to a base64 string since it's not an object.

Is that accurate? Did you see any other non-compliances?

Those two sound correct to me! I've noticed one other thing, which I've fixed with 5b4eba2 -- the hashedrekord schema was using the same $id as the rekord schema despite being slightly distinct, causing mis-generation in OpenAPI clients that assume unique IDs.

I'll make the body/format changes above as well.

Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
@bobcallaway
Copy link
Member

I vaguely remember a bug where if additionalProperties wasn't specified, validation failed somewhere because we don't name all potential keys/properties of the JSON object within the schema. https://json-schema.org/understanding-json-schema/reference/object#additionalproperties says it is not strictly required, but not sure it hurts to be explicit?

@woodruffw
Copy link
Member Author

I vaguely remember a bug where if additionalProperties wasn't specified, validation failed somewhere because we don't name all potential keys/properties of the JSON object within the schema

I think I've seen that bug, but I think these changes won't trip it -- the additionalProperties @haydentherapper is talking about is on body, which is now type: string. So the Go codegen is now rejecting it outright 🙂

@woodruffw
Copy link
Member Author

Oh whoops, I misunderstood -- yeah, additionalProperties might be required in attestation but is no longer required in body, since we've corrected the type of body to type: string.

Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
@woodruffw
Copy link
Member Author

This ended up being a bit more invasive 😅 -- changing body to type: string means that LogEntryAnon.Body is no longer an interface in the generated Go, causing a cascade of internal changes.

Signed-off-by: William Woodruff <william@trailofbits.com>
Signed-off-by: William Woodruff <william@trailofbits.com>
// Required: true
Body interface{} `json:"body"`
Body *string `json:"body"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, this would be a breaking change in terms of the library as we are leveraging the generated models in Cosign (and I assume other Go implementations) - https://github.com/sigstore/cosign/blob/02b1b2677b8fe008ccfea4bbd6f5cc96961a49a8/pkg/cosign/bundle/rekor.go#L42, https://github.com/sigstore/cosign/blob/02b1b2677b8fe008ccfea4bbd6f5cc96961a49a8/pkg/cosign/tlog.go#L256

@bobcallaway Would you be OK with this as a 1.x release, even if this is not necessarily following semver? The other option is waiting to a 2.x release and cleaning this up as part of larger API changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather defer to 2.x. Sorry for the extra work @woodruffw :/

Copy link
Contributor

@haydentherapper haydentherapper Apr 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's file a bug to track.

@woodruffw, should we instead continue to omit a format string and simply document that body should be a base64-encoded string? Or would the format hint (even if not spec compliant) be helpful?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation is sufficient, I think -- at the very least, the Rust bindings can hack around this (this will be another hack on the stack, but the stack is already quite large 🙂).

That being said: in the context of semver, I've argued before that bugfixes aren't semver violations, and I think there's a good argument that this is purely a bugfix (even through it results in a change to a public API): the current OpenAPI definition for LogEntry is incorrect, and results in broken bindings for any downstream consumer that doesn't have the equivalent of Go's interface type. Moreover, Go only incidentally works here because API users have to know to punch through the interface to get the actual string beneath it. So: if Rekor's semver contract includes bugfixes on the OpenAPI definition itself, I think there's a case for this not being a breaking change.

(I can understand if you'd still prefer to defer to 2.x, but I figured I'd advance this argument in case it changes your views here 🙂)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a bug to specify an object without any properties? https://stackoverflow.com/questions/67514763/is-it-valid-to-define-an-object-without-any-sub-properties-in-openapi-2-0-swagg suggests this is acceptable (I couldn't find an answer in the spec).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why we originally chose object over string, I'd guess we hadn't standardized on how the API would return body. I get what you're saying (and I don't really mind breaking library users, but that's my bias), but I think the impact to existing Go clients will be significant enough that punting on this til v2 would be better.

Hm, oneOf is an interesting idea. Does that result in any library changes for the generated code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, oneOf is an interesting idea. Does that result in any library changes for the generated code?

I'll give that a try and find out!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, turns out this is a non-starter: the current OpenAPI spec is written in Swagger 2.0, which doesn't support oneOf. So I can't even switch to oneOf, unless we pushed the LogEntry definition into its own standalone JSON schema file.

Given that, I think I've exhausted all creative alternatives here and this just needs to be fixed in v2 😅. I'll close this and file an associated issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to still specify the format:base64 string as a hint?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing so doesn't change the hacks we need to do on the Rust side, so I'm fine it omitting it 🙂 -- omitting it arguably also makes the OpenAPI spec slightly more compliant (which isn't a problem at the moment, but could cause problems for an even stricter generator).

Signed-off-by: William Woodruff <william@trailofbits.com>
@woodruffw
Copy link
Member Author

I've broken out #2092 for the non-type changes, since those are still needed for our downstream bindings 🙂

@woodruffw
Copy link
Member Author

Opened #2097 to track the underlying issue here. I'm going to re-draft this for the time being, since it isn't immediately actionable 🙂

@woodruffw woodruffw marked this pull request as draft April 23, 2024 17:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants