Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What is the matching behavior with regards to empty URI path segments? #1024

Open
david-perez opened this issue Dec 22, 2021 · 7 comments
Open
Labels
client This issue involves the specification for client software. documentation This is a problem with documentation. server This issue involves the specification for server software.

Comments

@david-perez
Copy link
Contributor

david-perez commented Dec 22, 2021

Currently,

This behavior matches what we anecdotally see when visiting most websites, e.g. these URLs behave the same as if their empty path segments were removed:

While this behavior might be perceived as intuitive, it makes it impossible for someone to call a service and assign an empty string to a field that is bound to an @httpLabel. Consider the URI pattern /foo/{label}/bar and the request with URI path /foo//bar. Here the server should assign "" to label, but if it strips away empty path segments prior to routing the request, the request would be outright rejected.

I think the Smithy spec should explicitly call out the behavior in the specification, since it currently does not prescribe what to do in this scenario (and I would therefore interpret it as the default, which would be to not strip away empty path segments), yet at least three implementers have independently decided to strip away empty path segments.

Some other data points to have in mind:

  • The URI spec says that empty path segments are allowed.
  • There are websites in which empty path segments do have meaning: https://en.wikipedia.org/wiki///
  • Do HTTP clients strip away empty path segments? I don't know of any that do.
@JordonPhillips JordonPhillips added the documentation This is a problem with documentation. label Dec 22, 2021
@JordonPhillips
Copy link
Contributor

JordonPhillips commented Dec 22, 2021

Allowing labels to be empty can be problematic if the segment is at the end of a URI. S3 is the prime example of this because the URI for ListBuckets is / and the URI for ListObjects is /{Bucket}. So if you were to give an empty string for Bucket, you would get the wrong operation entirely. At best this would cause deserialization to blow up, but more likely you'll just get a phantom empty response as deserializers will drop unexpected data. I recall fielding this issue more than once when I worked on the Python SDK, and am aware of it cropping up several times for other SDKs as well.

There are websites in which empty path segments do have meaning: https://en.wikipedia.org/wiki///

Philosophically that's not an empty segment, but a /wiki/{id} URI with the data //. (This is just splitting hairs though.) Since we require label values to be url-encoded we would encode that as /wiki/%2F%2F. Theoretically that would eliminate the ambiguity, though it really depends on when you do the decode. If we don't already have protocol tests that look like that, then we do need to add them.

All in all I think allowing empty path segments is a footgun, and disallowing them is a good thing.

All that said, stripping empty path segments is probably a very bad idea. Turning /foo/{bar}/baz into /foo/baz is the exact same kind of issue as giving S3 an empty bucket. Stripping by default also makes it impossible for us to change to allow empty segments later on, or to make it a protocol implementation detail

@david-perez
Copy link
Contributor Author

I agree that allowing empty path segments may inadvertently cause bad API design, but the Smithy spec does not disallow them. Perhaps Smithy CLI could emit a warning if it detects a string bound with @httpLabel without a @length(min: 1) constraint?

Even if the Smithy spec disallowed them for clients, the spec should clarify what servers should do when encountering empty path segments. If they don't strip them away before extracting labels, then servers will have to assign empty strings to the input fields bound to those segments, right? Is there another alternative?

@JordonPhillips
Copy link
Contributor

Yeah they would have to equate to empty strings.

@gosar
Copy link
Contributor

gosar commented Dec 23, 2021

To clarify what we'd like to do, a case to think through is: if a client sends a request /foo/bar it would match /foo/bar only. i.e., not try to match /foo/bar/{label} with empty value for label. But if client sends /foo/bar/, it would try to match /foo/bar/{label} with empty value for label, and not match /foo/bar, right?

@david-perez
Copy link
Contributor Author

@gosar Yes, I think that should be the behavior, trailing slashes have meaning. It is also consistent with how websites serve HTML documents e.g. /foo/bar/ will probably resolve to /foo/bar/index.html whereas /foo/bar corresponds to the document /foo/bar.html.

david-perez added a commit to smithy-lang/smithy-rs that referenced this issue Jan 4, 2022
In #996 [0], we realized that we were ignoring empty URI path segments
when doing label extraction. It turns out we are also ignoring them when
doing routing, as the TypeScript sSDK currently does [1], from which the
initial implementation was copied.

However, a discussion with the Smithy team in smithy-lang/smithy#1024 [2]
revealed that we _must not_ ignore empty URI path segments when routing
or doing label extraction, since empty strings (`""`) should be assigned
to the labels in those segments.

This commit fixes the behavior so that we don't ignore empty path
segments _when doing routing_. #996 will take care of fixing the behavior
when doing label extraction.

[0]: #996 (comment)
[1]: https://github.com/awslabs/smithy-typescript/blob/d263078b81485a6a2013d243639c0c680343ff47/smithy-typescript-ssdk-libs/server-common/src/httpbinding/mux.ts#L78
[2]: smithy-lang/smithy#1024
david-perez added a commit to smithy-lang/smithy-rs that referenced this issue Jan 12, 2022
…ng (#1029)

In #996 [0], we realized that we were ignoring empty URI path segments
when doing label extraction. It turns out we are also ignoring them when
doing routing, as the TypeScript sSDK currently does [1], from which the
initial implementation was copied.

However, a discussion with the Smithy team in smithy-lang/smithy#1024 [2]
revealed that we _must not_ ignore empty URI path segments when routing
or doing label extraction, since empty strings (`""`) should be assigned
to the labels in those segments.

This commit fixes the behavior so that we don't ignore empty path
segments _when doing routing_. #996 will take care of fixing the behavior
when doing label extraction.

[0]: #996 (comment)
[1]: https://github.com/awslabs/smithy-typescript/blob/d263078b81485a6a2013d243639c0c680343ff47/smithy-typescript-ssdk-libs/server-common/src/httpbinding/mux.ts#L78
[2]: smithy-lang/smithy#1024
aws-sdk-rust-ci pushed a commit to awslabs/aws-sdk-rust that referenced this issue Jan 12, 2022
… when routing (#1029)

In #996 [0], we realized that we were ignoring empty URI path segments
when doing label extraction. It turns out we are also ignoring them when
doing routing, as the TypeScript sSDK currently does [1], from which the
initial implementation was copied.

However, a discussion with the Smithy team in smithy-lang/smithy#1024 [2]
revealed that we _must not_ ignore empty URI path segments when routing
or doing label extraction, since empty strings (`""`) should be assigned
to the labels in those segments.

This commit fixes the behavior so that we don't ignore empty path
segments _when doing routing_. #996 will take care of fixing the behavior
when doing label extraction.

[0]: smithy-lang/smithy-rs#996 (comment)
[1]: https://github.com/awslabs/smithy-typescript/blob/d263078b81485a6a2013d243639c0c680343ff47/smithy-typescript-ssdk-libs/server-common/src/httpbinding/mux.ts#L78
[2]: smithy-lang/smithy#1024
@david-perez
Copy link
Contributor Author

I've just stumbled upon a routing issue caused by allowing empty path segments. Consider the S3-like model with two operations:

  1. ListBuckets with URI pattern /.
  2. ListObjects with URI pattern /{bucket}.

And consider the request /. Where should we route it to? ListObjects has more weight than ListBuckets, so we should match against its URI pattern first. And it matches, binding "" to the bucket label.

However, in this scenario it's clear that it should be routed to ListBuckets.

@JordonPhillips
Copy link
Contributor

Yeah, this is the S3 issue I mentioned before and is one of the primary motivating factors of why I don't think empty path segments should be allowed

Velfi pushed a commit to awslabs/aws-sdk-rust that referenced this issue Jan 19, 2022
… when routing (#1029)

In #996 [0], we realized that we were ignoring empty URI path segments
when doing label extraction. It turns out we are also ignoring them when
doing routing, as the TypeScript sSDK currently does [1], from which the
initial implementation was copied.

However, a discussion with the Smithy team in smithy-lang/smithy#1024 [2]
revealed that we _must not_ ignore empty URI path segments when routing
or doing label extraction, since empty strings (`""`) should be assigned
to the labels in those segments.

This commit fixes the behavior so that we don't ignore empty path
segments _when doing routing_. #996 will take care of fixing the behavior
when doing label extraction.

[0]: smithy-lang/smithy-rs#996 (comment)
[1]: https://github.com/awslabs/smithy-typescript/blob/d263078b81485a6a2013d243639c0c680343ff47/smithy-typescript-ssdk-libs/server-common/src/httpbinding/mux.ts#L78
[2]: smithy-lang/smithy#1024
@github-actions github-actions bot added the closing-soon This issue will automatically close in 7 days unless further comments are made. label Aug 19, 2023
@jvschneid jvschneid removed the closing-soon This issue will automatically close in 7 days unless further comments are made. label Aug 21, 2023
@smithy-lang smithy-lang deleted a comment from github-actions bot Aug 21, 2023
@kstich kstich added client This issue involves the specification for client software. server This issue involves the specification for server software. labels Nov 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
client This issue involves the specification for client software. documentation This is a problem with documentation. server This issue involves the specification for server software.
Projects
None yet
Development

No branches or pull requests

5 participants