Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Server] Unreachable routes in operation_registry when only httpLabels are different #1009

Open
guymguym opened this issue Dec 28, 2021 · 10 comments · Fixed by #1036
Open

[Server] Unreachable routes in operation_registry when only httpLabels are different #1009

guymguym opened this issue Dec 28, 2021 · 10 comments · Fixed by #1036
Labels
server Rust server SDK

Comments

@guymguym
Copy link
Contributor

guymguym commented Dec 28, 2021

When generating codegen-server for S3, operation_registry has list_buckets and list_objects operations and the only difference in their routing is that list_objects specifies an http label to capture the bucket name from the uri path, whereas list_buckets specifies no labels. However the generated order (alphabetical) puts the list_buckets route first, which hides the second route altogether. It seems like any subsequent route is unreachable because the route of list_buckets matches any uri path, while it should match just for root path /.

This might require a fix in request_spec.rs where it builds the path regex:
https://github.com/awslabs/smithy-rs/blob/cbd61ab1fff94775e112f78181653fe040d08e24/rust-runtime/aws-smithy-http-server/src/routing/request_spec.rs#L78-L91

I tried to restrict it to match the entire path by changing the last line to the following code, which seems to help in some cases but I'm not sure this is correct - Regex::new(&format!("^/*{}$", re)).unwrap() (I had to use /* instead of /+ because it didn't work otherwise but I'm quite puzzled as to why).

This issue might be related to existing PR #996 Implement httpLabel with nom - so worth noting this issue when working on it.

For reference here are the router specs for list_buckets and list_objects which is only different in having a path label -

        "com.amazonaws.s3#ListBuckets": {
            "type": "operation",
            "output": {
                "target": "com.amazonaws.s3#ListBucketsOutput"
            },
            "traits": {
                "smithy.api#documentation": "<p>Returns a list of all buckets owned by the authenticated sender of the request.</p>",
                "smithy.api#http": {
                    "method": "GET",
                    "uri": "/",
                    "code": 200
                }
            }
        },

        "com.amazonaws.s3#ListObjects": {
            "type": "operation",
            "input": {
                "target": "com.amazonaws.s3#ListObjectsRequest"
            },
            "output": {
                "target": "com.amazonaws.s3#ListObjectsOutput"
            },
            "errors": [
                {
                    "target": "com.amazonaws.s3#NoSuchBucket"
                }
            ],
            "traits": {
                "smithy.api#documentation": "...",
                "smithy.api#http": {
                    "method": "GET",
                    "uri": "/{Bucket}",
                    "code": 200
                }
            }
        },

CC @crisidev

@guymguym
Copy link
Contributor Author

Ok I think I understand the problem - the fold() call in request_spec.rs works correctly for anything that has at least one segment, but the ListBuckets operation has no segments at all which makes the result be an empty string instead of a /.

So I tested this fix and it seems to work ok for my cases:

-        Regex::new(&format!("{}$", re)).unwrap()
+        let re = if re.is_empty() { String::from(sep) } else { re };
+        Regex::new(&format!("^{}$", re)).unwrap()

@crisidev
Copy link
Contributor

I think you caught another bug here. I am not fond of our implementation using regexes and we did it like this because it was simple enough in the beginning.

I think we can do two things here:

  1. wait for Implement httpLabel with nom #996 to land and ensure we also cover this use-case inside the request spec implementation. We would probably want to move away from regex also in the RequestSpec and use a real parser like we are doing for httpLabel..
  2. make a PR basically checking if there is at least one segment as you did above to unblock this problem

Regarding 2) it is unlikely I will find someone able to approve any code until January, so I am not sure it will be quicker.

@guymguym
Copy link
Contributor Author

Hey @crisidev Thanks for the reply, and no worries about timelines I am unblocking myself with hacky fixes on a fork -
s3d-rs@b5ccdde and just wanted to document these issues so that we can discuss it upstream. Have a nice vacation!

@guymguym
Copy link
Contributor Author

Just adding a note that my suggested fix is not really enough for other cases, because S3 has more "semi overlapping" routes such as CreateBucket (PUT /{Bucket}) vs PutBucketAcl (PUT /{Bucket}?acl) and since the routes match order is just alphabetical we always match the CreateBucket route first and never get to the PutBucketAcl one. There are some more cases like those (PutObject/PutObjectAcl/CopyObject and more).

@crisidev
Copy link
Contributor

Yep, i was pretty sure this would happen.. it's going to be a lot of fun to catch and fix all the corner cases found in this model. My understanding is that this is a quite old model, hence it will require a bit of customization to adapt against the strictness of the smithy specs..

david-perez added a commit that referenced this issue Jan 4, 2022
…iguation

This commit implements basic URI pattern conflict disambiguation when
routing incoming requests to service operations.

It does this by introducing a measure of how "important" a `RequestSpec`
is. The more specific a `RequestSpec` is, the higher it ranks in
importance. Specificity is measured by the number of segments plus the
number of query string literals in its URI pattern, so
`/{Bucket}/{Key}?query` is more specific than `/{Bucket}/{Key}`, which
is more specific than `/{Bucket}`, which is more specific than `/`.

Why do we need this?

Note that:

1. the Smithy spec does not define how servers should route incoming requests in the
   case of pattern conflicts; and
2. the Smithy spec even outright rejects conflicting patterns that can be easily
    disambiguated e.g. `/{a}` and `/{label}/b` cannot coexist.

We can't to anything about (2) since the Smithy CLI will refuse to build a model with those
kind of conflicts. However, the Smithy CLI does allow _other_ conflicting patterns to
coexist, e.g. `/` and `/{label}`. We therefore have to take a stance on (1), since if we
route arbitrarily we render basic usage impossible (see issue #1009).

So this ranking of routes implements some basic pattern conflict disambiguation with some
common sense. It's also the same behavior that the TypeScript sSDK is implementing [0].

This commit also:

* makes the `future` module private,
* hides documentation for the `request_spec` module; and
* makes some fields from some structs in the `request_spec` module
  private and adds constructors.

Closes #1009.

[1]: https://github.com/awslabs/smithy-typescript/blob/d263078b81485a6a2013d243639c0c680343ff47/smithy-typescript-ssdk-libs/server-common/src/httpbinding/mux.ts#L59.
david-perez added a commit that referenced this issue Jan 4, 2022
…iguation

This commit implements basic URI pattern conflict disambiguation when
routing incoming requests to service operations.

It does this by introducing a measure of how "important" a `RequestSpec`
is. The more specific a `RequestSpec` is, the higher it ranks in
importance. Specificity is measured by the number of segments plus the
number of query string literals in its URI pattern, so
`/{Bucket}/{Key}?query` is more specific than `/{Bucket}/{Key}`, which
is more specific than `/{Bucket}`, which is more specific than `/`.

Why do we need this?

Note that:

1. the Smithy spec does not define how servers should route incoming
   requests in the case of pattern conflicts; and
2. the Smithy spec even outright rejects conflicting patterns that can
   be easily disambiguated e.g. `/{a}` and `/{label}/b` cannot coexist.

We can't to anything about (2) since the Smithy CLI will refuse to build
a model with those kind of conflicts. However, the Smithy CLI does allow
_other_ conflicting patterns to coexist, e.g. `/` and `/{label}`. We
therefore have to take a stance on (1), since if we route arbitrarily we
render basic usage impossible (see issue #1009).

So this ranking of routes implements some basic pattern conflict
disambiguation with some common sense. It's also the same behavior that
the TypeScript sSDK is implementing [0].

This commit also:

* makes the `future` module private,
* hides documentation for the `request_spec` module; and
* makes some fields from some structs in the `request_spec` module
  private and adds constructors.

Closes #1009.

[1]: https://github.com/awslabs/smithy-typescript/blob/d263078b81485a6a2013d243639c0c680343ff47/smithy-typescript-ssdk-libs/server-common/src/httpbinding/mux.ts#L59.
@david-perez
Copy link
Contributor

@guymguym Thanks for reporting this. Can you try the patch #1036?

david-perez added a commit that referenced this issue Jan 4, 2022
…iguation

This commit implements basic URI pattern conflict disambiguation when
routing incoming requests to service operations.

It does this by introducing a measure of how "important" a `RequestSpec`
is. The more specific a `RequestSpec` is, the higher it ranks in
importance. Specificity is measured by the number of segments plus the
number of query string literals in its URI pattern, so
`/{Bucket}/{Key}?query` is more specific than `/{Bucket}/{Key}`, which
is more specific than `/{Bucket}`, which is more specific than `/`.

Why do we need this?

Note that:

1. the Smithy spec does not define how servers should route incoming
   requests in the case of pattern conflicts; and
2. the Smithy spec even outright rejects conflicting patterns that can
   be easily disambiguated e.g. `/{a}` and `/{label}/b` cannot coexist.

We can't to anything about (2) since the Smithy CLI will refuse to build
a model with those kind of conflicts. However, the Smithy CLI does allow
_other_ conflicting patterns to coexist, e.g. `/` and `/{label}`. We
therefore have to take a stance on (1), since if we route arbitrarily we
render basic usage impossible (see issue #1009).

So this ranking of routes implements some basic pattern conflict
disambiguation with some common sense. It's also the same behavior that
the TypeScript sSDK is implementing [0].

This commit also:

* makes the `future` module private,
* hides documentation for the `request_spec` module; and
* makes some fields from some structs in the `request_spec` module
  private and adds constructors.

Closes #1009.

[1]: https://github.com/awslabs/smithy-typescript/blob/d263078b81485a6a2013d243639c0c680343ff47/smithy-typescript-ssdk-libs/server-common/src/httpbinding/mux.ts#L59.
david-perez added a commit that referenced this issue Jan 13, 2022
…iguation (#1036)

This commit implements basic URI pattern conflict disambiguation when
routing incoming requests to service operations.

It does this by introducing a measure of how "important" a `RequestSpec`
is. The more specific a `RequestSpec` is, the higher it ranks in
importance. Specificity is measured by the number of segments plus the
number of query string literals in its URI pattern, so
`/{Bucket}/{Key}?query` is more specific than `/{Bucket}/{Key}`, which
is more specific than `/{Bucket}`, which is more specific than `/`.

Why do we need this?

Note that:

1. the Smithy spec does not define how servers should route incoming
   requests in the case of pattern conflicts; and
2. the Smithy spec even outright rejects conflicting patterns that can
   be easily disambiguated e.g. `/{a}` and `/{label}/b` cannot coexist.

We can't to anything about (2) since the Smithy CLI will refuse to build
a model with those kind of conflicts. However, the Smithy CLI does allow
_other_ conflicting patterns to coexist, e.g. `/` and `/{label}`. We
therefore have to take a stance on (1), since if we route arbitrarily we
render basic usage impossible (see issue #1009).

So this ranking of routes implements some basic pattern conflict
disambiguation with some common sense. It's also the same behavior that
the TypeScript sSDK is implementing [0].

This commit also:

* makes the `future` module private,
* hides documentation for the `request_spec` module; and
* makes some fields from some structs in the `request_spec` module
  private and adds constructors.

Closes #1009.

[1]: https://github.com/awslabs/smithy-typescript/blob/d263078b81485a6a2013d243639c0c680343ff47/smithy-typescript-ssdk-libs/server-common/src/httpbinding/mux.ts#L59.
@david-perez
Copy link
Contributor

Reopening, this is not fixed yet. I had to revert the test that was explicitly added for the route conflict in this issue in #1070.

It seems like allowing both empty URI segments and the simple pattern conflict disambiguation using weighted segments that I implemented are at odds with each other.

We're tracking this issue in the Smithy repo smithy-lang/smithy#1024.

Ideas for how to have both empty URI segments and simple pattern conflict disambiguation are welcome. We should ensure a stance is taken on this issue in the new pattern conflict disambiguation proposal smithy-lang/smithy#1029 (comment).

@guymguym
Copy link
Contributor Author

Hi @david-perez
I just saw you fixed some of this issue in #1484 which is great!

I had a similar attempt to fix routing issues for S3 in #1097, and I still carry those as a patch on my fork to support S3 routing, so at least one of those patches is now in the upstream!

I wonder what do you think about the fix to set routes rank to have a higher weight to paths over queries - this is needed because if paths and queries have the same weight then it is not possible to match the route /Bucket/Key because any route that has the same path with a query like /Bucket/Key?Acl will "hide" it. It seems that the comment on the rank function described the same consideration. Is there any discrepancy between what I needed for routing ranks and the general case?

Thanks!

@david-perez
Copy link
Contributor

I'm not sure I understand. Given:

  1. spec A: /bucket/key; and
  2. spec B: /bucket/key?acl,

Then, if a request has URI /bucket/key?acl, you want it routed to A? Currently it should be routed to B, because even though both spec A and spec B match the incoming request's URI, spec B is more specific, as the comment you linked to explains.

@guymguym
Copy link
Contributor Author

Ok, you're right off course, sorry for the mixup. I went back to review my patch, and I see that in my patches I tried to hack out the need for ?x-id=OperationName queries for #1012 (because it is not compatible with other aws sdk/cli's), and when I did that it messed up the routing sort and introduced new unreachable operations. I believe that patching the rank was my attempt to fix some of those cases. Anyhow, this current issue is still open due to empty labels, so please just ignore my last comment. Thanks @david-perez !!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
server Rust server SDK
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants