Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

utilize new address type in pgv for better errors #7823

Closed
wants to merge 2 commits into from

Conversation

Mythra
Copy link
Member

@Mythra Mythra commented Aug 3, 2019

Description:

Utilize the "address" type string validation that has been added to newer versions of protoc-gen-validate. This can help fix error messages that are poor (stoi), and give you an actual: "this is an invalid address" type flag.

Risk Level: Medium.

This should cover all the same address types. I set the risk level to medium though because the chance is always there. PGV doesn't check unix domain sockets, but my understanding is those use a seperate protobuf: Pipe type.

It'd be nice if people were able to run their configs through it too though.

Testing:

I've prepared some example configs (using the original config we found the issue with, which uses some deprecated fields, as well as a config that isn't deprecated): HERE.

If you run these configs through --mode validate. You will see before this patchset they unhelpfully give an error message of: stoi (this is because it tries to split: https://google.com, and turn: //google.com into a port, which fails).

Now with this patch included you will instead see a more reasonable message of:

[2019-08-03 17:24:08.665][3767][critical][main] [source/server/config_validation/server.cc:58] error initializing configuration '/tmp/new-example.yaml': Proto constraint validation failed (BootstrapValidationError.StaticResources: ["embedded message failed validation"] | caused by StaticResourcesValidationError.Clusters[i]: ["embedded message failed validation"] | caused by ClusterValidationError.LoadAssignment: ["embedded message failed validation"] | caused by ClusterLoadAssignmentValidationError.Endpoints[i]: ["embedded message failed validation"] | caused by LocalityLbEndpointsValidationError.LbEndpoints[i]: ["embedded message failed validation"] | caused by LbEndpointValidationError.Endpoint: ["embedded message failed validation"] | caused by EndpointValidationError.Address: ["embedded message failed validation"] | caused by AddressValidationError.SocketAddress: ["embedded message failed validation"] | caused by SocketAddressValidationError.Address: ["value must be an ip address, or a hostname."]): static_resources {

The end containing the actual error message (value must be an ip address, or a hostname).

Ideally you have some configs you can run through to ensure no false positives if tests don't cover those.

Docs Changes: None should be needed since it's the same field, just a better error message.
Release Notes: N/A

this utilizes the new "address" validation type for better error
messages when validating configuration of the "address" type.

Signed-off-by: Cynthia Coan <ccoan@instructure.com>
@Mythra
Copy link
Member Author

Mythra commented Aug 4, 2019

Looks like some tests, define invalid addresses. Like: web_service, and: main_website.com. I'm going to change these to valid hostnames, and run all the tests to ensure it's valid.

Signed-off-by: Cynthia Coan <ccoan@instructure.com>
@repokitteh-read-only
Copy link

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/.

🐱

Caused by: #7823 was synchronize by SecurityInsanity.

see: more, trace.

@Mythra
Copy link
Member Author

Mythra commented Aug 4, 2019

Okay, I've fixed those addresses to be values.

However, it looks like the lds tests, utilize socket address with: tcp:// in the beginning. I assume this is something we want to continue supporting. Since that migration would be bigger.

I can imagine a couple fixes:

  1. Split out the uses of Socket Addresses with protocols vs those that don't accept it. (seems unideal, requires api changes).
  2. Before calling validate pre-parse protocol, and split it out (seems unideal, cause we modify messages).
  3. Create a new validation type: address_with_protocol inside of PGV, merge that to PGV, upgrade it in envoy, and use that. Fix the stoi error message by itself, allowing it to properly take a protocol. (seems ideal, but requires a decent amount of overhead, which could be fine).
  4. Fix the stoi error message by itself, and keep the validation message the same. (easiest, but it seems weird we have the same type meaning two different things).
  5. Make the LDS service not send protocols as part of the address. (afaik this is the only place that uses it, but would require a breaking change).

@htuch htuch self-assigned this Aug 5, 2019
Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question from @envoyproxy/api-shepherds perspective: how does this interact with address resolvers?

Today, we can do resolver plugins that can take the address and reinterpret, and they may not be subject to the same restrictions as hostnames or IP addresses. I'd like this kind of detailed error validation for regular addresses but not sure if we can break the existing users of resolvers who don't play by these rules.

@Mythra
Copy link
Member Author

Mythra commented Aug 5, 2019

@htuch I believe custom resolvers fit in along the same line of the LDS Tests (which I tried to detail above, which is hopefully coherent). To be more specific, it would break if a custom resolver is using a non socket address in the socket address field. This can be something simple like: tcp:// in the field value, or something more exotic.

If that is also something being supported, then I feel like option #4 detailed above might be best. (i.e. not rolling forward with this change, and instead fixing the case of the bad error message in this single caller we discovered).

If it's something we don't wanna support that's probably a much larger discussion than what was supposed to be a small PR, and it's probably best to close this, and open up a more formal issue (perhaps fixing the one caller in the meantime).

@htuch
Copy link
Member

htuch commented Aug 6, 2019

@securityinsanity I'm not sure if we need to support tcp:// in socket address going forward, but as an API principle, we're trying to avoid making PGV annotations more restrictive (it's fine to make them less restrictive). This is part of #6271.

The main issue here is that I know we have internally discussed having resolvers from a similar form, e.g. foobar://some/internal/format. We don't use this today, so it won't break Google. I would definitely want to be sure that this would not impact other users.

FWIW, in UDPA (next gen xDS), we will have an opportunity to revisit fields and we might want to support IP addresses as first class, distinct from resolvable, in which case we can use these stricter annotations. Ideally we would have cross-field annotations in the form of CEL or something in PGV, which would allow this even today (i.e. "if the resolver is default, apply the host name / IP address restriction").

@Mythra
Copy link
Member Author

Mythra commented Aug 6, 2019

@htuch Thanks for that context. This all makes sense, taking this into account. I'm going to close this PR. Then open up another one to fix this particular stoi error (since I think we can all agree that isn't a helpful error message).

@Mythra Mythra closed this Aug 6, 2019
@Mythra Mythra deleted the address branch August 6, 2019 02:34
@htuch
Copy link
Member

htuch commented Aug 6, 2019

Yep, thanks. Apologies for not spotting this earlier.

@Mythra
Copy link
Member Author

Mythra commented Aug 6, 2019

No worries at all.

Mythra pushed a commit to Mythra/envoy that referenced this pull request Aug 6, 2019
revival of envoyproxy#7823

this fixes a particular exception case where an end-user configures
a socket address to point at: `https://google.com` instead of just
the hostname: `google.com`.

instead of throwing an (stoi) error. because it can't turn:
`//google.com` into a port. we now through a slightly more sane:
`malformed url`. it isn't as good as a protoc-gen-validate message
since it's not clear why: `tcp://` gets prepended, but I think it's
digestable enough to know, you shouldn't be putting https there.
not to mention it is a step forward from: `stoi`.

Signed-off-by: Cynthia Coan <ccoan@instructure.com>
lizan pushed a commit that referenced this pull request Aug 7, 2019
revival of #7823

this fixes a particular exception case where an end-user configures
a socket address to point at: `https://google.com` instead of just
the hostname: `google.com`.

instead of throwing an (stoi) error. because it can't turn:
`//google.com` into a port. we now through a slightly more sane:
`malformed url`. it isn't as good as a protoc-gen-validate message
since it's not clear why: `tcp://` gets prepended, but I think it's
digestable enough to know, you shouldn't be putting https there.
not to mention it is a step forward from: `stoi`.

Risk Level: Low

Testing:

Not only are there unit tests, but you can run through the following configurations and
see the new error messages.

Docs Changes: None
Release Notes: None

Signed-off-by: Cynthia Coan <ccoan@instructure.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants