-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Add the name resolver API #2285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
2867b3f to
3c30ab8
Compare
3c30ab8 to
07eb017
Compare
|
|
||
| /// The delay for the next retry, without the random jitter. Store as f64 | ||
| /// to avoid rounding errors. | ||
| next_delay_secs: Mutex<f64>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this need a mutex? It seems like we should generally only be accessing these serially.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're correct access is serial. The mutex is added to allow interior mutability. @easwars and I have a discussion about this on the group. The subchannel will also use backoffs and it will pass the backoff as an immutable value to keep the subchannel API clean. To get rid of the mutex we would need to use mutable acceptors in the trait definition and have the subchannel wrap the backoff in a mutex to get a mutable ref.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to extend the mut receiver out as far as we can. The mutex can go into the subchannel if that's what it needs to do. The backoff can express that it should not be called concurrently by making this a mutable receiver instead of an immutable one. The behavior of concurrent accesses would be undefined anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the mutex from the backoff and used mutable receivers.
| /// returns an empty endpoint list but a valid service config may set | ||
| /// to this to something like "no DNS entries found for <name>". | ||
| pub resolution_note: Option<String>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there some requirement for using this? Like should it only be set if endpoints is empty? I.e. should we delete this and use an Err() from endpoints instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding, the resolution_note can be set when endpoints are present. The resolution note may say something like "the endpoints are stale" and the LB policy will add this message to the failure message if it fails to connect to the endpoints. Here is the code in CPP that sets a resolution node with a non-empty list of addresses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay that makes sense.
So then the valid states we want to represent are:
endpoints |
service_config |
resolution_note |
|---|---|---|
Err |
Err |
None |
Err |
Ok |
None |
Ok |
Err |
None |
Ok |
Ok |
None |
Ok |
Ok |
Some |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read the doc string in CPP and it states the condition for the note to be present:
/// for inclusion in RPC failure status
/// messages in cases where neither \a addresses nor \a service_config
/// has a non-OK status.
Removing the negative logic, it mentions "for inclusion in RPC failure status when both addresses and service config are Ok status". The above table is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through the docstring in C++ and it mentions the conditions under which the resolution note is set:
in cases where neither \a addresses nor \a service_config has a non-OK status.
Removing the negative logic, we get "in cases where both addresses and service_config have an OK status." So the table above is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you suggest creating an enum that only represents valid states?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can stick with this for now since it matches C++, but we might want to change things before making a public resolver API. I always prefer when illegal states are not possible to represent in the first place, but maybe that isn't feasible here.
| pub resolution_note: Option<String>, | ||
| } | ||
|
|
||
| impl Default for ResolverUpdate { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be derived? Or no because of Result? If the latter, is there some way to partially derive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can't be derived directly due to the Result. We can't implement the Default trait for Result<T, E> because Result is from a foreign crate. We can wrap Result<Option<ServiceConfig>, String> in our own type, but that would add an extra layer for accessing the ServiceConfig/Entpoints etc.
I couldn't find a way to avoid listing the fields that already implement Default without adding a crate with macros. The closest think I found is using Default::default() as the value for such fields (e.g: attributes: Arc::default()). Using ..Default::default() at the end of the struct in the implementation of default() causes infinite recursion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think this plus my other comment about using result at a higher level kinda tie together that maybe this api is going against the grain?
dfawley
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but it would be great to get a review from @LucioFranco too
|
|
||
| impl BackoffConfig { | ||
| fn validate(&self) -> Result<(), &'static str> { | ||
| // Valid that params are in valid ranges. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: s/Valid/Validate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rephrased the sentence to avoid repeating the term "Valid".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a discussion about this with @dfawley a while back when we were talking about something else. https://doc.rust-lang.org/book/ch07-05-separating-modules-into-different-files.html#alternate-file-paths
Should we use the new style file paths instead? It would be easier to do that earlier than later (when we have more code). Doesn't have to happen as part of this PR, but if we have consensus, then we can fix in a follow-up PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this in a meeting, and Lucio said we should keep using <module_name>/mod.rs instead of <module_name.rs> plus <module_name>/<sub_modules>. (@LucioFranco, please correct me if I am misrepresenting or misremembering.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, there is no real preference in the community over each path and historically most projects would use mod.rs and so they have stuck with it. Tonic is in the same boat, honestly, I prefer mod.rs it feels more natural rust wise. Most important thing is that we stay consistent.
|
|
||
| /// Returns either host:port or host depending on the existence of the port | ||
| /// in the authority. | ||
| pub fn authority_host_port(&self) -> String { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any specific reason why this method alone is returning a String while others are returning a &str?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's because of the line that constructs a string below:
format!("{}:{}", host, port)Since the string is created in the function, we can't return a ref to it because the ref will outlive the lifetime of the string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could consider doing something like what the http crate does https://docs.rs/http/latest/src/http/uri/authority.rs.html#261 where its internal rep is a string and then to return just the port it does some string parsing. Though this might be tough with the Url crate...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see this being too much of a problem since this is called once while the channel's resolver is created. If needed, we could store the host_port as a struct field during construction to avoid an allocation on every call.
| target: &super::Target, | ||
| options: super::ResolverOptions, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any recommendation about when to use super::Struct as opposed to having a use statement for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been using qualified names when the symbol is used once or twice and falling back to a use statement when the symbol is used more frequently. @LucioFranco wanted to get your thoughts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I always use the import style and ONLY use the qualified if for example there are two structs with the same name and/or it just makes sense with the combination of the structs module name + its name in the code. In this case, super means nothing I would just import it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to use the import style throughout.
|
Changes LGTM. I don't have required privileges to mark a comment as resolved, looks like. So, I just added a "thumbs-up" to the ones that I thought were sufficiently addressed, and left the other ones as-is. |
grpc/src/rt/mod.rs
Outdated
| } | ||
|
|
||
| #[derive(Default)] | ||
| pub struct ResolverOptions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the plan to expose structs with public fields?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we need to support non-tokio runtimes for a Google internal user, but this will happen after the initial preview release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right the question was more so about how we make fields public. By default in rust we should not make fields pub as it makes maintaining a non breaking api easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I make these fields pub(crate) for now and defer the API design decision until we decide to expose them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is def better, but maybe consider pub(super) to constrain any usage of the fields to things inside that module and anything you need outside you can get via function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to pub(super). I also went through all other pub struct and pub traits to reduce the scope if the API isn't going to be public.
| pub fn add_builder(&self, builder: Box<dyn ResolverBuilder>) { | ||
| let scheme = builder.scheme(); | ||
| if scheme.chars().any(|c| c.is_ascii_uppercase()) { | ||
| panic!("Scheme must not contain uppercase characters: {}", scheme); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be an error eventually?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolver builders should be added when the application starts up, before any RPCs are made. So we decided that panicking here is acceptable. In the get function below, we lowercase the scheme because get is be called when RPCs are made, so we want to avoid failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can also add a variant that is try_ that returns a result and then this add_ variant will just panic on that and that allows users to choose the behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a fallible variation of the method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, there is no real preference in the community over each path and historically most projects would use mod.rs and so they have stuck with it. Tonic is in the same boat, honestly, I prefer mod.rs it feels more natural rust wise. Most important thing is that we stay consistent.
arjan-bal
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late reply, I missed the review notification because it went to my non-work email.
| pub fn add_builder(&self, builder: Box<dyn ResolverBuilder>) { | ||
| let scheme = builder.scheme(); | ||
| if scheme.chars().any(|c| c.is_ascii_uppercase()) { | ||
| panic!("Scheme must not contain uppercase characters: {}", scheme); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolver builders should be added when the application starts up, before any RPCs are made. So we decided that panicking here is acceptable. In the get function below, we lowercase the scheme because get is be called when RPCs are made, so we want to avoid failures.
grpc/src/rt/mod.rs
Outdated
| } | ||
|
|
||
| #[derive(Default)] | ||
| pub struct ResolverOptions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we need to support non-tokio runtimes for a Google internal user, but this will happen after the initial preview release.
| use super::Target; | ||
|
|
||
| #[test] | ||
| pub fn parse_target() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied most of these tests from gRPC Go. I haven't looked into fuzz testing and I believe only gRPC c-core does fuzz testing. I can take a look in my free time.
|
|
||
| /// The address itself is passed to the transport in order to create a | ||
| /// connection to it. | ||
| pub address: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Introduced a wrapper type similar to http's ByteStr and used it for the address field.
No description provided.