-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEATURE] Cosmos | Connection endpoint rediscovery (connection state listener) #14697
Changes from 56 commits
84a37ab
d218925
11cbbf3
05c7e05
8dfa3db
e6e71f5
e082d81
069c822
3ead1cc
669ba8b
735f572
6db330d
48855b5
2367f50
fb9cac0
1dd708e
203507a
368b10f
a0b44a7
d810ba8
64a12af
29f2c73
e4df526
916f99e
ae1b11b
019e90b
ba4d4ed
2f0b886
ba13859
3293e11
a83cd62
1902669
f466158
6b72281
4a0360e
ea4ca55
a04d40d
18ab502
d2d892c
201b643
fa9eaf6
7c830cf
d4c8561
6c57447
b08361e
1ca1c3b
233f5b9
a631804
2079b5b
64d2111
bd54f11
94834b9
130de85
4ff0680
45ffb90
c280476
9f57a9c
77f3e88
cc2b936
ff4d980
ac593ec
4502126
90d98b9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -288,25 +288,37 @@ public static class StatusCodes { | |
} | ||
|
||
public static class SubStatusCodes { | ||
|
||
// Unknown SubStatus Code | ||
public static final int UNKNOWN = 0; | ||
|
||
// 400: Bad Request substatus | ||
// 400: Bad Request sub-status | ||
public static final int PARTITION_KEY_MISMATCH = 1001; | ||
public static final int CROSS_PARTITION_QUERY_NOT_SERVABLE = 1004; | ||
|
||
// 410: StatusCodeType_Gone: substatus | ||
// 410: StatusCodeType_Gone: sub-status | ||
public static final int NAME_CACHE_IS_STALE = 1000; | ||
public static final int PARTITION_KEY_RANGE_GONE = 1002; | ||
public static final int COMPLETING_SPLIT = 1007; | ||
public static final int COMPLETING_PARTITION_MIGRATION = 1008; | ||
|
||
// 403: Forbidden substatus | ||
// 403: Forbidden sub-status | ||
public static final int FORBIDDEN_WRITEFORBIDDEN = 3; | ||
public static final int DATABASE_ACCOUNT_NOTFOUND = 1008; | ||
|
||
// 404: LSN in session token is higher | ||
public static final int READ_SESSION_NOT_AVAILABLE = 1002; | ||
|
||
// 410: Gone sub-status | ||
// Sub-status code indicating that a GoneException was instantiated by the client; not as a result of a response | ||
// message provide by a Cosmos instance. | ||
public static final int CLIENT_GENERATED = 10_000; | ||
|
||
// 410: Gone sub-status | ||
// Sub-status code zero in a response from a service endpoint indicates that a replica is being discontinued or | ||
// reconfigured. When endpoint rediscovery is enabled the RntbdTransportClient converts sub-status code zero to | ||
// this sub-status code value. | ||
public static final int DISCONTINUING_SERVICE = CLIENT_GENERATED + 2; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How doe these sub-status code impact GoneRetryPolicy? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Confirmed by way of the logs on test33 and also by code inspection. This does not effect retry policy. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. when rntbd throws GoneException, Gone will reach GoneAndRetryWithRetryPolicy, and it will decide how it should get retried, based on status/code and substatus code. Please check the behaviour of GoneAndRetryWithRetryPolicy if for the given statuscode/substatus code it does the desired behabour or not. |
||
} | ||
|
||
public static class HeaderValues { | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -156,13 +156,10 @@ public static final class Event { | |
@JsonIgnore | ||
private final Duration duration; | ||
|
||
@JsonSerialize(using = ToStringSerializer.class) | ||
private final long durationInMicroSec; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why are we removing this? I found this very helpful. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is reported, just not stored as a value. Are you saying that it is useful for debugging? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. as if so could you provide a sample on new rntbd request timeline diagnostics format, info? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The value will still be logged: |
||
|
||
@JsonProperty("eventName") | ||
private final String name; | ||
|
||
@JsonSerialize(using = ToStringSerializer.class) | ||
@JsonProperty("startTimeUTC") | ||
private final Instant startTime; | ||
|
||
public Event(final String name, final Instant from, final Instant to) { | ||
|
@@ -172,11 +169,12 @@ public Event(final String name, final Instant from, final Instant to) { | |
this.name = name; | ||
this.startTime = from; | ||
|
||
this.duration = from == null ? null : to == null ? Duration.ZERO : Duration.between(from, to); | ||
if(this.duration != null) { | ||
this.durationInMicroSec = duration.toNanos()/1000L; | ||
if (from == null) { | ||
this.duration = null; | ||
} else if (to == null) { | ||
this.duration = Duration.ZERO; | ||
David-Noble-at-work marked this conversation as resolved.
Show resolved
Hide resolved
|
||
} else { | ||
this.durationInMicroSec = 0; | ||
this.duration = Duration.between(from, to); | ||
} | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expand on what it is from CX perspective.
QQ; Do we want this feature flag as prominent in ConnectionConfig?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a good question whether we want this flag as prominent as it is. This PR now enables connection endpoint rediscovery by default. I would prefer not to advertise the feature until we've got more experience with it. Enabling the feature by default is counter to that. One might argue that enabling the feature by default with a highly visible option to turn the feature off is preferred because it's easier to do that in code than by using a less obvious mechanism (such as azure.cosmos.directTcp.defaultOptions).
@kushagraThapar, @moderakh what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, any feature may have bug on its first release. I prefer if we change the default to enabled later when we are more confident on the behaviour.