-
Notifications
You must be signed in to change notification settings - Fork 14
feat: support multiple channels in SessionPool #1063
Conversation
* SessionPool now utilizes all channels that are given to it, creating new sessions on the least loaded channel. * The default number of channels created has been changed to 4. * Refactor Session creation and add to pool logic. * Add some tests that use multipe channels. * Rework some tests that depended on the order of returned Sessions. Part of #307
Codecov Report
@@ Coverage Diff @@
## master #1063 +/- ##
=========================================
+ Coverage 91.59% 92.09% +0.5%
=========================================
Files 162 165 +3
Lines 11785 12117 +332
=========================================
+ Hits 10794 11159 +365
+ Misses 991 958 -33
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really cool, thanks!
Reviewed 5 of 5 files at r1.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @devbww, @devjgm, @mr-salty, and @scotthart)
google/cloud/spanner/connection_options.h, line 86 at r1 (raw file):
* parallel. * * The default value is 4.
FYI. And no change needed in this PR: I am not sure we want to hard-code this value, maybe the application developers want to change that. Also I think we want this to scale with the number of cores, we do that in Bigtable, and it kind of makes sense because more cores == more potential calls in parallel. In any case, please open a bug (probably for post-BETA) to fine tune this value.
google/cloud/spanner/internal/session_pool.h, line 157 at r1 (raw file):
std::vector<ChannelInfo> channels_; ChannelInfo* least_loaded_channel_;
It seems that this is used for session creation and when the session does not have an associated stub already. Consider randomly picking a channel / stub in those cases, which is less likely to fall into thundering herd problems.
google/cloud/spanner/internal/session_pool.cc, line 74 at r1 (raw file):
int num_channels = static_cast<int>(channels_.size()); int sessions_per_channel = options_.min_sessions / num_channels; // if it doesn't divide evenly, add the remainder to the first channel.
That does not work well when you have a lot of channels (say num_channels=20) and you have K*num_channels - 1
sessions. Maybe decrement extra_sessions
until it gets to zero and add one session to each of the channels if extra_sessions != 0
?
google/cloud/spanner/internal/session_pool.cc, line 144 at r1 (raw file):
// Sessions that were created for partitioned Reads/Queries do not have // their own stub, so return one to use. return least_loaded_channel_->stub;
Doesn't this implicitly change the load on that stub? It is now likely to be used, right?
google/cloud/spanner/internal/session_pool.cc, line 199 at r1 (raw file):
channel.session_count += sessions_created; total_sessions_ += sessions_created; // TODO(#307) instead of adding all of these to the end, we could insert
Consider just reshuffling after this insert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 5 files at r1.
Reviewable status: 3 of 5 files reviewed, 12 unresolved discussions (waiting on @coryan, @devbww, @devjgm, and @scotthart)
google/cloud/spanner/connection_options.h, line 86 at r1 (raw file):
Previously, coryan (Carlos O'Ryan) wrote…
FYI. And no change needed in this PR: I am not sure we want to hard-code this value, maybe the application developers want to change that. Also I think we want this to scale with the number of cores, we do that in Bigtable, and it kind of makes sense because more cores == more potential calls in parallel. In any case, please open a bug (probably for post-BETA) to fine tune this value.
I filed #1077 to cover all the options.
google/cloud/spanner/internal/session_pool.h, line 157 at r1 (raw file):
Previously, coryan (Carlos O'Ryan) wrote…
It seems that this is used for session creation and when the session does not have an associated stub already. Consider randomly picking a channel / stub in those cases, which is less likely to fall into thundering herd problems.
I split this into two renamed variables and round-robin in the latter case as noted elsewhere
google/cloud/spanner/internal/session_pool.h, line 157 at r2 (raw file):
Previously, devbww (Bradley White) wrote…
Perhaps something like
fewest_sessions_channel_
might be clearer?
WDYT now?
google/cloud/spanner/internal/session_pool.cc, line 74 at r1 (raw file):
Previously, coryan (Carlos O'Ryan) wrote…
That does not work well when you have a lot of channels (say num_channels=20) and you have
K*num_channels - 1
sessions. Maybe decrementextra_sessions
until it gets to zero and add one session to each of the channels ifextra_sessions != 0
?
Done, thanks. I was afraid of making it too complicated but your suggestion is pretty simple.
google/cloud/spanner/internal/session_pool.cc, line 144 at r1 (raw file):
Previously, coryan (Carlos O'Ryan) wrote…
Doesn't this implicitly change the load on that stub? It is now likely to be used, right?
yes, but note if we end up here we're also using a dissociated session, so the pool gets no indication when the caller is done using it.
A while back @devbww and I did discuss whether it was worthwhile to try to track this, i.e. instead of using the default deleter for these Sesssions like we do now, make them use a custom deleter than updates the pool statistics but doesn't return the session to the pool. We decided it wasn't worth the complication, but noted that we could do it in the future if needed since it's not user-visible either way. Since we're only approximately distributing Sessions across channels, it doesn't seem worthwhile, although I think I'm convinced it's better to just round-robin through the sessions here instead of using least_loaded_, so we'll be guaranteed to spread out the load.
FYI I realized I was missing a lock here and added one.
google/cloud/spanner/internal/session_pool.cc, line 199 at r1 (raw file):
Previously, coryan (Carlos O'Ryan) wrote…
Consider just reshuffling after this insert.
Done
google/cloud/spanner/internal/session_pool.cc, line 40 at r2 (raw file):
Previously, devbww (Bradley White) wrote…
If we're going to say ...
options.max_sessions = (std::max)(options.max_sessions, 1);
instead of ...
if (options.max_sessions < 1) { options.max_sessions = 1; }
we might as well do the same thing here:
options.max_sessions = (std::max)(options.max_sessions, options.min_sessions);
Done.
google/cloud/spanner/internal/session_pool.cc, line 74 at r2 (raw file):
Previously, devbww (Bradley White) wrote…
Why choose to do that? Would not things be nicer with 1 extra session on each of the first
extra_sessions
channels, rather thanextra_sessions
sessions on the first channel? I guess the way we add new sessions in blobs to a single channel makes this moot?
I changed it to make it more even, but you are correct that it can and will get out of balance as we add sessions - the goal is just to try to approximately balance the Sessions across channels.
google/cloud/spanner/internal/session_pool.cc, line 144 at r2 (raw file):
Previously, devbww (Bradley White) wrote…
Why use the channel with the fewest sessions?
(see above)
google/cloud/spanner/internal/session_pool.cc, line 161 at r2 (raw file):
Previously, devbww (Bradley White) wrote…
Can we refactor somehow so that the caller drop/retakes the lock? That is, so that all the "inside the lock" stuff is in
Allocate()
?
The AddSessionsToPool
stuff needs to happen under the lock. I could move that outside, although I'd just moved it inside because after we create the sessions there's really nothing to be done with them but add them to the pool. I'm also wondering if I should inline AddSessionsToPool
because it's unlikely to have any other caller than CreateSessions
, and in fact we could avoid an extra trip through a vector if I did that - so I decided to do it...
google/cloud/spanner/internal/session_pool.cc, line 167 at r2 (raw file):
Previously, devbww (Bradley White) wrote…
Does any of this need to happen inside the lock?
no
google/cloud/spanner/internal/session_pool.cc, line 193 at r2 (raw file):
Previously, devbww (Bradley White) wrote…
Nit: s/adds/Adds/
Done
google/cloud/spanner/internal/session_pool.cc, line 194 at r2 (raw file):
Previously, devbww (Bradley White) wrote…
AddSessionsToChannel()
perhaps?
It's really adding the Sessions to the the pool... the channel is just for statistics (as the comment says).
google/cloud/spanner/internal/session_pool.cc, line 204 at r2 (raw file):
Previously, devbww (Bradley White) wrote…
It looks like this is unnecessary unless
least_loaded_channel_ == &channel
.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r3.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @devbww, @devjgm, and @scotthart)
google/cloud/spanner/internal/session_pool.cc, line 207 at r3 (raw file):
void SessionPool::UpdateNextChannelForCreateSessions() { next_channel_for_create_sessions_ = &channels_[0];
Presumably it is impossible for channels_
to be empty? Maybe a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @devbww, @devjgm, and @scotthart)
google/cloud/spanner/internal/session_pool.h, line 160 at r3 (raw file):
Previously, devbww (Bradley White) wrote…
Opt: Consider that both these fields identify an element of
channels_
, but we choose different types for each, and neither isstd::vector<ChannelInfo>::iterator
.
Done.
google/cloud/spanner/internal/session_pool.cc, line 59 at r3 (raw file):
Previously, devbww (Bradley White) wrote…
Q: Do we know
!stubs.empty()
?
I just legislated it by comment.
I had changed the public methods to not crash, but then I realized I also had to do some special-case stuff in the constructor, so I just gave up and went with the comment... can we assert
here?
google/cloud/spanner/internal/session_pool.cc, line 207 at r3 (raw file):
Previously, coryan (Carlos O'Ryan) wrote…
Presumably it is impossible for
channels_
to be empty? Maybe a comment?
@devbww commented on this above as well (in the constructor), let's comment further there if necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r4.
Reviewable status: all files reviewed, 11 unresolved discussions (waiting on @devbww, @devjgm, @mr-salty, and @scotthart)
google/cloud/spanner/internal/session_pool.h, line 157 at r2 (raw file):
Previously, mr-salty (Todd Derr) wrote…
WDYT now?
FYI: Iterators become invalidated by a lot of things, while indices are not... I searched for any place that might have invalidated the iterators and it all looks correct now. It is easier to break in the future though.
google/cloud/spanner/internal/session_pool.cc, line 58 at r4 (raw file):
backoff_policy_prototype_(std::move(backoff_policy)), options_(SanitizeOptions(options)) { channels_.reserve(stubs.size());
I would do something like:
if (stubs.empty()) google::cloud::internal::ThrowInvalidArgument("SessionPool requires a non-empty set of stubs");
That way we get an earlier error if somebody bypasses your comment.
google/cloud/spanner/internal/session_pool.cc, line 146 at r4 (raw file):
std::unique_lock<std::mutex> lk(mu_); stub = next_dissociated_stub_channel_->stub; if (++next_dissociated_stub_channel_ == channels_.end()) {
FYI: this code is assuming that channels_
is not empty too, just in a more subtle way, I recall that calling ++
on an .end()
iterator is not a great thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 5 files reviewed, 11 unresolved discussions (waiting on @coryan, @devbww, @devjgm, and @scotthart)
google/cloud/spanner/internal/session_pool.h, line 157 at r2 (raw file):
Previously, coryan (Carlos O'Ryan) wrote…
FYI: Iterators become invalidated by a lot of things, while indices are not... I searched for any place that might have invalidated the iterators and it all looks correct now. It is easier to break in the future though.
I added a TODO about switching to absl::FixedArray.
We discussed some alternatives that would make this "safe" now (vector<ChannelInfo*> const or a plain new[] allocated array) but decided it was preferable to keep it as-is to make the migration easier. I also moved the actual creation out to
google/cloud/spanner/internal/session_pool.cc, line 58 at r4 (raw file):
Previously, coryan (Carlos O'Ryan) wrote…
I would do something like:
if (stubs.empty()) google::cloud::internal::ThrowInvalidArgument("SessionPool requires a non-empty set of stubs");
That way we get an earlier error if somebody bypasses your comment.
Done.
google/cloud/spanner/internal/session_pool.cc, line 146 at r4 (raw file):
Previously, coryan (Carlos O'Ryan) wrote…
FYI: this code is assuming that
channels_
is not empty too, just in a more subtle way, I recall that calling++
on an.end()
iterator is not a great thing.
now that we throw the exception we can safely assume it's non-empty, unless you think it's worth special-casing everywhere we use or increment one of these iterators. I added a comment in the .h file about this and iterator validity (and also a TODO to use absl::FixedArray which is not resizeable).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r5.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @devbww, @devjgm, and @scotthart)
…ud-cpp-spanner#1063) * feat: support multiple channels in SessionPool * SessionPool now utilizes all channels that are given to it, creating new sessions on the least loaded channel. * The default number of channels created has been changed to 4. * Refactor Session creation and add to pool logic. * Add some tests that use multipe channels. * Rework some tests that depended on the order of returned Sessions. Part of googleapis/google-cloud-cpp-spanner#307 * address review comments * address review comments * address review comments * very important backtick change
Part of #307
This change is![Reviewable](https://camo.githubusercontent.com/1541c4039185914e83657d3683ec25920c672c6c5c7ab4240ee7bff601adec0b/68747470733a2f2f72657669657761626c652e696f2f7265766965775f627574746f6e2e737667)