Client respect Node-Selection-Strategy response header (#688)
Clients respect `Node-Selection-Strategy` response header
ferozco authored May 1, 2020
1 parent a87cb3a commit 6651f29
Showing 11 changed files with 405 additions and 43 deletions.
18 changes: 17 additions & 1 deletion README.md
@@ -113,11 +113,27 @@ _This API is influenced by gRPC's [Java library](https://github.com/grpc/grpc-ja

## Behaviour

### Concurrency Limits
### Concurrency limits
Each host has an [AIMD](https://en.wikipedia.org/wiki/Additive_increase/multiplicative_decrease) concurrency limit. This protects
servers by stopping requests getting out the door on the client-side. Permits are multiplicatively decreased after
receiving any 5xx, 429 or 308 response. Otherwise, they are additively increased.
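
The additive-increase/multiplicative-decrease rule above can be sketched as follows. This is a minimal illustration, not Dialogue's actual limiter: the class name `AimdLimitSketch`, the 0.9 backoff ratio, the `+1` increment, and the bounds are all assumptions chosen for readability.

```java
/** Illustrative AIMD concurrency limit. Names and constants are hypothetical, not Dialogue's. */
final class AimdLimitSketch {
    private static final double BACKOFF_RATIO = 0.9; // assumed multiplicative decrease factor
    private static final double MIN_LIMIT = 1;       // assumed floor so one request can always proceed
    private static final double MAX_LIMIT = 1_000;   // assumed ceiling

    private double limit;
    private int inFlight;

    AimdLimitSketch(double initialLimit) {
        this.limit = initialLimit;
    }

    /** Tries to take a permit; returns false when the host is already at its limit. */
    synchronized boolean tryAcquire() {
        if (inFlight >= (int) limit) {
            return false;
        }
        inFlight++;
        return true;
    }

    /** Returns a permit and adjusts the limit according to the response status. */
    synchronized void release(int status) {
        inFlight--;
        boolean overloaded = status == 308 || status == 429 || (status >= 500 && status <= 599);
        limit = overloaded
                ? Math.max(MIN_LIMIT, limit * BACKOFF_RATIO) // multiplicative decrease
                : Math.min(MAX_LIMIT, limit + 1);            // additive increase
    }

    synchronized double limit() {
        return limit;
    }

    public static void main(String[] args) {
        AimdLimitSketch host = new AimdLimitSketch(10);
        host.tryAcquire();
        host.release(200); // success: limit grows by one
        host.tryAcquire();
        host.release(503); // server error: limit shrinks by the backoff ratio
        System.out.println(host.limit());
    }
}
```

When `tryAcquire` returns false the request never leaves the client, which is how the limit protects an overloaded server.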

### Node selection strategies
When configured with multiple URIs, Dialogue has several strategies for choosing which upstream to route requests to.
The default strategy is `PIN_UNTIL_ERROR`, although users can choose alternatives such as `ROUND_ROBIN` when building a `ClientConfiguration`
object. The appropriate strategy usually depends on the _upstream_ server's behaviour, e.g. whether its
performance relies heavily on warm caches, or whether successive requests must land on the same node to complete
a transaction. So that a strategy can be changed without code changes in every client, servers can recommend a
`NodeSelectionStrategy` (see below).

### Server-recommended node selection strategies
Servers can inform clients of their recommended strategies by including the
`Node-Selection-Strategy` response header. Values are separated by commas and are ordered by preference. See [available strategies](dialogue-core/src/main/java/com/palantir/dialogue/core/DialogueNodeSelectionStrategy.java).
```
Node-Selection-Strategy: BALANCED,PIN_UNTIL_ERROR
```
When the header is present, it takes precedence over user-selected strategies. Servers are free to omit this value.
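
As a rough sketch of how a client might interpret the header, the example below splits the value on commas, maps unrecognised names to `UNKNOWN`, and picks the first recognised entry. The class `HeaderPreferenceSketch` and its `resolve` helper are hypothetical and use only the JDK; the fallback to the user-selected strategy is a simplification, since in this commit a response without the header simply leaves the currently active strategy unchanged.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.stream.Collectors;

/** Illustrative header handling; class and method names are hypothetical. */
final class HeaderPreferenceSketch {
    enum Strategy { PIN_UNTIL_ERROR, PIN_UNTIL_ERROR_WITHOUT_RESHUFFLE, BALANCED, UNKNOWN }

    /** Splits a comma-separated header value, preserving the server's preference order. */
    static List<Strategy> parse(String header) {
        return Arrays.stream(header.split(","))
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                .map(HeaderPreferenceSketch::safeValueOf)
                .collect(Collectors.toList());
    }

    /** Maps names this client does not recognise to UNKNOWN instead of failing. */
    static Strategy safeValueOf(String name) {
        try {
            return Strategy.valueOf(name.toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return Strategy.UNKNOWN;
        }
    }

    /** First recognised server preference wins; otherwise the user's choice stands (assumed fallback). */
    static Strategy resolve(Optional<String> header, Strategy userSelected) {
        return header.map(HeaderPreferenceSketch::parse)
                .flatMap(preferences -> preferences.stream()
                        .filter(strategy -> strategy != Strategy.UNKNOWN)
                        .findFirst())
                .orElse(userSelected);
    }

    public static void main(String[] args) {
        System.out.println(resolve(Optional.of("BALANCED,PIN_UNTIL_ERROR"), Strategy.PIN_UNTIL_ERROR));
        System.out.println(resolve(Optional.of("SOME_FUTURE_STRATEGY, pin_until_error"), Strategy.BALANCED));
        System.out.println(resolve(Optional.empty(), Strategy.PIN_UNTIL_ERROR));
    }
}
```

Tolerating unknown names lets servers advertise newer strategies first without breaking older clients, which skip ahead to the first entry they understand.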

### NodeSelectionStrategy.ROUND_ROBIN
Used to balance requests across many servers better than the
default PIN_UNTIL_ERROR. The actual algorithm has evolved from naive Round Robin, then to Random Selection and now
6 changes: 6 additions & 0 deletions changelog/@unreleased/pr-688.v2.yml
@@ -0,0 +1,6 @@
type: feature
feature:
description: |
Clients respect the optional `Node-Selection-Strategy` response header, which takes precedence over user-selected strategies.
links:
- https://github.com/palantir/dialogue/pull/688
Original file line number Diff line number Diff line change
@@ -124,6 +124,7 @@ private void updateUrisInner(Collection<String> uris, boolean firstTime) {
.addAll(limitedChannelByUri.keySet())
.addAll(newUris)
.build();

newUris.forEach(uri -> {
Config configWithUris = withUris(cf, allUris); // necessary for attributing metrics to the right hostIndex
LimitedChannel singleUriChannel = createPerUriChannel(configWithUris, uri);
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
/*
* (c) Copyright 2020 Palantir Technologies Inc. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package com.palantir.dialogue.core;

import com.google.common.base.Splitter;
import com.google.common.collect.ImmutableList;
import com.palantir.conjure.java.client.config.NodeSelectionStrategy;
import com.palantir.logsafe.SafeArg;
import com.palantir.logsafe.exceptions.SafeIllegalStateException;
import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
* Supported node selection strategies which can either be user provided or received over the wire from servers.
* Separate from {@link NodeSelectionStrategy} to allow us to more easily iterate on strategies and support unknown
* strategies coming in over the wire.
*/
enum DialogueNodeSelectionStrategy {
PIN_UNTIL_ERROR,
PIN_UNTIL_ERROR_WITHOUT_RESHUFFLE,
BALANCED,
UNKNOWN;

private static final Logger log = LoggerFactory.getLogger(DialogueNodeSelectionStrategy.class);
private static final Splitter SPLITTER = Splitter.on(",").trimResults().omitEmptyStrings();

static List<DialogueNodeSelectionStrategy> fromHeader(String header) {
return SPLITTER.splitToStream(header)
.map(DialogueNodeSelectionStrategy::safeValueOf)
.collect(ImmutableList.toImmutableList());
}

private static DialogueNodeSelectionStrategy safeValueOf(String value) {
String normalizedValue = value.toUpperCase(java.util.Locale.ROOT); // locale-independent so e.g. a Turkish default locale cannot break matching
if (PIN_UNTIL_ERROR.name().equals(normalizedValue)) {
return PIN_UNTIL_ERROR;
} else if (PIN_UNTIL_ERROR_WITHOUT_RESHUFFLE.name().equals(normalizedValue)) {
return PIN_UNTIL_ERROR_WITHOUT_RESHUFFLE;
} else if (BALANCED.name().equals(normalizedValue)) {
return BALANCED;
}

log.info("Received unknown selection strategy", SafeArg.of("strategy", value));
return UNKNOWN;
}

static DialogueNodeSelectionStrategy of(NodeSelectionStrategy strategy) {
switch (strategy) {
case PIN_UNTIL_ERROR:
return PIN_UNTIL_ERROR;
case PIN_UNTIL_ERROR_WITHOUT_RESHUFFLE:
return PIN_UNTIL_ERROR_WITHOUT_RESHUFFLE;
case ROUND_ROBIN:
return BALANCED;
}
throw new SafeIllegalStateException("Unknown node selection strategy", SafeArg.of("strategy", strategy));
}
}
Original file line number Diff line number Diff line change
@@ -17,48 +17,64 @@
package com.palantir.dialogue.core;

import com.github.benmanes.caffeine.cache.Ticker;
import com.google.common.annotations.VisibleForTesting;
import com.google.common.collect.ImmutableList;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.ListenableFuture;
import com.palantir.conjure.java.client.config.NodeSelectionStrategy;
import com.palantir.dialogue.Endpoint;
import com.palantir.dialogue.Request;
import com.palantir.dialogue.Response;
import com.palantir.logsafe.SafeArg;
import com.palantir.logsafe.exceptions.SafeRuntimeException;
import com.palantir.tritium.metrics.registry.TaggedMetricRegistry;
import java.util.List;
import java.util.Optional;
import java.util.Random;
import java.util.concurrent.atomic.AtomicReference;
import org.checkerframework.checker.nullness.qual.Nullable;
import org.immutables.value.Value;

@SuppressWarnings("NullAway")
final class NodeSelectionStrategyChannel implements LimitedChannel {
private final AtomicReference<LimitedChannel> nodeSelectionStrategy;
private static final String NODE_SELECTION_HEADER = "Node-Selection-Strategy";

private final FutureCallback<Response> callback = new NodeSelectionCallback();

private final AtomicReference<NodeSelectionChannel> nodeSelectionStrategy;
private final NodeSelectionStrategyChooser strategySelector;

private final NodeSelectionStrategy strategy;
private final String channelName;
private final Random random;
private final Ticker tick;
private final TaggedMetricRegistry metrics;
private final LimitedChannel delegate;
private final DialogueNodeselectionMetrics nodeSelectionMetrics;

private NodeSelectionStrategyChannel(
NodeSelectionStrategy strategy,
@VisibleForTesting
NodeSelectionStrategyChannel(
NodeSelectionStrategyChooser strategySelector,
DialogueNodeSelectionStrategy initialStrategy,
String channelName,
Random random,
Ticker tick,
TaggedMetricRegistry metrics) {
this.strategy = strategy;
this.strategySelector = strategySelector;
this.channelName = channelName;
this.random = random;
this.tick = tick;
this.metrics = metrics;
this.nodeSelectionStrategy = new AtomicReference<>();
this.delegate = new SupplierChannel(nodeSelectionStrategy::get);
this.nodeSelectionMetrics = DialogueNodeselectionMetrics.of(metrics);
this.nodeSelectionStrategy = new AtomicReference<>(NodeSelectionChannel.builder()
.strategy(initialStrategy)
.channel(new ZeroUriNodeSelectionChannel(channelName))
.build());
this.delegate = new SupplierChannel(() -> nodeSelectionStrategy.get().channel());
}

static NodeSelectionStrategyChannel create(Config cf) {
return new NodeSelectionStrategyChannel(
cf.clientConf().nodeSelectionStrategy(),
NodeSelectionStrategyChannel::getFirstKnownStrategy,
DialogueNodeSelectionStrategy.of(cf.clientConf().nodeSelectionStrategy()),
cf.channelName(),
cf.random(),
cf.ticker(),
@@ -67,30 +83,56 @@ static NodeSelectionStrategyChannel create(Config cf) {

@Override
public Optional<ListenableFuture<Response>> maybeExecute(Endpoint endpoint, Request request) {
return delegate.maybeExecute(endpoint, request);
return delegate.maybeExecute(endpoint, request).map(this::wrapWithCallback);
}

private ListenableFuture<Response> wrapWithCallback(ListenableFuture<Response> response) {
return DialogueFutures.addDirectCallback(response, callback);
}

void updateChannels(ImmutableList<LimitedChannel> updatedChannels) {
nodeSelectionStrategy.getAndUpdate(channel -> getUpdatedNodeSelectionStrategy(
channel, updatedChannels, strategy, metrics, random, tick, channelName));
nodeSelectionStrategy.getAndUpdate(prevChannel -> getUpdatedSelectedChannel(
prevChannel.channel(), updatedChannels, prevChannel.strategy(), metrics, random, tick, channelName));
}

private static LimitedChannel getUpdatedNodeSelectionStrategy(
private void updateRequestedStrategies(List<DialogueNodeSelectionStrategy> strategies) {
Optional<DialogueNodeSelectionStrategy> maybeStrategy = strategySelector.updateAndGet(strategies);
if (maybeStrategy.isPresent()) {
DialogueNodeSelectionStrategy strategy = maybeStrategy.get();
// Quick check to avoid expensive CAS
if (strategy.equals(nodeSelectionStrategy.get().strategy())) {
return;
}

this.nodeSelectionMetrics
.strategy()
.channelName(channelName)
.strategy(strategy.toString())
.build()
.mark();
nodeSelectionStrategy.getAndUpdate(prevChannel -> getUpdatedSelectedChannel(
prevChannel.channel(), prevChannel.hostChannels(), strategy, metrics, random, tick, channelName));
}
}

private static NodeSelectionChannel getUpdatedSelectedChannel(
@Nullable LimitedChannel previousNodeSelectionStrategy,
ImmutableList<LimitedChannel> channels,
NodeSelectionStrategy updatedStrategy,
DialogueNodeSelectionStrategy updatedStrategy,
TaggedMetricRegistry metrics,
Random random,
Ticker tick,
String channelName) {

NodeSelectionChannel.Builder channelBuilder =
NodeSelectionChannel.builder().strategy(updatedStrategy).hostChannels(channels);
if (channels.isEmpty()) {
return new ZeroUriNodeSelectionChannel(channelName);
return channelBuilder
.channel(new ZeroUriNodeSelectionChannel(channelName))
.build();
}

if (channels.size() == 1) {
// no fancy node selection heuristic can save us if our one node goes down
return channels.get(0);
return channelBuilder.channel(channels.get(0)).build();
}

switch (updatedStrategy) {
@@ -101,21 +143,68 @@ private static LimitedChannel getUpdatedNodeSelectionStrategy(
if (previousNodeSelectionStrategy instanceof PinUntilErrorNodeSelectionStrategyChannel) {
PinUntilErrorNodeSelectionStrategyChannel previousPinUntilError =
(PinUntilErrorNodeSelectionStrategyChannel) previousNodeSelectionStrategy;
return PinUntilErrorNodeSelectionStrategyChannel.of(
Optional.of(previousPinUntilError.getCurrentChannel()),
updatedStrategy,
channels,
pinuntilerrorMetrics,
random,
channelName);
return channelBuilder
.channel(PinUntilErrorNodeSelectionStrategyChannel.of(
Optional.of(previousPinUntilError.getCurrentChannel()),
updatedStrategy,
channels,
pinuntilerrorMetrics,
random,
channelName))
.build();
}
return PinUntilErrorNodeSelectionStrategyChannel.of(
Optional.empty(), updatedStrategy, channels, pinuntilerrorMetrics, random, channelName);
case ROUND_ROBIN:
return channelBuilder
.channel(PinUntilErrorNodeSelectionStrategyChannel.of(
Optional.empty(), updatedStrategy, channels, pinuntilerrorMetrics, random, channelName))
.build();
case BALANCED:
// When people ask for 'ROUND_ROBIN', they usually just want something to load balance better.
// We used to have a naive RoundRobinChannel, then tried RandomSelection and now use this heuristic:
return new BalancedNodeSelectionStrategyChannel(channels, random, tick, metrics, channelName);
return channelBuilder
.channel(new BalancedNodeSelectionStrategyChannel(channels, random, tick, metrics, channelName))
.build();
case UNKNOWN:
}
throw new SafeRuntimeException("Unknown NodeSelectionStrategy", SafeArg.of("unknown", updatedStrategy));
}

@VisibleForTesting
static Optional<DialogueNodeSelectionStrategy> getFirstKnownStrategy(
List<DialogueNodeSelectionStrategy> strategies) {
for (DialogueNodeSelectionStrategy strategy : strategies) {
if (!strategy.equals(DialogueNodeSelectionStrategy.UNKNOWN)) {
return Optional.of(strategy);
}
}
return Optional.empty();
}

@Value.Immutable
interface NodeSelectionChannel {
DialogueNodeSelectionStrategy strategy();

LimitedChannel channel();

ImmutableList<LimitedChannel> hostChannels();

class Builder extends ImmutableNodeSelectionChannel.Builder {}

static Builder builder() {
return new Builder();
}
}

private final class NodeSelectionCallback implements FutureCallback<Response> {
@Override
public void onSuccess(Response result) {
result.getFirstHeader(NODE_SELECTION_HEADER).ifPresent(this::consumeStrategy);
}

@Override
public void onFailure(Throwable _unused) {}

private void consumeStrategy(String strategy) {
updateRequestedStrategies(DialogueNodeSelectionStrategy.fromHeader(strategy));
}
}
}
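
The pattern used in this class (an immutable snapshot held in an `AtomicReference`, swapped when a response recommends a different strategy) can be illustrated with JDK types only. This is a simplified sketch under stated assumptions, not the code from this commit: `StrategySwapSketch` and its `State` class are hypothetical, a `CompletableFuture` of a header map stands in for Dialogue's `ListenableFuture<Response>`, the header lookup is case-sensitive, and the header is treated as a single value rather than a preference list.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

/** Illustrative strategy swap on response; all names here are hypothetical. */
final class StrategySwapSketch {
    static final String HEADER = "Node-Selection-Strategy";

    /** Immutable snapshot of the active strategy and the hosts it was built for. */
    static final class State {
        final String strategy;
        final List<String> hosts;

        State(String strategy, List<String> hosts) {
            this.strategy = strategy;
            this.hosts = hosts;
        }
    }

    private final AtomicReference<State> state;

    StrategySwapSketch(String initialStrategy) {
        this.state = new AtomicReference<>(new State(initialStrategy, Collections.emptyList()));
    }

    /** Host list changed: keep the current strategy, replace the hosts. */
    void updateHosts(List<String> hosts) {
        state.updateAndGet(prev -> new State(prev.strategy, hosts));
    }

    /** Attaches a callback that inspects successful responses for a recommendation. */
    CompletableFuture<Map<String, String>> observe(CompletableFuture<Map<String, String>> response) {
        return response.whenComplete((headers, failure) -> {
            if (failure == null && headers != null && headers.containsKey(HEADER)) {
                onRecommendation(headers.get(HEADER));
            }
        });
    }

    private void onRecommendation(String strategy) {
        // A plain read first, so the common "nothing changed" case skips the compare-and-swap.
        if (strategy.equals(state.get().strategy)) {
            return;
        }
        state.updateAndGet(prev -> new State(strategy, prev.hosts));
    }

    State current() {
        return state.get();
    }

    public static void main(String[] args) {
        StrategySwapSketch channel = new StrategySwapSketch("PIN_UNTIL_ERROR");
        channel.updateHosts(Arrays.asList("https://a", "https://b"));
        channel.observe(CompletableFuture.completedFuture(
                Collections.singletonMap(HEADER, "BALANCED"))).join();
        System.out.println(channel.current().strategy + " " + channel.current().hosts);
    }
}
```

Keeping the strategy and host list in one snapshot means that either kind of update can rebuild the state from the other half, which mirrors why `NodeSelectionChannel` carries `hostChannels()` alongside `strategy()`.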
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
/*
* (c) Copyright 2020 Palantir Technologies Inc. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package com.palantir.dialogue.core;

import java.util.List;
import java.util.Optional;

interface NodeSelectionStrategyChooser {
Optional<DialogueNodeSelectionStrategy> updateAndGet(List<DialogueNodeSelectionStrategy> updatedStrategies);
}