Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[test-only] Simulate dialogue clients against a DeterministicScheduler #348

Merged
merged 111 commits into from
Feb 17, 2020

Conversation

iamdanfox
Copy link
Contributor

@iamdanfox iamdanfox commented Feb 17, 2020

Before this PR

I want to conclusively be able to demonstrate all the bits of our RPC stack is expected to behave under various kinds of load, but at the moment we just have to eyeball code and try to construct the expected behaviour in our heads.

After this PR

==COMMIT_MSG==
A new :simulation project runs dialogue calls in a DeterministicScheduler and records the results.
==COMMIT_MSG==

The tests run instantly and are fully deterministic and reproducible :D

Example chart:

sample

Sorry this is such a big PR - I'd recommend just cloning it and running SimulationTest to get a sense of what happens, you can also fiddle with the numbers.

Possible downsides?

@iamdanfox iamdanfox changed the title Simulate dialogue clients against a DeterministicScheduler [test-only] Simulate dialogue clients against a DeterministicScheduler Feb 17, 2020
import java.util.Optional;
import java.util.function.Supplier;

final class GenericRefreshingChannel implements Channel {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

split this change to a separate PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pulled out here: #352

@@ -0,0 +1 @@
success=58.3% client_mean=PT0.7228S server_cpu=PT4M49.12S received=400/400 codes={200=233, 500=167}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like github hides the pngs by default - gotta click the little "Display the rich diff" thing:

image

@@ -42,22 +42,24 @@
*/
final class ConcurrencyLimitedChannel implements LimitedChannel {
private static final Void NO_CONTEXT = null;
private static final Supplier<Long> SYSTEM_NANOTIME = System::nanoTime;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LongSupplier or a custom interface

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out the netflix builder requires a Supplier<Long>, so I've used the same caffeine Ticker interface we use everywhere else.

// Do nothing
numScheduled++;
}
if (numScheduled > 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (numScheduled > 1) {
if (log.isDebugEnabled()) {

zero and one are also helpful data points imo.

}

private Stream<ScheduledRequest> infiniteRequests(Duration interval) {
return Stream.iterate(0, current -> current + 1).map(number -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: IntStream. 2m requests isn't a ton, may be worth using longs here?


private static Request constructRequest(int number) {
return Request.builder()
.putHeaderParams("X-B3-TraceId", "req-" + number)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

X-B3-TraceId overloading is a bit funky, perhaps we should either use the existing tracing channels with random IDs, or use a simulation-specific header?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point - I'll probably want to visualize traces for these at some point, have switched to a different constant.

import com.palantir.dialogue.Response;
import java.util.concurrent.TimeUnit;

final class HistogramChannel implements Channel {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use our standard instrumented channel with an AbstractTaggedMetricRegistry implementation that overrides timerSupplier and uses a custom reservoir supplier? Would be nice to reinforce that our metrics are good enough, but may not be worthwhile here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will happily do that - the only reason I didn't initially was there are a lot of methods on TaggedMetricRegistry and it was gonna inflate the line count of this PR quite severely!

return future.get();
}

return Futures.immediateFailedFuture(new SafeRuntimeException("limited channel says no :("));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think queued channel should do this #351

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was using this as a substitute for the queued channel, but it's not being used right now, so will delete and bring it back when needed.

}

/** This is an alternative to the {@link com.palantir.dialogue.core.QueuedChannel}. */
static Channel dontTolerateLimits(LimitedChannel limitedChannel) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used?

Copy link
Contributor

@carterkozak carterkozak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't reviewed this as closely as changes to prod code, but the output is insightful and the implementation looks reasonable.

@bulldozer-bot bulldozer-bot bot merged commit ffa5d1b into develop Feb 17, 2020
@bulldozer-bot bulldozer-bot bot deleted the dfox/simulation branch February 17, 2020 19:37
@iamdanfox
Copy link
Contributor Author

iamdanfox commented Feb 17, 2020

Thanks carter, I'm gonna update my other PRs with this code and hopefully have some pretty graphs to show!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants