Reimplement the Lock middleware with tokio::sync #427

Merged 5 commits into master from ver/lock-rewrite on Feb 17, 2020

Conversation

@olix0r (Member) commented on Feb 17, 2020:

As described in tokio-rs/tokio#2237, the `tokio::sync::Semaphore` can
hold unbounded memory, especially when the semaphore is being contested
and consumers drop interest. Unfortunately, this use case is common in
the proxy, especially when a destination service is unavailable and the
proxy is timing out requests.

This change reimplements the Lock middleware without using
`tokio::sync::Semaphore`. This implementation is in some ways more
naive and inefficient, but it appears to be better suited for the
proxy's needs. Specifically, waiters are stored in a LIFO stack, which
optimizes for minimizing latency. Under certain high-load scenarios,
this Lock could be forced to grow its waiters set without cleaning up
expired waiters. If this becomes a more serious concern, we could
change the implementation to use a FIFO queue of waiters.
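
To make the strategy concrete, here is a minimal sketch of the approach described above. It is illustrative only: the names loosely follow the PR's modules (`Shared`, a waiters stack, `release_and_notify`), but the fields, signatures, and the use of `std::task::Waker` are assumptions, not the PR's actual code (which targeted the futures 0.1 era).

```rust
use std::sync::{Mutex, Weak};
use std::task::Waker;

// Shared state behind the Lock: the guarded value (present when unlocked)
// plus a LIFO stack of weakly-held waiter handles.
struct Shared<T> {
    value: Option<T>,
    waiters: Vec<Weak<Mutex<Option<Waker>>>>,
}

impl<T> Shared<T> {
    // Return the value and wake the most recently registered live waiter.
    fn release_and_notify(&mut self, value: T) {
        self.value = Some(value);
        // Pop until a still-alive waiter is found; dead (dropped) waiters are
        // discarded along the way, so an idle lock sheds its whole stack.
        while let Some(weak) = self.waiters.pop() {
            if let Some(waiter) = weak.upgrade() {
                if let Some(waker) = waiter.lock().unwrap().take() {
                    waker.wake();
                    return;
                }
            }
        }
    }
}
```
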
@olix0r requested a review from a team on Feb 17, 2020 at 18:12
@olix0r (Member, Author) commented on Feb 17, 2020:

Though the lock isn't yet used in the proxy, I've tested another branch that uses the lock and confirmed that we don't see the unbounded memory growth observed with the previous version.

@olix0r requested review from a team and seanmonstar on Feb 17, 2020 at 19:20
@hawkw (Contributor) left a comment:

Overall, this looks great. The code is very clear and well-documented, so it was easy to review — thanks for that!

I had one big question about the potential to accumulate "dead" waiter pointers towards the back of the stack, especially under heavy load. Besides that, most of my comments were nits or typos; I feel pretty good about the approach here.

Since using a LIFO stack lets us bias for latency rather than fairness, I wonder if we might want to continue using this implementation in this specific use-case (service concurrency control) even after tokio_sync's memory leak is fixed, especially since this implementation is also much simpler. Might be worth some benchmarking!


```rust
impl std::fmt::Display for Poisoned {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "poisoned")
    }
}
```

Contributor commented:

Very minor nit: I believe the `write!` macro can have a small amount of overhead vs `Formatter::write_str` when there's no string interpolation to perform. This probably doesn't matter in practice...
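
For reference, a sketch of the suggested alternative: `Formatter::write_str` writes the literal directly, skipping the formatting machinery that `write!` goes through even when there is nothing to interpolate.

```rust
impl std::fmt::Display for Poisoned {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        // No interpolation needed, so write the string directly.
        f.write_str("poisoned")
    }
}
```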

```rust
}

#[test]
fn fuzz() {
```

Contributor commented:

You may want to consider using loom for this test. loom 0.1.x has a mock version of `AtomicTask`, which can simulate spurious wakeups, and a mock version of `std::thread`, which will run every permutation of threads being preempted (with bounding).

Using loom rather than randomness would make this test deterministic, which could make debugging failures significantly easier.
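
A hedged sketch of what such a test could look like. The shape below uses loom's `model` entry point and mock `thread`/`sync` types as found in more recent loom releases; the exact loom 0.1.x API (and the real Lock type under test) would differ, so treat this purely as an illustration of the deterministic-permutation approach rather than the suggested test itself.

```rust
#[cfg(test)]
mod loom_tests {
    #[test]
    fn concurrent_acquire_release() {
        loom::model(|| {
            // Stand-in for the Lock: loom's mock Mutex, so every thread
            // interleaving (including preemption points) is explored.
            let lock = loom::sync::Arc::new(loom::sync::Mutex::new(0u32));

            let l2 = lock.clone();
            let t = loom::thread::spawn(move || {
                *l2.lock().unwrap() += 1;
            });

            *lock.lock().unwrap() += 1;
            t.join().unwrap();

            // Both increments are visible in every explored interleaving.
            assert_eq!(*lock.lock().unwrap(), 2);
        });
    }
}
```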

Member Author replied:

Sounds cool! I'd rather not block the PR on this unless we have specific concerns.

Comment on lines +88 to +93
```rust
while let Some(waiter) = self.waiters.pop() {
    if waiter.notify() {
        trace!("Notified waiter");
        return;
    }
}
```

Contributor commented:

If I'm correctly understanding this strategy, I think we may still be potentially susceptible to some memory leakiness when a very large number of waiters are in the stack. Since this is a LIFO, if there are a lot of waiters on the lock, those towards the end of the stack may never be woken up. If the delay is long enough, those waiters will time out, causing the futures to be dropped and thus dropping the Arc-ed waker, but the Weak pointer will remain in the stack. If a lot of new waiters are always being pushed, those Weaks will remain at the back of the stack forever.

Since we're dropping the actual waker when a future is dropped, the size of the data remaining in the stack per waiter is significantly smaller than in tokio_sync's Semaphore (I believe a Weak is just a single pointer). However, it is still possible for the vec to have unbounded growth.

One thought is that we may want to periodically call Vec::retain or something, to drop any Notify instances in the stack where the waiter has been dropped. However, this is expensive — it requires iterating the whole vec — so for latency reasons, we probably don't want to do this when acquiring or releasing the lock. We could probably consider putting this work in a background task that will run when the lock is not busy?

We may also want to periodically call shrink_to_fit so that the actual allocated capacity of the vec is released when load decreases enough that we don't need as much capacity. However, we definitely don't want to do this too often, as we could essentially thrash allocations. If we're occasionally cleaning out dead waiters, vec capacity is probably much less of an issue...
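
A minimal sketch of this periodic-cleanup idea, under the assumption that the waiters stack holds `Weak` handles; the field and function names here are hypothetical, not the PR's code.

```rust
use std::sync::Weak;

// Drop stack entries whose waiter future has already been dropped, then
// optionally give excess capacity back to the allocator. Both steps walk or
// reallocate the whole Vec, so this would run off the hot acquire/release
// path, e.g. in a background task while the lock is not busy.
fn prune_waiters<W>(waiters: &mut Vec<Weak<W>>) {
    waiters.retain(|w| w.upgrade().is_some());
    waiters.shrink_to_fit();
}
```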

Member Author replied:

Yep. This is all correct. My current assumption is that what we have is Good Enough. We could certainly try to address these issues, but solutions all come with complexity. We can only really hit this situation when the lock is always under contention. As soon as it becomes idle, the entire set of waiters will be cleaned up. It seems like it would be unlikely to encounter this in practice, though I'm sure that we could write a test that triggers this...

For now, I'd prefer to add a note about this and revisit if we see this type of usage in practice.

Contributor replied:

> As soon as it becomes idle, the entire set of waiters will be cleaned up

Where does this happen? I must've missed it.

@hawkw (Contributor) replied on Feb 17, 2020:

Oh, you mean that if the lock is released when there are no "live" waiters in the stack, we continue popping until we hit a "live" waiter, so if there are none, we'll pop everything. Yeah, that means we probably won't see issues around the stack's capacity except in extreme edge cases. 👍

(it might be nice if there was a comment pointing that out...)

Member Author replied:

Added a note onto the Shared docstring.

@hawkw (Contributor) left a comment:

Given https://github.com/linkerd/linkerd2-proxy/pull/427/files#r380353352, I feel pretty good about the design of the lock used here, at least in the proxy's use case!

@kleimkuhler (Contributor) left a comment:

Looks good! I agree that this implementation is clear to follow. From reading the conversations above I don't have any major concerns about the strategy.

@adleong (Member) left a comment:

I have read this and understood it and it makes sense to me.

I believe there is an implicit contract that if you call `Shared::poll_acquire` then you must either call `poll_acquire` again when notified or call `release_waiter` to release your interest. `Lock` does this correctly by calling `release_waiter` in its `Drop` impl, but it would be helpful to have these contracts called out explicitly. Failing to do so could cause deadlock.

Similarly, if you successfully acquire a value, you must eventually call `release_and_notify`. Dropping the acquired value will lead to deadlock as well. That's probably obvious, but it's still something I had to mentally work through.
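
To illustrate the first contract (with purely hypothetical names and signatures, not the PR's actual types): a pending-acquire handle releases its registered interest when it is dropped before completing, so that abandoning a wait cannot swallow a notification and deadlock other waiters.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical shared state; release_waiter forgets a waiter so that a
// wake-up intended for it is not lost.
struct SharedState {
    live_waiters: usize,
}

impl SharedState {
    fn release_waiter(&mut self) {
        self.live_waiters = self.live_waiters.saturating_sub(1);
    }
}

// Handle held while waiting to acquire the lock.
struct PendingAcquire {
    shared: Arc<Mutex<SharedState>>,
}

impl Drop for PendingAcquire {
    fn drop(&mut self) {
        // If the future is abandoned before acquiring, explicitly release
        // our interest so the lock does not wait on us forever.
        self.shared.lock().unwrap().release_waiter();
    }
}
```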

@olix0r merged commit b2d3ed8 into master on Feb 17, 2020
@olix0r deleted the ver/lock-rewrite branch on Feb 17, 2020 at 21:54
olix0r added a commit to linkerd/linkerd2 that referenced this pull request Feb 19, 2020
This release includes the results from continued profiling & performance
analysis. In addition to modifying internals to prevent unwarranted
memory growth, we've introduced new metrics to aid in debugging and
diagnostics: a new `request_errors_total` metric exposes the number of
requests that receive synthesized responses due to proxy errors; and a
suite of `stack_*` metrics expose proxy internals that can help us
identify unexpected behavior.

---

* trace: update `tracing-subscriber` dependency to 0.2.1 (linkerd/linkerd2-proxy#426)
* Reimplement the Lock middleware with tokio::sync (linkerd/linkerd2-proxy#427)
* Add the request_errors_total metric (linkerd/linkerd2-proxy#417)
* Expose the number of service instances in the proxy (linkerd/linkerd2-proxy#428)
* concurrency-limit: Share a limit across Services (linkerd/linkerd2-proxy#429)
* profiling: add benchmark and profiling scripts (linkerd/linkerd2-proxy#406)
* http-box: Box HTTP payloads via middleware (linkerd/linkerd2-proxy#430)
* lock: Generalize to protect a guarded value (linkerd/linkerd2-proxy#431)