
Review process to abort UDP request when the ring buffer is full #842

Closed
josecelano opened this issue May 7, 2024 · 4 comments · Fixed by #846
@josecelano
Member

Relates to: #830

    async fn run_udp_server(tracker: Arc<Tracker>, socket: Arc<UdpSocket>) {
        let tracker = tracker.clone();
        let socket = socket.clone();

        let reqs = &mut ActiveRequests::default();

        // Main Waiting Loop, awaits on async [`receive_request`].
        loop {
            if let Some(h) = reqs.rb.push_overwrite(
                Self::spawn_request_processor(Self::receive_request(socket.clone()).await, tracker.clone(), socket.clone())
                    .abort_handle(),
            ) {
                if !h.is_finished() {
                    // The task is still running; let's yield and give it a chance to finish before we abort it.
                    tokio::task::yield_now().await;

                    h.abort();

                    let server_socket_addr = socket.local_addr().expect("Could not get local_addr for socket.");

                    tracing::span!(
                        target: "UDP TRACKER",
                        tracing::Level::WARN, "request-aborted", server_socket_addr = %server_socket_addr);
                }
            }
        }
    }

We are using the push_overwrite method.

[Screenshot: documentation for the `push_overwrite` method]

That method overwrites the latest item, not the oldest. Does that make sense in our case? Wouldn't it be better to overwrite the oldest request? That's the one that has had the longest time to be processed.
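For illustration, here is a minimal sketch of the two eviction policies. These are hypothetical helpers built on a bounded VecDeque, not the API of the ring-buffer crate we use:

```rust
use std::collections::VecDeque;

/// Overwrite-newest: when the buffer is full, evict the most recently
/// pushed item (the request that has had the least time to run).
fn push_overwrite_newest<T>(buf: &mut VecDeque<T>, cap: usize, item: T) -> Option<T> {
    let evicted = if buf.len() == cap { buf.pop_back() } else { None };
    buf.push_back(item);
    evicted
}

/// Overwrite-oldest: when the buffer is full, evict the item that has
/// been waiting the longest (the request most likely to be finished).
fn push_overwrite_oldest<T>(buf: &mut VecDeque<T>, cap: usize, item: T) -> Option<T> {
    let evicted = if buf.len() == cap { buf.pop_front() } else { None };
    buf.push_back(item);
    evicted
}
```

With capacity 2 and pushes 1, 2, 3: overwrite-newest evicts 2 (the request that just started), while overwrite-oldest evicts 1 (the request that has had the longest time to finish).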

cc @da2ce7

@josecelano josecelano changed the title Review process to abort UDP request when the ringbuffer is full Review process to abort UDP request when the ring buffer is full May 7, 2024
@josecelano josecelano added the Question / Discussion Community Feedback label May 7, 2024
@josecelano josecelano added this to the v3.0.0 milestone May 7, 2024
@da2ce7
Contributor

da2ce7 commented May 7, 2024

@josecelano I think that you are correct:

da2ce7@ff9b43d

This commit should fix the problem. However, now I don't see any advantage of using the ring buffer over a standard vector, since we are not inserting or removing entries from the buffer, just swapping them for new ones.

@josecelano
Member Author

> @josecelano I think that you are correct:
>
> da2ce7@ff9b43d
>
> This commit should fix the problem. However now I don't see any advantage of using the ring-buffer over a standard vector. Since we are not inserting-or-removing entries from the buffer, but just swapping them for new entries.

Hi @da2ce7, after opening the issue:

I restarted the live demo. I can now produce this graph in Datadog by parsing the "request-aborted" lines in the logs.

[Screenshot: Datadog graph of aborted requests]

Regarding your patch, I guess the socket now acts as a buffer for all incoming requests, and we only handle them when the previous ones have finished. I suppose that will consume more memory, but it's a trade-off between rejecting requests on one hand, and consuming more memory and increasing response time on the other.

NOTE: Remember, we still don't have a timeout for processing the UDP requests.
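As a side note, a per-request deadline could be sketched with std-only primitives. This is a hypothetical `with_timeout` helper, not what the server would actually use (that would more likely be `tokio::time::timeout`):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run `work` on a helper thread and wait at most `limit` for its result.
/// Returns None on timeout. Note: unlike aborting a tokio task, the
/// timed-out thread keeps running in the background.
fn with_timeout<T: Send + 'static>(
    work: impl FnOnce() -> T + Send + 'static,
    limit: Duration,
) -> Option<T> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let _ = tx.send(work()); // the receiver may be gone already; ignore the error
    });
    rx.recv_timeout(limit).ok()
}
```

The caveat in the comment is the interesting part: a timeout that merely stops waiting is weaker than the `abort_handle` approach in the current loop, which actually cancels the task.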

I have to think about this, but I would prefer to control that buffer directly in the application. I mean, I would read all incoming packets as soon as possible and keep the pending tasks in the app. That way, we could implement policies to handle overload. For example, we could:

  • Reject new requests when the buffer is full, sending a "busy" response. With the current solution, we don't know when the buffer is full (I guess), or finding out would make the code less readable.
  • Ignore new requests when the buffer is full. We don't even send a response in the current version. That doesn't have to be bad: it's also a way to save bandwidth, and with UDP the client shouldn't have a problem with that.
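Both policies could be expressed on top of a bounded channel. A sketch with hypothetical names, using std's `sync_channel` to stand in for whatever bounded queue the app would use:

```rust
use std::sync::mpsc::{SyncSender, TrySendError};

/// Hypothetical overload policies for a full request buffer.
enum OverloadPolicy {
    Reject, // answer the client with a "busy" response
    Ignore, // drop silently; saves bandwidth, UDP clients can retry
}

/// Try to enqueue a request; return the reply (if any) to send back.
fn enqueue(
    tx: &SyncSender<u32>,
    request: u32,
    policy: &OverloadPolicy,
) -> Option<&'static str> {
    match tx.try_send(request) {
        Ok(()) => None, // buffered; a worker will pick it up
        Err(TrySendError::Full(_)) => match policy {
            OverloadPolicy::Reject => Some("busy"),
            OverloadPolicy::Ignore => None,
        },
        Err(TrySendError::Disconnected(_)) => None, // shutting down
    }
}
```

The point is that with an application-owned buffer, "full" is an explicit, observable state, which is exactly what the socket-as-buffer approach hides.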

Maybe a more conventional solution would be easier for newcomers to understand. For example, we could have a pool of workers processing requests. The main loop would read all incoming requests from the socket and send them to the workers through a channel; the channel would be the buffer. Under the hood this implementation might be the same, but it has two advantages:

  1. The app controls the buffer. We can probably get info from the socket buffer, but I guess that API is harder to understand.
  2. The worker pattern is well-known and documented. ChatGPT can generate the code for it :-) and explain it to other developers.
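A sketch of that worker pattern, using std threads and a bounded mpsc channel in place of the tokio equivalents (all names, like `process` and `run_pool`, are hypothetical):

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Placeholder for handling one parsed UDP request.
fn process(request: u32) -> u32 {
    request * 2
}

/// The main loop sends requests into a bounded channel (the buffer);
/// `workers` threads drain it. `send` blocks when the buffer is full,
/// which gives natural backpressure.
fn run_pool(requests: Vec<u32>, workers: usize, buffer: usize) -> Vec<u32> {
    let (tx, rx) = mpsc::sync_channel::<u32>(buffer);
    let rx = Arc::new(Mutex::new(rx)); // one receiver shared by the pool
    let (out_tx, out_rx) = mpsc::channel();

    let mut handles = Vec::new();
    for _ in 0..workers {
        let rx = Arc::clone(&rx);
        let out_tx = out_tx.clone();
        handles.push(thread::spawn(move || loop {
            // Take the next request; stop when the channel is closed and drained.
            let request = match rx.lock().unwrap().recv() {
                Ok(r) => r,
                Err(_) => break,
            };
            let _ = out_tx.send(process(request));
        }));
    }
    drop(out_tx); // so out_rx.iter() ends once all workers exit

    for r in requests {
        tx.send(r).unwrap(); // blocks while the buffer is full
    }
    drop(tx); // close the channel; workers drain it and exit

    let mut results: Vec<u32> = out_rx.iter().collect();
    for h in handles {
        h.join().unwrap();
    }
    results.sort_unstable();
    results
}
```

In the real server the workers would be tokio tasks and the channel a `tokio::sync::mpsc` bounded channel, but the shape (and the backpressure behavior) is the same.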

Finally, limiting the number of active requests, as the current version does (without your new patch), is a way of limiting resources. In this case, we limit CPU usage (without even sending a "busy" response to clients) and memory usage, because task processors consume memory.

I'm not sure we should limit resources without providing a way to monitor the actual server load; this limitation was probably hiding an overload problem. I have been monitoring memory consumption in the demo and didn't see a problem there (while we were rejecting a lot of requests). I only saw a CPU problem, and it was caused by swapping.

In conclusion, I think we should disable all limits (your new patch disables them, unless there is a limit on the socket buffer) and monitor the servers so we can scale up. If we want to introduce some kind of limit, we have to think about how to monitor it so sysadmins can detect overload.

I've been discussing these limitations here.

I will merge your patch for the time being.

@da2ce7
Contributor

da2ce7 commented May 8, 2024

@josecelano there was a bug in my patch that caused the test to fail; now I yield before returning the new "filled" ring buffer.

Here is the updated patch:

da2ce7@d030030

@josecelano josecelano linked a pull request May 8, 2024 that will close this issue
@josecelano
Member Author

Just for the record, this is the Datadog dashboard where I plot the number of handled requests (bottom) and the number of aborted requests (top).

[Screenshot: Datadog dashboard showing handled requests (bottom) and aborted requests (top)]
