Denial of Service attack factor #324

Power2All · 2023-05-12T13:24:16Z

Since I've been using Axum on my own project based on Torrust-Tracker, I'm moving myself to Actix framework again, but now with properly implementation.

This server uses, as far as I'm aware, still the Hyper/Warp framework, which is missing the basic function for Client Connection and Client Disconnection timeout.

You could verify if the code is affected, by opening a command prompt (or linux the terminal), and telnet to the API or HTTP port of Torrust-Tracker. If it works correctly, this should disconnect within a few seconds, or if you customized this timeout a bit longer, but normally by default this should be no more then 5 seconds at the most (reading Warp documentation, there is a default timeout of 30 seconds).

If this is not killing the connection, could be seen as a critical security flaw, and should be addressed. I haven't tested it out with the current code, but could in a minute, modifying this post.

josecelano · 2023-05-15T15:47:42Z

Thank you @Power2All. I've just tested it and it seems there is no timeout, but I guess there must be a way to do it with Axum. I'll check it. Do you have more info now?

Links

How to set timeouts? tokio-rs/axum#1383

Power2All · 2023-05-15T16:16:21Z

Thank you @Power2All. I've just tested it and it seems there is no timeout, but I guess there must be a way to do it with Axum. I'll check it. Do you have more info now?

Links

How to set timeouts? tokio-rs/axum#1383

There is, but it's half-tested.
Decided to switch to Actix again, and that seems to be implemented correctly. Also added ratelimit, as the RwLock was making deadlocks on busy times.

josecelano · 2023-05-15T17:57:03Z

Thank you @Power2All. I've just tested it and it seems there is no timeout, but I guess there must be a way to do it with Axum. I'll check it. Do you have more info now?
Links

How to set timeouts? tokio-rs/axum#1383

There is, but it's half-tested. Decided to switch to Actix again, and that seems to be implemented correctly. Also added ratelimit, as the RwLock was making deadlocks on busy times.

Thank you @Power2All!!

Timeout

It seems the middleware always returns 408, which means:

The HyperText Transfer Protocol (HTTP) 408 Request Timeout response status code means that the server would like to shut down this unused connection. It is sent on an idle connection by some servers, even without any previous request by the client.

A server should send the "close" Connection header field in the response, since 408 implies that the server has decided to close the connection rather than continue waiting.

This response is used much more since some browsers, like Chrome, Firefox 27+, and IE9, use HTTP pre-connection mechanisms to speed up surfing.

It seems there could be some cases where a 504 could be the reason, which means the reason for the timeout would be the server and not the inactive client.

They are discussing how to handle this ambiguity: tower-rs/tower-http#300.

Is that what you mean "half-tested"?

In our particular case, I think that's not a problem because we do not have a proxy. The tracker does not use a proxy. It could be the opposite, the tracker could be behind a proxy.

Anyway, if we confirm that it could be a problem, I would prefer to just implement our own middleware and change the response status code or fully test it, instead of migrating back to actix-web only for one feature. OR at least I will postpone it, because as you mentioned we anyway have other DoS risks like the rate limit.

Rate limit

I agree, a rate limit is something we need. But maybe for the time being you can handle that with a proxy.

RwLock

Can you elaborate on which cases RwLock was making deadlocks? Maybe for settings? I also noticed this could be a problem on the index backend project. If that's what you mean, I would try to remove that deadlock, if possible, instead of adding a rate limit (which we could need anyway to avoid denial of service attacks). I mean,

I would try to remove the deadlock to increase the performance. The number of legitimate requests the app can handle.
I would add the rate limit per IP in order to avoid denial of service attacks. But that could be a problem in terms of privacy.

Does that make sense for you @Power2All ?

Hey @da2ce7 @WarmBeer what do you think?

Extra

If you are considering changing your fork back to actix-web and that's a big effort for you I would recommend you to reconsider forking the current tracker version since now:

The tracker is decoupled from the web framework.
We have good documentation.
We have a good test coverage.

We could even:

Make the web framework a plugin.
Extract some parts as independent packages so you can build your own tracker, but we can share some packages. We have been discussing how to make the tracker extensible. If we create some independent packages, maybe you can just reuse the core packages and change your delivery layer.

Power2All · 2023-05-15T18:33:59Z

They are discussing how to handle this ambiguity: tower-rs/tower-http#300.
Is that what you mean "half-tested"?

That discussion was based on my investigation of some high unusual tokio threads.
The solution was given here, you could use it, but it's pretty much a test code, and not fully tested out.

use anyhow::anyhow;
use axum::{routing::get, Router};
use axum_server::{accept::Accept, Handle};
use futures_util::{ready, Future};
use http_body::Body;
use hyper::{client::conn::handshake, Request, Response};
use pin_project_lite::pin_project;
use std::{
    future::Ready,
    io::ErrorKind,
    net::SocketAddr,
    pin::Pin,
    task::{Context, Poll},
    time::Duration,
};
use tokio::{
    io::{AsyncRead, AsyncWrite, ReadBuf},
    net::TcpStream,
    sync::mpsc::{self, UnboundedReceiver, UnboundedSender},
    time::{Instant, Sleep},
};
use tower::{Service, ServiceExt};

const TIMEOUT: Duration = Duration::from_secs(5);

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let addr = SocketAddr::from(([127, 0, 0, 1], 3000));
    let handle = Handle::new();

    tokio::spawn(start_server(addr, handle.clone()));

    handle
        .listening()
        .await
        .ok_or_else(|| anyhow!("server failed to bind"))?;

    let stream = TcpStream::connect(addr).await?;
    let (mut send_request, connection) = handshake(stream).await?;

    let conn_task = tokio::spawn(connection);

    let request = Request::new(hyper::Body::empty());

    tokio::time::sleep(Duration::from_secs(3)).await;
    send_request.ready().await?.call(request).await?;

    conn_task.await??;

    println!("Connection closed.");

    Ok(())
}

async fn start_server(addr: SocketAddr, handle: Handle) -> anyhow::Result<()> {
    let app = Router::new().route("/", get(handler));

    axum_server::bind(addr)
        .acceptor(TimeoutAcceptor)
        .handle(handle)
        .serve(app.into_make_service())
        .await?;

    Ok(())
}

async fn handler() {}

#[derive(Clone)]
struct TimeoutAcceptor;

impl<I, S> Accept<I, S> for TimeoutAcceptor {
    type Stream = TimeoutStream<I>;
    type Service = TimeoutService<S>;
    type Future = Ready<std::io::Result<(Self::Stream, Self::Service)>>;

    fn accept(&self, stream: I, service: S) -> Self::Future {
        let (tx, rx) = mpsc::unbounded_channel();

        let stream = TimeoutStream::new(stream, TIMEOUT, rx);
        let service = TimeoutService::new(service, tx);

        std::future::ready(Ok((stream, service)))
    }
}

#[derive(Clone)]
struct TimeoutService<S> {
    inner: S,
    sender: UnboundedSender<TimerSignal>,
}

impl<S> TimeoutService<S> {
    fn new(inner: S, sender: UnboundedSender<TimerSignal>) -> Self {
        Self { inner, sender }
    }
}

impl<S, B, Request> Service<Request> for TimeoutService<S>
where
    S: Service<Request, Response = Response<B>>,
{
    type Response = Response<TimeoutBody<B>>;
    type Error = S::Error;
    type Future = TimeoutServiceFuture<S::Future>;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        self.inner.poll_ready(cx)
    }

    fn call(&mut self, req: Request) -> Self::Future {
        // send timer wait signal
        let _ = self.sender.send(TimerSignal::Wait);

        TimeoutServiceFuture::new(self.inner.call(req), self.sender.clone())
    }
}

pin_project! {
    struct TimeoutServiceFuture<F> {
        #[pin]
        inner: F,
        sender: Option<UnboundedSender<TimerSignal>>,
    }
}

impl<F> TimeoutServiceFuture<F> {
    fn new(inner: F, sender: UnboundedSender<TimerSignal>) -> Self {
        Self {
            inner,
            sender: Some(sender),
        }
    }
}

impl<F, B, E> Future for TimeoutServiceFuture<F>
where
    F: Future<Output = Result<Response<B>, E>>,
{
    type Output = Result<Response<TimeoutBody<B>>, E>;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.project();
        this.inner.poll(cx).map(|result| {
            result.map(|response| {
                response.map(|body| {
                    TimeoutBody::new(body, this.sender.take().expect("future polled after ready"))
                })
            })
        })
    }
}

enum TimerSignal {
    Wait,
    Reset,
}

pin_project! {
    struct TimeoutBody<B> {
        #[pin]
        inner: B,
        sender: UnboundedSender<TimerSignal>,
    }
}

impl<B> TimeoutBody<B> {
    fn new(inner: B, sender: UnboundedSender<TimerSignal>) -> Self {
        Self { inner, sender }
    }
}

impl<B: Body> Body for TimeoutBody<B> {
    type Data = B::Data;
    type Error = B::Error;

    fn poll_data(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>,
    ) -> Poll<Option<Result<Self::Data, Self::Error>>> {
        let this = self.project();
        let option = ready!(this.inner.poll_data(cx));

        if option.is_none() {
            let _ = this.sender.send(TimerSignal::Reset);
        }

        Poll::Ready(option)
    }

    fn poll_trailers(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>,
    ) -> Poll<Result<Option<hyper::HeaderMap>, Self::Error>> {
        self.project().inner.poll_trailers(cx)
    }

    fn is_end_stream(&self) -> bool {
        let is_end_stream = self.inner.is_end_stream();

        if is_end_stream {
            let _ = self.sender.send(TimerSignal::Reset);
        }

        is_end_stream
    }

    fn size_hint(&self) -> http_body::SizeHint {
        self.inner.size_hint()
    }
}

struct TimeoutStream<IO> {
    inner: IO,
    // hyper requires unpin
    sleep: Pin<Box<Sleep>>,
    duration: Duration,
    waiting: bool,
    receiver: UnboundedReceiver<TimerSignal>,
    finished: bool,
}

impl<IO> TimeoutStream<IO> {
    fn new(inner: IO, duration: Duration, receiver: UnboundedReceiver<TimerSignal>) -> Self {
        Self {
            inner,
            sleep: Box::pin(tokio::time::sleep(duration)),
            duration,
            waiting: false,
            receiver,
            finished: false,
        }
    }
}

impl<IO: AsyncRead + Unpin> AsyncRead for TimeoutStream<IO> {
    fn poll_read(
        mut self: Pin<&mut Self>,
        cx: &mut Context<'_>,
        buf: &mut ReadBuf<'_>,
    ) -> Poll<std::io::Result<()>> {
        if !self.finished {
            match Pin::new(&mut self.receiver).poll_recv(cx) {
                // reset the timer
                Poll::Ready(Some(TimerSignal::Reset)) => {
                    self.waiting = false;

                    let deadline = Instant::now() + self.duration;
                    self.sleep.as_mut().reset(deadline);
                }
                // enter waiting mode (for response body last chunk)
                Poll::Ready(Some(TimerSignal::Wait)) => self.waiting = true,
                Poll::Ready(None) => self.finished = true,
                Poll::Pending => (),
            }
        }

        if !self.waiting {
            // return error if timer is elapsed
            if let Poll::Ready(()) = self.sleep.as_mut().poll(cx) {
                return Poll::Ready(Err(std::io::Error::new(
                    ErrorKind::TimedOut,
                    "request header read timed out",
                )));
            }
        }

        Pin::new(&mut self.inner).poll_read(cx, buf)
    }
}

impl<IO: AsyncWrite + Unpin> AsyncWrite for TimeoutStream<IO> {
    fn poll_write(
        mut self: Pin<&mut Self>,
        cx: &mut Context<'_>,
        buf: &[u8],
    ) -> Poll<std::io::Result<usize>> {
        Pin::new(&mut self.inner).poll_write(cx, buf)
    }

    fn poll_flush(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<std::io::Result<()>> {
        Pin::new(&mut self.inner).poll_flush(cx)
    }

    fn poll_shutdown(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<std::io::Result<()>> {
        Pin::new(&mut self.inner).poll_shutdown(cx)
    }

    fn poll_write_vectored(
        mut self: Pin<&mut Self>,
        cx: &mut Context<'_>,
        bufs: &[std::io::IoSlice<'_>],
    ) -> Poll<Result<usize, std::io::Error>> {
        Pin::new(&mut self.inner).poll_write_vectored(cx, bufs)
    }

    fn is_write_vectored(&self) -> bool {
        self.inner.is_write_vectored()
    }
}

Power2All · 2023-05-15T18:43:19Z

In our particular case, I think that's not a problem because we do not have a proxy. The tracker does not use a proxy. It could be the opposite, the tracker could be behind a proxy.

Doesn't really matter, normally the proxy end would disconnect it after a certain time, however, if it in some reason doesn't, the connection will be keeping in use, for example, keep-alive could be such a issue.

I would try to remove the deadlock to increase the performance. The number of legitimate requests the app can handle.

I've not fully figured out why it creates deadlocks, I also tried to figure it out using lockbud to scan for potential deadlocks. However, I'm pretty sure it has to do with over-saturating the read/write locks, until to the point that it's overwhelming and killing the system. An HTTP/HTTPS/UDP rate limiter should be able to tame the overwhelming amount of connections. Currently testing it on Actix, and so far it seems to fare, but I will let you know after I've tested it thoroughly. Keep in mind, I've been having 8000 connections average per second on HTTP/HTTPS, and 3000+ connections per second on UDP, so that's roughly 12k connections per second, excluding the obvious burst connections, so a rate limiter would be a very, very good idea ;)

I would add the rate limit per IP in order to avoid denial of service attacks. But that could be a problem in terms of privacy.

It's a overall issue, the RwLock is just being overwhelmed with the amount of incoming connections to deal with, doesn't matter if it's a single IP, I want RwLock to be able to keep up with the resources it has. Tokio RwLock eventually has it's limit, and I reach it pretty quickly in the midday, then the most connections are coming in.

About your extra, the thing is, Rust is still something I'm learning, and I find out every time something new, and using my project as a playground so to speak. I've looked into your code, and it's going to be a lot of work to make it optimal, since I'm still not happy how the SQLx is performing (need less memory overhead, I know where I do it wrong, but haven't gotten around to improve it with the connectivity issues right now). I've also been pondering about replacing the whole RwLock for a fullblown Crossbeam Channel system, which does have a lock mechanism that doesn't use any CPU cycles, and might be a real big performance patch-up.

I like the idea, but I need to get used to Torrust-Tracker, if I want to be useful, since a lot of the things are scattered in folders and files, and makes it a hassle to figure out where what is resided in.

josecelano · 2023-05-16T11:28:55Z

Hi @Power2All thank you for taking the time to reply. Your feedback is beneficial for me and the project. I guess you are the only kwon contributor who is actually using the tracker (your forked tracker, which has quite diverged) on heavy load conditions.

First of all, I have to say I'm also a newbie to Rust, BitTorrent and "systems" programming.

Clarifications for the previous comment

When I said there was not a problem with the actual timeout middleware implementation from Axum (because we do not use a proxy), I meant only for the response status code 408. For me, the HTTP 408 response status code is OK in our context. But I see that we have to solve the problem anyway if the tracker does not close connections that have not been used for a long time by the client.

Deadlocks

Id this the tool you are using?

I'm interested in knowing a concrete case where you could potentially replace RwLock with Crossbeam Channel system. By the way, would this be the package to use?

Intentional vs unintentional denial-of-service

I just want to clarify how I see the problem of saturating the system.
We should try to avoid the system reaching a saturated state. The reason for reaching that state could be:

Spike in popularity or use
Deliberate attack by a single individual or group of individuals

You seem to be focused on the "unintentional" ones, especially on deadlocks.
I have seen two critical points:

Deadlocks. There are some "todos" to reduce the time for the lock, etcetera.
Memory consumption. I'm particularly concerned about how we handle memory. We use memory intensively to make things faster, but I have not seen any mechanism to avoid errors when you can not allocate more memory. Regarding adding a usage limit, I think, in this case, it's easier to limit the usage because the structs we are using usually allow defining the capacity. We have not yet tested what happens when you reach that limit. But not sure about this, maybe @WarmBeer knows more.
Other bottlenecks. I opened this discussion about profiling because I think we could be underestimating other potential performance problems unrelated to deadlocks or memory consumption. With a profiling tool, we could detect other bottlenecks.

Regarding "intentional" DoS attacks, I'm not a security expert, but I suppose for those cases, you should try to block only the attacker and not only try to avoid the system to reach a saturated state. The only method I know it's banning an IP if it goes over a request rate limit. It would be fun to explore other mechanisms like Proof-of-Work, for example, Hashcash. What do you think @da2ce7?

Architecture and performance

I'm sorry to hear that "a lot of the things are scattered in folders and files". That's probably my fault.

If that helps, I've opened a new discussion about the:

New app architecture

I think performance is a key aspect of this kind of software, but I would like to have other things like:

Maintainability
Extensibility
Flexibility
Robustness (automated testing is easier)

Maybe a more transactional architecture could improve the performance, but in my opinion, it is going to be harder to get those attributes. I'm willing to pay a 5% performance loss to get other things. As I see it, even if your application is really good at performance, at some point, you are going to reach the machine limit. For me, bad performance means:

You have to pay more for bigger instances or more instances.
You need to scale to more than one instance early.

The second case could make the deployment harder, leading to the loss of users because of the complexity of running the app.

The problem is we do not know which kind of users are interested in running the tracker:

Personal and private use, supposedly using only one server.
Other organizations
Others

Power2All · 2023-05-16T12:22:55Z

When I said there was not a problem with the actual timeout middleware implementation from Axum (because we do not use a proxy), I meant only for the response status code 408. For me, the HTTP 408 response status code is OK in our context. But I see that we have to solve the problem anyway if the tracker does not close connections that have not been used for a long time by the client.

No problem, I knew what you mean, but you need to take into account, with the current time, security is a must, and exploiters and hackers will try anything to break the code, so security is a must have right now, and a open to be exploited DoS factor is a no-no in my books ;)

Id this the tool you are using?

I'm interested in knowing a concrete case where you could potentially replace RwLock with Crossbeam Channel system. By the way, would this be the package to use?

Yes, and yes. I'm going to work on a simple separate .rs file that contains the channel handler. It works almost the same like with GoLang where I implemented it the same way.
You create both a sender and a receiver (with crossbeam you can clone both, mpsc you cannot), one thread waits for receiving data as a String (that's how crossbeam works), which could be serialized encoded struct or whatever you want to send, the other end waits for data (acts like a mutex, it waits. Without using any mutex, creates no cpu cycles).
When it receives data, you can convert it back into a struct or do with it whatever you want (could be dynamic JSON data whatever you want), and then return with the same channel back a response, or just a message that it's done, where both sides wait for the data to be handled.
This way, you have a Read/Write handler through a simple thread channel, without the need for any RwLock or any Mutex whatsoever. It works and scales very well, as I've tested it in Go before.

You seem to be focused on the "unintentional" ones, especially on deadlocks.

Both, first was I noticed the Axum issue that it didn't have a correctly connection, read, write and disconnect timeout implement. Spoke to that david guy from the tokio development team, and it seems nobody was aware this would be such a issue, so they got some pressure on it now as well, since other users chimed in about it too.

Deadlocks. There are some "todos" to reduce the time for the lock, etcetera.

Those can be alleviated using Crossbeam Channel as replacement for all the RwLock Arc's, but that's going to be some work on my end as well, since I've done it on Go pretty well, Rust is another one that makes it a nice and interesting side project.

Memory consumption. I'm particularly concerned about how we handle memory. We use memory intensively to make things faster, but I have not seen any mechanism to avoid errors when you can not allocate more memory. Regarding adding a usage limit, I think, in this case, it's easier to limit the usage because the structs we are using usually allow defining the capacity. We have not yet tested what happens when you reach that limit. But not sure about this, maybe @WarmBeer knows more.

It's the best to have everything dealt with in a single thread, since moving pointers between threads is a hassle pretty much. With Crossbeam Channels, instead of moving around a shared big ass BTreeMap and such, you simply throw bits between threads to the main thread that handles it, which was my main focus on the former Go app I made.

Other bottlenecks. I opened #211 about profiling because I think we could be underestimating other potential performance problems unrelated to deadlocks or memory consumption. With a profiling tool, we could detect other bottlenecks.

I've not figured out yet to profile correctly within CLion yet, but it would be pretty neat to see memory consumption and whatnot shown on-the-fly, just like C# in Visual Studio has. If you know something about that, let me know :)

Regarding "intentional" DoS attacks, I'm not a security expert, but I suppose for those cases, you should try to block only the attacker and not only try to avoid the system to reach a saturated state. The only method I know it's banning an IP if it goes over a request rate limit. It would be fun to explore other mechanisms like Proof-of-Work, for example, Hashcash. What do you think @da2ce7?

Trust me, I've dealt with a shit ton of DDoS in the past when I was hosting and administrating a controversial IRC network, so I know what kinds of stuff could be thrown at it, including DNS reflection attacks (which is a bitch to deal with).

I'm sorry to hear that "a lot of the things are scattered in folders and files". That's probably my fault.

It was just a observation. If there was a logical reason for these scattering, and you could explain me the logic behind it, I could get used to it, but right now I like less scattered structures, like for example I had to get used to PHP Symfony structure, but now I know why it's like that and love it.

Maybe a more transactional architecture could improve the performance, but in my opinion, it is going to be harder to get those attributes. I'm willing to pay a 5% performance loss to get other things. As I see it, even if your application is really good at performance, at some point, you are going to reach the machine limit

I like the way you explain things, so it's pretty awesome to respond to these messages, hence why I also helped WarmBeer, awesome dude.
I never thought about the limitations of the machine, nothing is a limit, and I will mostly, at least, so far, find a way when people say to me it's impossible.
If OpenTracker in C can deal with this amount of data, then so should Rust or C# :)

The problem is we do not know which kind of users are interested in running the tracker.

I see it more like a general purpose server, it could be used for public trackers, private tracker, or semi private that also monitors user accounts bandwidth usage and such (which I'm working on for version 3.2.0).

If you are on Discord, you can add me through this handler, so we could have some more indept discussion if you like:
Power2All#1482

Cheers

josecelano · 2024-04-17T17:28:25Z

I'm trying to fix this problem in Axum, but I can't find a patch. I have not tried the solution proposed by @Power2All yet. I've been waiting to see if axum-server crate responds to my issue.

I thought it was solved after the migration to hyper 1.0, but I only see an option to set a timeout for sending the request headers (header_read_timeout).

https://github.com/josecelano/axum-server-timeout

I can also confirm that it works in ActixWeb. I've added three web frameworks in the example: Axum, ActixWeb and Rocket. It seems that it only works in ActixWeb.

Power2All · 2024-04-17T18:09:27Z

I'm trying to fix this problem in Axum, but I can't find a patch. I have not tried the solution proposed by @Power2All yet. I've been waiting to see if axum-server crate responds to my issue.

I thought it was solved after the migration to hyper 1.0, but I only see an option to set a timeout for sending the request headers (header_read_timeout).

https://github.com/josecelano/axum-server-timeout

I can also confirm that it works in ActixWeb. I've added three web frameworks in the example: Axum, ActixWeb and Rocket. It seems that it only works in ActixWeb.

That's why I moved away from Axum and went with Actix for quiet a while, with good progress ;)

josecelano · 2024-04-18T07:20:10Z

I'm you are using a proxy (Nginx or Apache) it seems you can mitigate this attack with the proxy configuration:

Slowloris DoS Attack

Power2All · 2024-04-18T07:24:54Z

I'm you are using a proxy (Nginx or Apache) it seems you can mitigate this attack with the proxy configuration:

Slowloris DoS Attack

* https://www.nginx.com/blog/mitigating-ddos-attacks-with-nginx-and-nginx-plus/

* https://blog.imkhoi.com/posts/2023/10/slowloris-ddos-and-how-to-mitigate-with-nginx/

* https://hexadix.com/slowloris-dos-attack-mitigation-nginx-web-server/

* https://www.acunetix.com/blog/articles/slow-http-dos-attacks-mitigate-apache-http-server/

Correct, but that's just a band-aid.

josecelano · 2024-04-18T15:35:23Z

I've updated the example repo with the patch. I had to update the patch because the Body trait has changed. It works partially because it closes the connection but it does not return a 408 Request Timeout response.

josecelano · 2024-05-15T15:05:31Z

Hi @Power2All, I've applied the patch (from @programatik29) you mentioned in the two Axum servers we are using:

However, It does not send a 408 Request Timeout to the clients like ActixWeb. But it seems not all servers do it:

https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/408

I'm going to close this, but feel free to re-open if you think we are still vulnerable to this attack.

josecelano · 2024-07-05T10:28:15Z

The patch does not work when TSL is enabled.

josecelano · 2024-07-05T13:14:02Z

The new hyer release 1.4.0 changes the http1_header_read_timeout.

We are already using this new version, but it seems it has not fixed the problem. When you run the tracker and connect to the API via telnet without making any request, the connection is not closed.

How to reproduce

Run the tracker. Make sure the API is enabled.

cargo run

Connect to the API:

telnet 127.0.0.1 1212

Hyper 1.4.0

Although it seems it's not working with Axum, I did a test using directly hyper, and it works.

You can use the echo example in the Hyper repo:

https://github.com/hyperium/hyper/blob/0eb1b6cf4d914ce9c3f8e92a8b43754eba27a327/examples/echo.rs

If you run the example with:

cargo run --features="full" --example echo

And then you run:

$ telnet localhost 3000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Connection closed by foreign host.

You need to set the timeout in the main function with .header_read_timeout(Duration::from_secs(5)) otherwise it would use the default value (30 seconds):

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let addr = SocketAddr::from(([127, 0, 0, 1], 3000));

    let listener = TcpListener::bind(addr).await?;
    println!("Listening on http://{}", addr);
    loop {
        let (stream, _) = listener.accept().await?;
        let io = TokioIo::new(stream);

        tokio::task::spawn(async move {
            if let Err(err) = http1::Builder::new()
                .timer(TokioTimer)
                .header_read_timeout(Duration::from_secs(5))
                .serve_connection(io, service_fn(echo))
                .await
            {
                println!("Error serving connection: {:?}", err);
            }
        });
    }
}

Power2All · 2024-07-08T11:37:17Z

Yah, I left Axum for a while, and Actix v4 is doing everything I need, stable and fast.
Just as a hint, my tracker deals now with 44.2 billion requests per month.
Completely free from mutex/rwlock, since they are too CPU intensive.

josecelano mentioned this issue May 16, 2023

Migration from actix-web to Axum torrust/torrust-index#156

Closed

12 tasks

josecelano added the Security Publicly Connected to Security label May 16, 2023

josecelano mentioned this issue Jun 15, 2023

API: Add timeouts for client requests to avoid DoS attack torrust/torrust-index#204

Open

da2ce7 added - Admin - Enjoyable to Install and Setup our Software Needs Feedback What dose the Community Think? Needs Research We Need to Know More About This labels Oct 10, 2023

cgbosse added this to Torrust Solution Jan 8, 2024

cgbosse moved this to BUG in Torrust Solution Jan 8, 2024

cgbosse moved this from BUG & Security to Help Wanted in Torrust Solution Jan 11, 2024

josecelano mentioned this issue Jan 12, 2024

Review missing timeouts #603

Open

9 tasks

cgbosse added this to the v3.0.0 milestone Jan 16, 2024

josecelano mentioned this issue Apr 17, 2024

How to set up a timeout for TLS hasdshake programatik29/axum-server#116

Open

josecelano mentioned this issue Apr 18, 2024

Mitigating Slowloris DoS Attack with Nginx torrust/torrust-compose#10

Open

josecelano self-assigned this May 15, 2024

josecelano closed this as completed May 15, 2024

github-project-automation bot moved this from Help Wanted to Done in Torrust Solution May 15, 2024

josecelano reopened this Jul 5, 2024

This was referenced Jul 5, 2024

Tracker API: Add a timeout to the TLS handshake, waiting for headers, and waiting for receiving headers #612

Open

HTTP Tracker: Add a timeout to the TLS handshake, waiting for headers, and waiting for receiving headers #613

Open

josecelano modified the milestones: v3.0.0, v3.1.0 Jul 10, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Denial of Service attack factor #324

Denial of Service attack factor #324

Power2All commented May 12, 2023

josecelano commented May 15, 2023 •

edited

Loading

Power2All commented May 15, 2023 •

edited

Loading

josecelano commented May 15, 2023 •

edited

Loading

Power2All commented May 15, 2023 •

edited

Loading

Power2All commented May 15, 2023

josecelano commented May 16, 2023

Power2All commented May 16, 2023 •

edited

Loading

josecelano commented Apr 17, 2024

Power2All commented Apr 17, 2024

josecelano commented Apr 18, 2024

Power2All commented Apr 18, 2024

Slowloris DoS Attack

josecelano commented Apr 18, 2024

josecelano commented May 15, 2024

josecelano commented Jul 5, 2024 •

edited

Loading

josecelano commented Jul 5, 2024

Power2All commented Jul 8, 2024

Denial of Service attack factor #324

Denial of Service attack factor #324

Comments

Power2All commented May 12, 2023

josecelano commented May 15, 2023 • edited Loading

Power2All commented May 15, 2023 • edited Loading

josecelano commented May 15, 2023 • edited Loading

Timeout

Rate limit

RwLock

Extra

Power2All commented May 15, 2023 • edited Loading

Power2All commented May 15, 2023

josecelano commented May 16, 2023

Clarifications for the previous comment

Deadlocks

Intentional vs unintentional denial-of-service

Architecture and performance

Power2All commented May 16, 2023 • edited Loading

josecelano commented Apr 17, 2024

Power2All commented Apr 17, 2024

josecelano commented Apr 18, 2024

Slowloris DoS Attack

Power2All commented Apr 18, 2024

Slowloris DoS Attack

josecelano commented Apr 18, 2024

josecelano commented May 15, 2024

josecelano commented Jul 5, 2024 • edited Loading

josecelano commented Jul 5, 2024

How to reproduce

Hyper 1.4.0

Power2All commented Jul 8, 2024

josecelano commented May 15, 2023 •

edited

Loading

Power2All commented May 15, 2023 •

edited

Loading

josecelano commented May 15, 2023 •

edited

Loading

Power2All commented May 15, 2023 •

edited

Loading

Power2All commented May 16, 2023 •

edited

Loading

josecelano commented Jul 5, 2024 •

edited

Loading