This document went through the details of the implementation of a high-throughput HTTP server by creating a new highly-scalable TCP/IP network stack.
While this solution has managed to improve performances by more than 240% as compared to a solution using the Linux network stack, it failed to achieve the initially targeted goal of fulfilling four 10 Gbps links. Performances could be further improved by carrying out some optimisations that have not been put in place.
The development of the Rusty network stack took much more time than expected
(significantly more than one thousand hours). In particular, the complexity of
the TCP protocol was vastly undervalued. An efficient implementation of the
protocol could actually reach far more states than the eleven usually defined
(LISTEN, SYN-SENT, ESTABLISHED ...). As an example, it is specified that
the protocol must move, when the application layer asks to close a connection
(i.e. by calling the close()
function), from the ESTABLISHED state into the
FIN-WAIT-1 state while
sending a FIN segment. But what if the transmission queue is not empty ? In
this case, the FIN segment can not be delivered immediately and the state
machine is actually in a state between those two, as it acts as if it was in the
FIN-WAIT-1 state for the application layer, but still processes incoming
segments as if it was in the ESTABLISHED state. And what if a FIN segment is
received while in this not clearly defined state ? In this situation, the state
machine must once again move into another unspecified state.
To decision of designing a reusable and portable network stack, instead of a simpler stack directly integrated into the traffic generator, made the software more interesting (in my opinion), but also more challenging and complex.
In conclusion, while I am very pleased by the final outcome of this project, it took me much more time than what I would have ever plan to spend on a Master thesis. I am proud of the result, but never would I do that a second time.