Add fq_codel network packet scheduler algorithm by default #2203
The fq_codel network scheduler is the de-facto standard nowadays in most distros. Systemd enables the scheduler by default if available. Make sure all boards have the necessary kernel module activated.
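A minimal sketch of how one might check a board's kernel configuration for the scheduler, assuming the usual `.config` text format and the kernel symbol `CONFIG_NET_SCH_FQ_CODEL`. The helper names are hypothetical and not part of this PR; `/proc/sys/net/core/default_qdisc` is the standard sysctl location on Linux.

```python
import re


def qdisc_config_state(config_text, symbol="CONFIG_NET_SCH_FQ_CODEL"):
    """Return 'y' (built in), 'm' (module) or None for a kernel config symbol."""
    pattern = re.compile(r"^%s=(y|m)\s*$" % re.escape(symbol), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


def read_default_qdisc(path="/proc/sys/net/core/default_qdisc"):
    """Return the running kernel's default qdisc name, or None if unreadable."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    # Illustrative config fragment, not taken from any real board defconfig.
    sample = "# CONFIG_NET_SCH_CAKE is not set\nCONFIG_NET_SCH_FQ_CODEL=y\n"
    print(qdisc_config_state(sample))
    print(read_default_qdisc())
```

Running the same check against each board's kernel config would show which ones still lack the option.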
These changes have worked for my build of the ova target. The PR looks good. |
standard in modern linux distros these days and should have better scheduling capabilities than the previous pfifo_fast scheduler. (cf. home-assistant/operating-system#2203)
It kinda makes sense to use sch_cake instead, as it is not limited to making individual connections fair. sch_cake can instead be instructed to first group traffic by destination host, sharing bandwidth fairly between the hosts, and then to group the connections of each host and make those fair among each other. This way, hosts that open multiple connections for a transfer are not given more bandwidth than hosts that open just a single connection. |
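The difference between the two fairness models can be illustrated with a toy calculation. This is only a simplified model of idealized bandwidth shares under full load, not sch_cake's actual algorithm, and the function names are made up for illustration.

```python
from collections import Counter


def per_flow_shares(flows):
    """Share of the link per host when every connection gets an equal slice.

    `flows` is a list of host names, one entry per active connection.
    """
    counts = Counter(flows)
    total = len(flows)
    return {host: n / total for host, n in counts.items()}


def per_host_shares(flows):
    """Share of the link per host when hosts are made fair first."""
    hosts = set(flows)
    return {host: 1 / len(hosts) for host in hosts}


if __name__ == "__main__":
    # Host "a" opens four connections, host "b" opens one.
    flows = ["a", "a", "a", "a", "b"]
    print(per_flow_shares(flows))
    print(per_host_shares(flows))
```

With per-flow fairness alone, host "a" ends up with four fifths of the link. With host grouping first, both hosts get half, and the four connections of "a" split that half among themselves.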
Not against changing the scheduler algorithm, but I'd prefer that we pick one that at least some major distribution uses by default, to have some real-world testing... 🤷♂️ It seems that even OpenWrt still defaults to |
fq_codel is enough for HA OS. sch_cake is likely to be far too heavy for smaller and slower systems. |
@agners wrote:
OpenWRT defaults follow a minimal-flash-usage approach, so no GUI for shaping by default.
OpenWRT's GUI module to help users configure shaping sets up
So in situations where fair queuing makes sense, OpenWRT uses
@agners wrote:
Distributions will probably never default to But without deactivating hardware offloading fair queuing can't work as expected – that's why |
@unclehack wrote:
I'm using sch_cake for shaping my internet connection on my TP-Link C2600, which has a Qualcomm Atheros IPQ8064@1.4 GHz. It runs DHCP, DNS and sch_cake. Load is sch_cake is really lightweight, there's hardly any difference to fq_codel and IME it's lighter than htb+fq_codel if you need shaping below the link-bandwidth. |
If you choose to use
Ethernet
I would recommend these settings: 'regional diffserv8 ethernet nat'.
Wifi
On Wifi cards using
Virtual interfaces
The virtual interfaces going to the add-ons do not need any kind of special handling, as the bandwidth is much higher than the hardware links on connections.
Btw: Diffserv/QoS
It makes sense to set a higher tag on traffic by the HTTP server serving the HA GUI, to get priority by |
thx for summoning me! A lot of the advice we've given out about cake is old - we developed it starting in 2014... on a 600MHz single core mips processor which could barely crack 100Mbits, shaped, and these days people are regularly pumping 20Gbit or more through 10k instances of it (see libreqos.io for a shining example).
The luci-app-sqm tool (and the linux-compatible sqm-scripts) are now very overcomplicated if you just want to run cake, but it's established, and thus that's what we use. I wish we'd expose the "dangerous options" now in openwrt at least - as they aren't dangerous anymore, and make the default always be nat, since that's the most common problem we see in the field.
fq_codel and cake are very lightweight when used at line rate, with BQL-enabled ethernet, which is nearly all ethernet cards today. I run cake on everything. Yes, the default gso-splitting mechanism becomes a throughput limitation at 10Gbit, but even with 2.5Gbit it's keeping up on things like the R6S, and I care more about low latency, all the time, than anything else. 42 packet GSO burps from IW10 enabled TCPs (nearly all of them) invoke a lot of jitter.
One of my sadnesses is we've never made cake multicore: using it with sch_mq (which is increasingly the top level default) means you get X instances of it, when just one cakemq (especially while shaping) would be more effective.
I have been trying to get folk to standardize on diffserv4, as that treats the diffserv bits as compatibly as all the semi-conflicting diffserv standards treat them. Notably it's the closest thing we have to how wifi maps things, as well as zoom's webrtc recommendations. I wish we'd not made the diffserv8 setting available at all, as the mechanisms have been deprecated since 2003 (tho still common). diffserv3 is still an ok default. Please don't use diffserv8.
No, you shouldn't use cake on the wifi unless it's bloated, and possibly not even then.
Multiple wifi chips today already have a native implementation of fq_codel in them (look for an aqm file in /sys/kernel/debug/iee*/phy*/aqm), and the principal intent of the codel algorithm was to dynamically adjust the buffering to the bandwidth, which varies a lot based on the distance from the AP. See the CDF plot here: https://blog.cerowrt.org/post/real_results/ or the paper here: https://www.cs.kau.se/tohojo/airtime-fairness/ - I urge everyone who thinks that "shaping wifi" is an answer to read that... (It's very frustrating that folk want to shape wifi, which, with movement of a millimeter or two, can have a 10x1 different bandwidth/buffering issue). Someday more of cake will move into the wifi drivers...
I wish we'd left nat on as the default in cake. Ideally it should "just figure it out", but leaving it on eliminates a major mistake for anyone with nat, in that it makes the per host/per flow fq "just work" for both ipv4 and ipv6. The overhead of having the nat option "on" on a non-natted machine is below 2%.
As for the regional setting vs the internet setting... I don't know. I think we should have had a "continental" setting, closer to about 70ms. I see folk using regional especially when shaping in front of vpns.
So in short, my default would be
Which will apply it on all ethernet interfaces, mq or not, at line rate, and you won't notice it's there on most hardware. Cake works exactly the same shaped, or unshaped.
I actually wouldn't mark your webserver traffic at all except for things that really needed low latency (voip, gaming), or should run in the background (the LE codepoint). See also, qosify. I wish we'd made this less complicated over time, but I hope this helps. |
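For the wifi check mentioned earlier in this comment (looking for an `aqm` file under debugfs), a small sketch might look like the following. It assumes debugfs is mounted at `/sys/kernel/debug` and readable, which normally requires root; the function name is hypothetical.

```python
import glob
import os


def find_aqm_files(debugfs_root="/sys/kernel/debug"):
    """Return aqm files matching <root>/iee*/phy*/aqm, sorted by path.

    An empty list means no match was found, or that debugfs is not
    mounted or not readable by the current user.
    """
    pattern = os.path.join(debugfs_root, "iee*", "phy*", "aqm")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    for path in find_aqm_files():
        print(path)
```

An empty result does not prove the driver lacks fq_codel support; it may only mean debugfs is unavailable.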
To clarify: cake or fq_codel at line rate do NOT turn off hardware offloading, so it is safe to use those at line rate, even though it tends to not be as effective, as I just tried to unpack, above.
GSO-splitting is a response to misguided (I'm opinionated) attempts by ethernet device makers to make single threaded iperf benchmarks look better, by bulking up packets into a GRO superpacket... and has nothing to do with other hardware offloads, so it's safe to have on all the time. In addition to doing better FQ, it also makes the "codel" portion of the aqm algorithm work better, as it was designed to work against single packets, not (up to) 42... I often wish we'd put gso-splitting into fq_codel also (I actually have a version that does that, but never submitted it)! Watching GSO "burp" the latency, especially below 300Mbits, is no fun...
Anyway, another big motivation for GSO/GRO/TSO has faded, in that it was also developed to compensate for routing table lookups in linux being so slow prior to linux 4.2 or so: a single GRO packet only needs one routing table lookup. By the time it hits cake, that routing table lookup is already done... Also, in the real world, on real traffic, GSO superpackets are rarely seen, and honestly if we could just rip out all that extra code doing that work, the devices would get faster in the first place... (not the case for TSO, where the ethernet card does the work), at least in the sub 2.5Gbit markets.
The sqm-scripts and luci-app-sqm attempt to turn hardware offloads off (and don't always succeed) when creating an instance of cake, shaped. |
Hey @dtaht, thanks for your recommendations!
Well, it kinda is: Webserver here ships the web interface once and then pushes updates in real time to it. The latency here is important to make it feel snappy. It should also get priority bandwidth wise over other things which may run in the background. As an example, things running here on my setup which do use upload bandwidth:
|
In general I prefer deprioritizing to prioritizing. I'd deprioritize the latter two using the LE codepoint. As for the vpn server, it depends on the vpn. Kernel wireguard and ipsec "do the right things" with cake or fq_codel in the loop. userspace vpn does not (which is why I keep seeing folk put yet another cake shaped instance in front of it, often with lower default rtt settings) |
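As a rough sketch of what deprioritizing with the LE codepoint could look like from an application, a socket can be marked with the Lower-Effort DSCP (binary 000001, per RFC 8622). This covers IPv4 sockets only, the helper names are hypothetical, and whether a given qdisc setup honours the mark depends on its diffserv configuration.

```python
import socket

LE_DSCP = 0b000001  # Lower-Effort per-hop behavior (RFC 8622)


def dscp_to_tos(dscp):
    """Convert a 6-bit DSCP value to the 8-bit TOS / traffic-class byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    return dscp << 2  # the low two bits of the byte are reserved for ECN


def mark_socket(sock, dscp=LE_DSCP):
    """Set the DSCP on an IPv4 socket and return the TOS byte read back."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS)


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        print(mark_socket(sock))
```

Background transfers such as backups are the kind of traffic this would apply to; interactive traffic is simply left unmarked.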
There are other tunings for "snappy". Notably TCP_NOTSENT_LOWAT, and depending on the structure of the application, tcp_bbr - only useful for longrunning +10sec flows. |
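A minimal sketch of setting TCP_NOTSENT_LOWAT on a socket, assuming Linux. The fallback constant 25 is the Linux value, used only if the Python build does not expose `socket.TCP_NOTSENT_LOWAT`, and the 16 KiB threshold is an arbitrary illustration, not a recommendation from this thread.

```python
import socket

# Use the constant from the socket module when present; 25 is the Linux value.
TCP_NOTSENT_LOWAT = getattr(socket, "TCP_NOTSENT_LOWAT", 25)


def set_notsent_lowat(sock, limit_bytes):
    """Limit the unsent data buffered in the kernel for this TCP socket.

    Returns the value read back from the kernel.
    """
    sock.setsockopt(socket.IPPROTO_TCP, TCP_NOTSENT_LOWAT, limit_bytes)
    return sock.getsockopt(socket.IPPROTO_TCP, TCP_NOTSENT_LOWAT)


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        print(set_notsent_lowat(sock, 16384))
```

With a low threshold the socket is reported writable only when little unsent data is queued, so an application that pushes real-time updates can decide late what to send next.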
Isn't this discussion slowly getting too academic? IMHO using fq_codel is perfectly fine and also perfectly in line with other linux distros, as @agners pointed out. Hunting for the last few percent of improvement might be a bit too much, especially considering the target OS/application: this is HomeAssistant, not a high-throughput or low-latency application that might justify such long discussions on the topic. |
fq_codel has been used in many environments, including IoT. Packets of the same flow can experience reordering caused by the 8 way associative hashing when using cake. This isn't something I'd want in an IoT oriented system. Many devices have really low amounts of RAM and very simple network stacks. There are other improvements which can be made to an IoT oriented Linux distribution such as HA OS. These include stability improvements, the implementation of relevant features, disk IO performance improvements, file system optimizations, memory usage improvements, reducing SSD/microSD wear and many others. fq_codel has already improved HA OS' network latency. Users who need extremely low latency for everything are probably not using the right OS. They probably want to set up the HA OS container image on their Linux distribution with the deeply customized networking configuration they desire. HA OS is meant to be used as it is by most users. |
Not true: "Packets of the same flow can experience reordering caused by the 8 way associative hashing when using cake." Doesn't happen with fq_codel either. I'm cool with y'all sticking with fq_codel, btw. Principal benefit to cake was in deprioritizing the flows I mentioned. |
sch_cake attempts to reduce hash collisions by using the 8 way associative hashing. This is good in general. There's a particular scenario which can cause reordering. The packets of a flow can hash to another Cake flow if a collision has led to an overwrite of the tag on the previous cake flow. Doesn't this lead to the reordering of packets, depending on the order in which the two Cake queues with packets of the same flow are serviced? Packets are still stored in the same order they arrived in the two Cake queues. |
It is incredibly hard to create that scenario. We actually check for it to some extent with the way_cols statistic. Flows collide rarely enough in the first place (at 10Gbit, I think you can have 400 full size packets outstanding, typically, and there are 1024 queues in the besteffort queue scheme). Checking the biggest libreqos.io installation we have (10k subs, a week's worth of data), only 63 had any way_cols at all (and that's the first thing that has to happen before out-of-order delivery could possibly happen); the biggest one had only 0.1% way_cols relative to packets.
You are right in that fq_codel won't ever have this behavior, but the odds of it happening even once to deliver out of order packets, in cake, are astronomical, and rapidly compensated for well within a few packet deliveries. Someone could write a pretty good paper on making this pathology happen, I think (stressing a ton of small packets, perhaps simulating 1k+ voip calls), or, say, 40Gbit of bandwidth flowing through a single instance of cake.
We are aware that both fq_codel and cake seem to need more queues at > 10Gbit, and that's usually the case, 64 hardware queues are common, and 64 instances (which is way, way, too much, IMHO). I kind of judge the "birthday problem" cake solves vis-a-vis fq_codel more important than the possible re-ordering problem. |
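To give a feel for the "birthday problem" mentioned above, the following sketch computes the chance that at least two flows land in the same queue when each flow is hashed uniformly and independently into one of 1024 queues. This is a plain direct-mapped model under those assumptions; it does not model cake's 8-way set-associative scheme, and the function name is made up for illustration.

```python
def collision_probability(flows, queues=1024):
    """Probability that at least two of `flows` flows share a queue.

    Assumes every flow hashes uniformly and independently into `queues`
    buckets (the classic birthday-problem setting).
    """
    if flows > queues:
        return 1.0  # pigeonhole: more flows than queues forces a collision
    p_all_distinct = 1.0
    for i in range(flows):
        p_all_distinct *= (queues - i) / queues
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    for n in (2, 10, 40, 100):
        print(n, round(collision_probability(n), 4))
```

In this simple model collisions become likely with only a few dozen concurrent flows, which is the problem a set-associative hash is meant to mitigate.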
The expectation was that with cakemq (which remains unwritten) at the cpe and isp head ends, the per-host/per flow fq of cake would take off, and the diffserv treatments for videoconferencing also. As for this application, don't know. I am happy y'all adopted fq_codel at least. More work on the underlying transports might help, I already mentioned TCP_NOTSENT_LOWAT |