An awful lot of home broadband connections suffer from bufferbloat and even for the ones that don't, a single host can easily hog all the bandwidth.
If you're used to getting lag in your VoIP or gaming when a housemate starts a stream/download/torrent, this can be fixed :)
The cake traffic shaper in OpenWRT is amazing for fighting bufferbloat in your home network, and it can also do almost perfect fairness in dividing the available bandwidth per LAN host with very little configuration. Just get it as part of the SQM tools in OpenWRT and enable it. For per-host fairness, take a look at the "Make cake sing and dance" section at this link: https://openwrt.org/docs/guide-user/network/traffic-shaping/...
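For reference, the per-host fairness that wiki page describes comes down to a couple of cake keywords. A minimal sketch, assuming an eth0 WAN interface and an 18 Mbit uplink (both placeholders):

```shell
# cake on the WAN (egress) side with per-host fairness.
# Use your own WAN interface and ~90% of your measured upload rate.
tc qdisc replace dev eth0 root cake bandwidth 18mbit nat dual-srchost
# "nat" makes cake look up the real LAN addresses behind NAT;
# "dual-srchost" gives each LAN host a fair share of upload while
# keeping per-flow fairness within each host. On the ingress/ifb
# side you would use "dual-dsthost" instead.
tc -s qdisc show dev eth0   # verify the qdisc is installed
```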
Parents live in a smaller town with two awful ISP selections. They had a bunch of WiFi devices on an ISP router and the connection quality and latency was just terrible when more than one device was in use and any bandwidth intensive services were being used. (Low quality Netflix is intensive on small-town monopoly internet.)
I purchased them a Netgear R7800 and installed hnyman's LEDE build [1] to enable SQM. Night and day difference in latency response. No more staring at a white screen for 3 seconds per URL click.
The build has been stable for several months. I wouldn't recommend this for non-technical users or anyone not willing to spend time troubleshooting, but it has been a great improvement. I couldn't find any other device capable of doing this without running x86 hardware or something else silly.
A few other people mention it, but yes, this is only going to work on slower connections on current SOHO hardware. I think the R7800 can do software SQM at up to 150Mbps or so. Plus, if you have a gigabit symmetric connection, hopefully you aren't having bufferbloat issues.
Just wish a popular manufacturer would release an easy-to-use router with SQM so I could install it for non-technical users and forget. Ubiquiti is somewhat close to that, but I believe their prosumer hardware (USG) is running a slow processor at the moment and doesn't even support SQM without installing custom kernels.
>I couldn't find any other device capable of doing this without running x86 hardware or something else silly.
Was the internet speed so high that you couldn't use a normal supported router like a TP-Link Archer C7 at half the cost? I need to do more testing but it seems my C7 can handle my 100/100Mbps fiber connection doing SQM without too much issue.
I haven't seen a router under $400 that can do fair queueing in hardware faster than ~200 megabits. If you have fiber it's cheaper to set up a beefy x86 box and run pfSense on it. Hardware offload is usually disabled when you turn on QoS, so doing so will often slow down gigabit LAN links as well.
This seems like a potential sweet spot for the Espressobin [1] with pfSense, but the pfSense folks have not released their ARM version, only demonstrated it [2]; perhaps they're just too dang busy, or perhaps it would cut into the margins of their x86 solutions. Regardless, it would make a nice appliance if they ever do release a pfSense ARM image.
PFSense is great if you're okay with pulling out a monitor and keyboard every time there is a config issue or interface change. Do not bring in any interfaces over USB if you like to preserve your sanity and want to use PFSense.
These days I just run OpenWRT on x86, no more will my router sit in a broken state that I can't fix by logging in over the LAN or WAN (via OpenVPN ofc). Wish PFSense would get sane defaults in this regard!
I'm at 100/100 though, so wouldn't something simpler be enough? The wired side of the router is gigabit but that's just an integrated gigabit switch, it doesn't even touch any CPU. I'll be doing more testing to make sure but my ISP doesn't seem to have too bad a buffer bloat anyway.
My understanding of these routers is that the gigabit switch is independent from the router. They're physically on the same board but the router is just another machine on the switch. If the switch table says portA->portB it doesn't matter what the router on portC has decided to offload or not.
Edit: Maybe you mean Wifi to wired may have a disabled offload? That path does go through the router and not directly through the switch. For bigger installations I end up having one of these with wifi disabled as the router (firewall, dhcp, etc) and individual ones connected through ethernet as dumb access points (same SSID on all and straight bridge from Wifi to Ethernet). That should also avoid any issues and is a good setup to get more wifi coverage with a simple config.
Good to know, would explain why there's a phantom eth port on some of these routers, must be used to connect between router and switch chips. Sounds like you're right about wifi->wired transit though, if this is the case.
The typical architecture for routers these days is that the main SoC has two ethernet interfaces, each of which is connected to a 7+ port managed switch. One of the host CPU's interfaces is on the WAN VLAN, and the other is on the LAN VLAN. Some older routers used to have just one ethernet link between the switch and the CPU, with the CPU's other interface exposed directly as the WAN port. That made it easier to avoid bloat or bugs in the ethernet switch itself, but was fundamentally incompatible with the NAT offload those switches provide, so that configuration is now almost impossible to find.
This also makes these little routers extremely powerful. Since those switches have VLANs as well you can create very interesting topologies that would require much more expensive managed switches to achieve. I run an extra VLAN from my router, to one of my APs, to a dedicated wifi SSID to another wireless router to ethernet to a TV box so I can have the TV signal in a place I can't run ethernet to. Doing it through the normal Wifi would be a bad idea because the provider uses multicast IPTV and if you put that on your wifi every connected devices receives it. And this is all done with 3 50€ routers running LEDE, each with different VLANs and wifi SSIDs configured. They make for a really flexible setup.
Just read that the R7800 had the best range for an all-in-one unit. Not sure if it's true, but it has been an amazing router. I picked one up for myself -- they are $130 refurbished on Amazon every now and then.
To answer your question: I have no idea. Would be neat if a much cheaper model had the horsepower though.
Sounds like a good recommendation. The C7 has been my go-to for cheap, good wifi, and solid LEDE support. But I haven't stress tested it to check how it will take a very congested network. My uses have had fairly light users.
The Archer C7 seems to do about 400Mbps with no configuration/optimization when running OpenWRT, plus with them being available for $20 to $30 on Craigslist and its knockoffs (OfferUp & Letgo), it's easy to nab one for cheap.
I hear hardware offload is possible, but I have yet to try a build that has the patches for it.
With Gigabit fiber internet I can see needing something more. I find 100Mb internet to be enough for my needs and the Wifi performance to be adequate to the NAS on the LAN. So I even prefer that there's no offload to hardware and that it's the well tested Linux kernel code doing the heavy lifting.
Ran the tests and apparently the C7 is perfectly capable of doing 100Mb/s with cake. But as it turns out I don't really need it. I already get an A for bufferbloat with my provider without it and turning it on doesn't get me to A+.
I've found that I only need to shape upload and I get almost all the bufferbloat benefits, while reducing CPU requirements because download is not shaped. Thus, a $15 router with a slow CPU can be fine for fixing bufferbloat.
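Egress-only shaping really is a one-liner. A sketch, assuming an eth0 WAN port and a 5 Mbit uplink (both placeholders):

```shell
# Shape only the upload direction; slow CPUs can usually manage
# this even when they can't shape a much faster download.
# Set the rate to ~90% of your measured upload speed.
tc qdisc replace dev eth0 root cake bandwidth 4500kbit
# Or, on builds without cake, classic HTB + fq_codel:
# tc qdisc add dev eth0 root handle 1: htb default 10
# tc class add dev eth0 parent 1: classid 1:10 htb rate 4500kbit
# tc qdisc add dev eth0 parent 1:10 fq_codel
```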
Thanks for the Edgerouter link, saved me doing some Googling. Off to go and apply this now and see what difference this makes to the bufferbloat tests.
As Arie mentions, this is a little more involved on the EdgeOS stuff, but doesn't look too complex for those that are used to a CLI or two.
You get integration with the Unifi UI and a much easier configuration experience, you lose only because what you've got is (probably) overkill for residential use. If you've got more than one AP then you start to win again because you can power all your APs from the switch.
If you were happy with your setup before hearing about the EdgeRouter, I hope you'll still be happy with it now :).
It looks like I'll be fine with the US-8-60W and the USG until I get an Internet (downstream) speed of > 100 megabits, which, in Australia, is not likely to happen for a long time.
I'd hope I'd have upgraded my LAN to at least 2.5 gigabit by then, anyway.
There's one downside - if you've got a fast connection, anything less than a top-of-the-range router with dual or quad core highly clocked CPUs is going to struggle to shape that much traffic.
Great links. I've been using cake for a while but without the advanced options.
One random question: Does it ever make sense to add SQM to a tap_soft interface? I have two locations and both have SQM set up to minimize bufferbloat, but when I VPN from one to the other there is some bufferbloat on the VPN connection.
If the VPN terminates on the routers... fq_codel now (I cannot remember the kernel version, sorry) can preserve the inner hash of the VPN traffic and manage the flows before they hit the tunnel. This is mostly an IPsec, not OpenVPN, sort of thing.
Recently I ranted at the ISPs and at the network neutrality people. fq_codel (RFC8290) is now nearly ubiquitous as the default queuing mechanism in most Linux distributions, and it is long past time more ISPs supplied it in the gear they give customers.
The problem is that ISP-provided kit is built to a price, so it will tend to have a more anemic processor. Shaping high bandwidth needs CPU cycles, or better yet, hardware offloading.
Funnily enough, I've just got a firmware applied to my cable connection with Virgin Media in the UK that mitigates the Puma 6 issue on their provided DOCSIS 3 modem - it's eliminated buffer bloat as well, which is a nice side effect.
People are listening, Docsis 3.1 includes active queue management for instance, which goes a long way to preventing the issue.
Note that the firmware update doesn't fix the issue (which is Intel failed to put much cache on their chips to save $$), but merely helps to lessen the impact of Intel's shoddy hardware engineering. I would still replace your modem with a non-Intel modem ASAP if you want good performance.
I enabled sch_cake on my router (an R7800 with Openwrt 18.06 on it) recently. Worked like a charm. Would happily recommend.
I imagine the main thing that's holding back ISPs from using it on their consumer routers is the (minimal) configuration the user has to do to set to upload bandwidth limit. If ISP routers were also using decent queuing algorithms and not buffering like crazy then that wouldn't be necessary of course, but it doesn't seem like that's going to change any time soon.
Well, many isp-born devices are provided with that info by the ISP at connect time, so the hope has always been that during configuration they'd just pass (for example) "bandwidth 10mbit docsis" to the network setup routine. DSL modems also typically get this info at startup.
Neither HTB nor SFQ includes a CoDel-like AQM component. HTB+SFQ will get you fair sharing of bandwidth shaped at the rate of your choosing, but does nothing to control queue lengths.
For directly managing bufferbloat, I never understood this to be a requirement. Either technique is meant to create back-pressure on the TCP stack so it can more effectively manage its window, and both do that perfectly well.
My understanding is that you would want AQM in order to keep your bandwidth utilization closer to the actual wire speed than a simple priority queue would.
So, we've had the tools to deal with bufferbloat for a long time, it's just that this mode now just provides slightly better peak performance.
> For directly managing bufferbloat, I never understood this to be a requirement. Either technique is meant to create back-pressure on the TCP stack so it can more effectively manage its window, and both do that perfectly well.
If you have a traffic shaper being fed by a deep, dumb queue, you aren't giving useful back-pressure to TCP until it's too late. You need an AQM that gives either ECN marks or packet drops soon enough that you don't build up or sustain deep queues of packets. That's the core of what bufferbloat is. The queue(s) in front of HTB in your router might not be as stupidly oversized as the queue in your cable or DSL modem, but they're still susceptible to the same problems when there's no AQM component. Some TCPs do a decent job of backing off when latency climbs, before the buffers actually fill and start causing packet drops. But AQM in the router can more directly observe and act on congestion, and works with older TCPs and non-TCP traffic.
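Concretely, that early back-pressure is what fq_codel's knobs control. A sketch with the defaults spelled out (eth0 is a placeholder):

```shell
# fq_codel starts ECN-marking or dropping once the standing queue
# delay stays above "target" for longer than "interval" -- the
# early signal that keeps TCP from building a deep queue.
tc qdisc replace dev eth0 root fq_codel limit 1024 target 5ms interval 100ms ecn
tc -s qdisc show dev eth0   # drop/mark counters show the AQM working
```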
I am immensely cheered up by the progress reports contributed by so many on this thread. But, can I ask that if you grok it, go fix it for two friends? Go fix it for a local small business, a coffee shop, or a hotel. And ask your ISP to fix it in their default gear?
There's probably over 2b routers without bufferbloat fixes installed, and only if we work together to get them deployed will the internet as a whole get better, more capable of handling web, games, videoconferencing, and other applications that demand consistent low latency.
Thanks for everything that you do, Dave. Your bufferbloat and WiFi work is excellent, and I'm excited for the airtime fairness work with Toke to make it into more 802.11 drivers than ath9k. Cheers.
I was essentially in an overwork-induced coma for the last 18 months. The work was carried forward by (many, many) others.
I was happy to wake up a few weeks back, and find RFC8290 published, and sch_cake being readied for mainline, and multiple commercial products finally shipping what we'd worked on all these years.
Notice the improvement in bufferbloat score from D to A+.
Subjectively, I have noticed the connection seems more responsive and there no longer seem to be latency spikes when utilising all the upload bandwidth.
It’s certainly worth considering bufferbloat, if you suffer from latency spikes when using all your upload bandwidth (I used to suffer from this a lot more, when I had an ADSL connection, with only 1 megabit upload).
I love the ER-X because it's cheap, generally available and a fine router with SQM. Just keep in mind that with smart queue enabled it will top out at about 150-170Mbit.
If you're on a faster connection the Edgerouter ER4 with its faster quad core CPU should be able to handle up to about 300-350Mbit.
When you need to shape even more bandwidth, routers based on the Marvell Armada XP chipset do very well. I flashed a Linksys WRT1900ACS with OpenWRT and was able to shape about 600-750Mbps before it ran out of horsepower.
I love my ER-X, but once I got my gigabit connection it can't quite keep up compared to plugging directly into the modem. It's close enough that I'm not looking to replace it, but the next time I need a router I may go the NUC build your own route.
How does one go about building a router from a NUC? Don't you need two NICs for a router? (One for the modem, one to connect a switch for the LAN.) Or have enough people given up on wired networking that they just build routers where the entire LAN is on WiFi?
If you have a VLAN capable switch and appropriate network card with the right drivers, then you can have all the logical ports you could possibly want.
I have done this at my parents place with a cheap Intel Celeron based computer with a single network port, but 2 logical networks with vlanning. One is for their personal network and the other is for their guest suite they offer through Airbnb.
It’s only 20MB/s fibre, so not much CPU power needed. The no-name computer cost less than $250, came with a 60GB SSD, and I put pfSense on it.
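For anyone curious, the single-NIC, two-network trick on Linux looks roughly like this (interface names, VLAN IDs, and subnets are made up for illustration; the switch ports must be tagged to match):

```shell
# Router-on-a-stick: one physical port, two tagged VLAN subinterfaces.
ip link add link eth0 name eth0.10 type vlan id 10   # personal LAN
ip link add link eth0 name eth0.20 type vlan id 20   # guest suite
ip addr add 192.168.10.1/24 dev eth0.10
ip addr add 192.168.20.1/24 dev eth0.20
ip link set eth0.10 up
ip link set eth0.20 up
# Firewall rules then keep the guest VLAN isolated from the LAN.
```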
Note the comment "the actual rate limits will be set to 95% of the specified value". That explains why I saw a 5-10% dropoff in throughput when I enabled it with honest numbers.
Note that if you have gigabit or faster it's usually cheaper to build a router box with a really fast CPU. Most/all reasonably priced network appliances can't do FQ in hardware offload and their CPUs are pretty weak. I never managed to get over 200Mbps on any network box besides a super beefy PC.
I'm currently behind a Linux router with basically nothing but firewall rules configured; I still get an A on that test (meaningful?). I guess this sort of thing is just the default nowadays?
What router? what ISP? What link technology? What bandwidth? A pointer to your dslreports result? There are plenty of small ISPs that have adopted this stuff... and a few router makers.
I am always happy to hear of a bloat free connection.
You're one of the lucky ones with an ISP with properly configured buffers.
A+ is the goal on that test though, which basically means no extra latency under load.
In some cases, your connection speed may be fast enough that your router's CPU can't keep pace when doing traffic shaping. But if your router is powerful enough, then properly configured SQM will not result in any meaningful reduction in throughput (and can lead to better real-world throughput, by allowing congestion control to work properly). The more you know about the properties of your WAN connection, the more accurately you can configure SQM to account for the true limits of your connection. If you have an ADSL connection, then it helps to tell SQM to take into account ATM framing overhead, for example.
If your network is all Linux hosts, configure BBR TCP congestion control and you won't ever have to worry about bufferbloat again (unless you use a lot of UDP). There's a ton of research in this area, but to summarize, there are two main ways of controlling outgoing packet rate: measuring RTT (round-trip time) or measuring packet loss.
Unfortunately, all the early (and still common) TCP congestion control algorithms control their send rate by measuring packet drops: sending faster and faster until upstream routers somewhere start dropping traffic. This has the effect of completely filling the outbound buffers of the slowest link in the chain.
To combat this, most bufferbloat-fighting algorithms focus on dropping traffic before buffers fill, namely RED (random early detection) and CoDel.
Newer algorithms like TCP Vegas and BBR measure RTT and lower transmit rate when they detect buffers down the line filling, preventing bloat.
In most cases, you still need a router configured to prevent bufferbloat because even a single naughty protocol on your network can fill outgoing buffers.
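For the record, enabling BBR on a Linux host (kernel 4.9+) is just a couple of sysctls. A sketch:

```shell
# Load the BBR congestion control module (Linux 4.9 or newer).
modprobe tcp_bbr
# BBR is typically paired with the fq qdisc, which provides the
# packet pacing it relies on (built into BBR on newer kernels).
sysctl -w net.core.default_qdisc=fq
sysctl -w net.ipv4.tcp_congestion_control=bbr
sysctl net.ipv4.tcp_congestion_control   # verify
```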
The most important thing to know about controlling bufferbloat with QoS is that you MUST have an accurate estimate of your max upload/download rate. This is because you can only control bloat if you are in control of the slowest link, where it will build first. All effective bufferbloat solutions rely on artificially making your router the slowest link, usually by limiting bandwidth through it to ~90% of actual.
Once you have control of the slowest link you can pick which packets get dropped as the buffers fill. And using something like fq_codel you can assign equal bandwidth to all IPs on the network. The nice thing about controlling bandwidth this way vs hard speed limits per user is that it allows users to use as much bandwidth as they want, until staying below line rate requires sharing.
Also notice I keep saying outbound. You actually have far less control of bufferbloat on the inbound end. The best you can hope for is that dropping inbound packets coming in over the configured rate will make the sending server back off as quickly, but this isn't always the case. Luckily, in my experience, the vast majority of "lag" is due to outbound buffers filling, not inbound. So setting up bufferbloat-fighting bandwidth sharing is usually extremely effective.
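For completeness, inbound "shaping" is done by redirecting ingress traffic through an IFB pseudo-device and shaping that. A sketch (interface and rate are placeholders):

```shell
# We can't queue packets that have already arrived, but dropping or
# ECN-marking above ~90% of line rate moves the bottleneck queue
# into the router, where the AQM can manage it.
ip link add ifb0 type ifb
ip link set ifb0 up
tc qdisc add dev eth0 handle ffff: ingress
tc filter add dev eth0 parent ffff: protocol all matchall \
    action mirred egress redirect dev ifb0
tc qdisc add dev ifb0 root cake bandwidth 90mbit besteffort
```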
1) BBR is currently not something I'd recommend at home.
2) The hope has always been that the core two bufferbloat-fighting algorithms (BQL, and fq_codel) would end up in the cable, fiber or dsl modem hardware, so that no shaping would be required, as there would be sufficient backpressure from the link itself to regulate the link intelligently. The cpu costs on this are nearly 0! BQL is 6 lines of new code in the device driver. fq_codel has been shipping in linux for 6 years. It's just a matter of turning it on...
But: lacking that support from the ISP-supplied gear, we shape with htb + fq_codel, (as you say), to ~90% of the link rate... with another box - or even in the same box if the device driver can't be fixed and is overbuffered. We are painfully aware of how much cpu shaping costs but modern cpus usually have enough oomph to handle it.
btw: We've come up with a new deficit shaper (sch_cake) that lets us get to ~100% of the isp bandwidth (so long as you get the wire framing exactly right), while providing vastly better queue management in the sqm system.
4) fq_codel is fair to flows, not devices. This works well in the general case, but has edge cases where abusive apps that open a lot of flows gain priority. Adding per host fq (while retaining per flow fq), even through nat, was the number 1 request from the users for sch_cake, and one of the main reasons why cake exists.
You can configure fq_codel to match on different parts of the packet. The default matching "tuple" for flows includes source/dest port along with source/dest IP. If you set this to source IP only it should fairly distribute bandwidth based on LAN IP alone. If cake is a normal qdisc you might be able to enable flow-per-IP without adding any code. For an example with fq_codel see:
$TC filter add dev $IF_WAN parent 11: handle 11 protocol all flow hash keys nfct-src divisor 1024
“SQM” is shorthand for an integrated network system that performs better per-packet/per flow network scheduling, active queue length management (AQM), traffic shaping/rate limiting, and QoS (prioritization).
“Classic” QoS does prioritization only.
“Classic” AQM manages queue lengths only.
“Classic” packet scheduling does some form of fair queuing only.
“Classic” traffic shaping and policing sets hard limits on queue lengths and transfer rates.
“Classic” rate limiting sets hard limits on network speeds.
It has become apparent that in order to ensure a good internet experience all of these techniques need to be combined and used as an integrated whole, and also represented as such to end-users.
Isn't QoS a more general qualitative, perceived phenomenon? SQM sounds like it relates to something you'd do to improve QoS but then only at one link in the chain, at one layer.
I wish! QoS could have been a good term to keep using if existing deployments of it on the Internet hadn't hopelessly mapped it to mere packet prioritization (diffserv), which doesn't actually work on today's internet.
QoE is a better, less overloaded term for the "qualitative, perceived" sort of description.
There's been a plethora of other trade names for what we do with htb+fq_codel: streamboost, adaptive QoS, etc.
I like that eero, edgerouter, and openwrt and derivatives also call what we do sqm. It simplifies the discussion, and the core scripts for linux generically are available as the sqm-scripts on github.
> deployments of it on the Internet hadn't hopelessly mapped it to mere packet prioritization (diffserv)
Sorry yeah, I forgot about how the term has been mangled over the years. I was coming from a 3GPP perspective where the term is specifically defined as experience [0]
In particular:
* only the QoS perceived by the end-user matters;
* QoS definitions have to be future-proof;
* QoS has to be provided end-to-end;
* QoS attributes (or mappings of them) should not be restricted to one or a few external QoS control mechanisms.
Not the first time the ivory tower of Telecomms has been out of touch with the outside world ...
I cured bufferbloat with a cheap DD-WRT-capable router (NOT a Puma 6 chipset). I found it important to limit the WAN bandwidth on my router to about 15% below the max speed I’m paying for. What a difference!
Netgear R6300V2
Using fq_codel because the others use more CPU than I’m confident these devices can handle.
I find that most people suffer from bufferbloat unknowingly, having bought fancy routers 5-10 years ago.
I have a home internet connection that is a resold AT&T U-Verse connection. I doubt any of these fixes are available to me -- the extremely user-hostile equipment provides essentially no user-configurable options, and I have been told it has no switched mode, not even a secret one. So there's no way I can introduce my own router hardware unless I want to be double-NATted.
(Also, it has a broken caching DNS server, and forces that broken server into its DHCP responses; the server it returns is not user-configurable.)
>Twiddling with QoS might help, but a faster internet connection probably won’t help at all.
The key issue here is contention, i.e. a busy egress interface; that's when buffering occurs. I do not understand why adding capacity, and thereby reducing the probability of a busy egress interface, "won't help at all".
totally untrue. fq_codel manages normal torrents just fine in the presence of gamer-style traffic. We tested against torrents in fixing bufferbloat a lot! Pure aqm systems like pie (in docsis 3.1) do pretty well also. cake (per host fq) can make even insane amounts of torrenting (or slashdotting!) bearable.
I think you misunderstood the above poster who was talking about the case without SQM (or I'm misunderstanding things because I've never heard any of these terms before now).
Anyone trying to work out if MikroTik has some equivalent of this SQM: looks like no. Well, not explicitly fq_codel or any of its extensions. Though using the SFQ queue type with traffic shaping seems to give a similar improvement to bufferbloat.
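A sketch of what that looks like in RouterOS (syntax from memory and may vary by version; the target subnet and the 90M figure, roughly 90% of line rate, are placeholders):

```shell
# RouterOS: SFQ queue types attached to a simple queue that shapes
# slightly below line rate, so the queue (and SFQ's fairness) lives
# in the router rather than in the modem's dumb buffer.
/queue type add name=sfq-up kind=sfq
/queue type add name=sfq-down kind=sfq
/queue simple add name=wan-shaper target=192.168.88.0/24 \
    max-limit=90M/90M queue=sfq-up/sfq-down
```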
It really is amazing the difference that modern AQMs like FQ-CoDel make. This is a bit of self promotion but we leverage FQ-CoDel in our product (https://www.preseem.com) to help ISPs provide a better experience to their subscribers. Our customers regularly pass along anecdotes from subscribers who are very happy that they can now do big downloads or heavy streaming without breaking interactive applications like gaming.
News to me. dslreports reported a C for bufferbloat. My router (eero) has a "labs" section that had SQM disabled. After enabling, dslreports is reporting an A.
DOCSIS 3.1 devices mandate pie, which helps a lot, but it's not as good as fq_codel, nor do they do shaping from the isp, which is kind of needed for all the cable links in the USA I've tried. Get one if you can, though, they are better across the board in many other ways.
pfsense has fq_codel.
Anything (1000s of routers) from lede/openwrt has the most advanced bufferbloat-fighting stuff in it, followed by dd-wrt, tomato, etc. If you need high bandwidths the multi-core arms are the best. fq_codel is pre-configured on all links (ethernet/usb/fiber/wifi/whatever) but if you need shaping to the ISP provided rate you need to configure it. All the research that went into fixing bufferbloat queuing problems everywhere landed in openwrt and lede first. Most of the research that improved tcp everywhere came out of google.
Most gaming routers now sold commercially have some variant of fq_codel in them in their trade name ISP "qos" system.
Also fq_codel derived anti-bufferbloat work has landed in many commercial wifi routers on the wifi side (eero, google wifi, some ubnt products, meraki, many others). The paper behind all that was: https://arxiv.org/pdf/1703.00064.pdf - happily that work was "good enough" to enable by default, and boy, does it make a difference if wifi is your bottleneck.
The current premier dsl router with cake is evenroute. I think there are several new models from several manufacturers that are going to get it right, soon.
Not tracking FIOS (gpon fiber) closely at the moment. Yes, fiber networks have bufferbloat, but it's harder to hit, and generally smaller than on dsl and cable technologies. I configured cake on sonic fiber recently and got 60ms back. (going from 60ms latency under load to 3ms )
Regrettably shaper setup is finicky and requires a few minutes of testing with a site like dslreports or a tool like flent.org to get right. If more ISPs published their shapers' bitrate and burst rate settings, life would be easier here... but the hope has always been they'd just ship a router with this stuff on and remotely configured to be "right".
In some ways, the question is: what are you doing about bufferbloat?
Let's see. The turris omnia is a very good router (but only available in europe). For oomph (gbit shaping) people often leverage lede on a pcengines apu2 or run a full distro of pfsense or linux on it.
In my experience, changing the default DNS servers, enabling sch_cake, and minimizing shared spectrum interference are the most significant improvements for a home WiFi connection. Can anyone think of an additional dimension for improvement, besides upgrading the link itself?
> Can anyone think of an additional dimension for improvement, besides upgrading the link itself?
Depending on the hardware, you can upgrade the link itself with newer, smarter WiFi drivers. After pretty much solving the bufferbloat problem for wired connections, many of the same developers moved on to fixing WiFi, and some fruits of that effort are currently available in OpenWRT and LEDE.
Anyone have any suggestions for pfSense? I played around with the traffic shaper, setting the scheduler type to CODELQ and limiting bandwidth to 95%, but it doesn't seem to do much from what I can tell while testing with the speedtest on dslreports.
Wow the manual test is very convincing: running fast.com's speedtest while pinging google makes the time increase from <30ms to over 1000ms! It's taking 1.5 seconds to ping google? I had no idea this could be happening.
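Reproducing that manual test from any machine is easy. A sketch (the download URL is a placeholder; any large file will do):

```shell
# Terminal 1: saturate the link with a large download.
curl -o /dev/null https://example.com/large-file.bin
# Terminal 2: watch round-trip times as the buffers fill.
ping 8.8.8.8
# On a bloated link the ping times jump from tens of milliseconds
# to hundreds or thousands while the transfer runs, then recover.
```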
Bufferbloat happens on high speed links like those but amount of bloat you see is in the 30ms - 60ms range (vs seconds(!!) on home links). Bloat happens mostly (aside from microbursts) on overloaded links - and high speed backbones are typically overprovisioned so the problem only shows up when there's an outage or fiber cut. Example of what happens today on a fiber cut: http://blog.cerowrt.org/post/bufferbloat_on_the_backbone/
IF things like fq_codel were deployed on those we'd not see latencies climb that much at all, we'd see bandwidths decline to the actual capacity available - and only the biggest flows would be hit to do so.
fq_codel is lightweight enough to fit directly in high-speed hardware, and it does indeed run on 40GigE-plus devices in software on Linux, DPDK, and BSD. But it takes a long time for new chipsets to adopt new algorithms even if they incorporate support for deeply desirable features like ECN.
That said... 10Gbit to the home.... ooohhhhh. it's really hard to bloat that!
Priority is the wrong way to think about it. Given all the sources of bursts on the internet today, fair queuing (or "flow queuing") has become the way to turn flows back into packets.
There's an awful lot of literature on FQ; what we do with fq_codel is not only interleave packets better but also apply congestion control signals at the right time, so competing TCP flows don't overwhelm the link (with under 10ms of buffering, vs. the seconds common on FIFO ISP links).
Of course, being perfectly fair to flows is sometimes undesirable, but making something strictly higher priority[1] is fraught with peril as you end up with a classification nightmare.
Having fq gives you the best shot at smaller flows completing sooner, and of big flows sharing better with each other.
Having vastly reduced buffering improves the responsiveness of competing TCP flows a lot, grabbing bandwidth whenever available, faster.
My take on folks who want "prioritization" is to ask them to try some variant of SQM with just FQ and CoDel and get back to us. Being fair with well-managed buffers works really well.[2]
[1] making something lower priority than best effort is actually a good idea.
[2] but if you really want some flows or devices prioritized, see the sch_cake work mentioned on this thread. I still tend to think per host FQ is what many want rather than attempting to raise the priority of certain flows from certain services.
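For reference, per-host FQ in sch_cake is a single keyword. A sketch with example numbers; eth0 and 50mbit stand in for your WAN interface and roughly 90% of your uplink rate:

```shell
# Shape egress slightly below the link rate so the queue lives here,
# then split bandwidth fairly per source host, and per flow within a host.
tc qdisc replace dev eth0 root cake bandwidth 50mbit dual-srchost
# For ingress (download) fairness the same qdisc goes on an IFB device
# with dual-dsthost instead; the OpenWRT SQM scripts set this up for you.
```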
I have been away from this field for some time. Cake seems to solve the fairness problem between machines, but not the problem of the same machine having streams with different latency/bandwidth requirements. The priority way of solving this is round-robin on packet priority plus a priority limit per machine/user (a separate queue per discretized priority level, meaning no bufferbloat for high-priority traffic). The primary issue with this solution is that it requires the packets to be labeled.
yep. solved - for 6 years in the sqm-scripts and now in cake.
(not solved, in docsis-pie)
We use diffserv for this, for apps willing to use it.
Example: ssh sets the imm diffserv marking for interactive use. cake respects that (I've cited the relevant paper elsewhere; another place is https://www.bufferbloat.net/projects/codel/wiki/CakeTechnica... but after extensive testing we settled on three, rather than four, tiers of priority)
Stuff derived from the sqm-scripts uses the same method (HTB + fq_codel), but the problem has always been that diffserv is not respected end to end. However, within your own network, you can make your intentions known and have them honored, provided you control the bottleneck.
Also, we have always made the latency/bandwidth tradeoff explicit: if you want less latency, you must accept less bandwidth. It's the only safe answer to apps gaming the diffserv markings.
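To illustrate both halves of this, here is a sketch of cake's three-tier diffserv mode plus an app-side marking; eth0 and the bandwidth figure are example values, and IPQoS is OpenSSH's knob for the DSCP it applies:

```shell
# Honour diffserv markings with cake's three priority tins (diffserv3).
tc qdisc replace dev eth0 root cake bandwidth 50mbit diffserv3
# OpenSSH marks its interactive traffic itself; the codepoint is tunable:
ssh -o IPQoS=lowdelay user@host
```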
Ya, in gaming we usually have two kinds of packets: synchronization that must occur, and visual-fidelity sync. The first is small-bandwidth and latency-sensitive. The others can even be dropped, at the cost of some minor visual desyncing.
In reality I don't see bufferbloat on the internet adding jitter of more than a few milliseconds. I do see loss, though, of 20, 50, even 100ms at a time. I'd rather have 50ms of jitter than 50ms of loss, but that's just my application.
fq_codel and cake use a tiny bit of packet loss to get a sending host to back off, for example to keep a large download flow within the limits of your home link. Other flows aren't affected.
Bufferbloat regularly adds hundreds of ms on home internet connections; you can get an indication of your own bufferbloat at http://www.dslreports.com/speedtest
Not on my connection. I would rather my UDP packet arrive 30ms late than not arrive - especially on high latency links where I want to process the packet before a 300ms round trip nack/retransmit has a chance to work.
I don't see any bufferbloat or excessive jitter on my home internet (at least on a wired connection) on BT FTTH.
Anecdotally, I've seen larger amounts of packet loss and jitter when TCP accelerates faster into a loss event. The small amount of preventative loss reduces both of these values.
But I too am allergic to loss. fq_codel supports ECN (explicit congestion notification), which is now enabled universally by iOS. As near as I can tell, the 6% ECN usage (https://www.ietf.org/proceedings/98/slides/slides-98-maprg-t... ) in France is almost entirely from free.fr's deployment of fq_codel, which they enabled by default in 2012 (!!!!!!). I had expected all the ISPs to have leapt on this by now....
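Checking and enabling ECN on a Linux endpoint is a sketch away (these are standard kernel sysctls; fq_codel also marks rather than drops for ECN-capable flows):

```shell
# tcp_ecn: 0 = off, 1 = request and accept ECN, 2 = accept only if the
# peer requests it (the usual default).
sysctl net.ipv4.tcp_ecn
sysctl -w net.ipv4.tcp_ecn=1
# fq_codel's ecn flag is on by default; shown here for explicitness.
tc qdisc replace dev eth0 root fq_codel ecn
```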
Yes there is. I have multiple UDP/RTP streams at different bitrates, from the same IP to the same IP.
I then get a loss on a 30mbit stream (3000 packets a second) of 150 packets.
At the exact same time I get a loss on a 20mbit stream of 100 packets, and a 10mbit stream of 50 packets.
This is an outage for 150ms, probably because of a reroute in an MPLS network somewhere.
My packets have already been emitted by the time any round-trip resend would have come back.
TCP needs packets to be dropped quickly to get the feedback that your link can't handle the speed. Without that, the speed of the network appears to fluctuate: as the buffer fills, your network looks faster than it is; when the buffer is full, it starts dropping packets, which makes TCP back off and the buffer drain; if TCP then starts sending again before the buffer is empty, it sees yet another apparent speed. It is that fluctuation which causes the issues.
If you use an Edgerouter, you can get the cake traffic shaper but you'll have to do without the easy web interface OpenWRT has: https://community.ubnt.com/t5/EdgeRouter/Cake-and-FQ-PIE-com...
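For comparison, the OpenWRT SQM setup can also be driven from the shell via uci instead of the luci-app-sqm GUI. A sketch; the interface name and rates are examples (rates are in kbit/s, typically set to ~85-95% of your measured speeds):

```shell
# Configure and start SQM with cake; adjust to your WAN interface/speeds.
uci set sqm.@queue[0].interface='eth1'
uci set sqm.@queue[0].download='85000'   # ingress shaping, kbit/s
uci set sqm.@queue[0].upload='9000'      # egress shaping, kbit/s
uci set sqm.@queue[0].qdisc='cake'
uci set sqm.@queue[0].script='piece_of_cake.qos'
uci set sqm.@queue[0].enabled='1'
uci commit sqm
/etc/init.d/sqm restart
```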