Tuesday, 17 June 2014

By Far The Biggest Issue I Encounter in Wi-Fi Deployments Is…

… high airtime utilisation caused by infrastructure support for low data rates. Actually, it is one of the two biggest issues that I see; however, the solution for this one is far more plug and play than the other. If you’re interested, the other issue is poor coverage caused by automatic AP transmit power algorithms (RRM) – more on that in a future post.

Now, this will not be a revelation for anyone working in Wi-Fi. Unfortunately, the vast majority of WLANs are deployed by people with minimal Wi-Fi knowledge. This post is not a criticism of those folks, however. Nor is its purpose to delve into the technical details behind this issue, but to look at the party that could essentially solve (or at the very least, significantly reduce) this issue overnight – the enterprise Wi-Fi vendors.

Out of the box, all of the enterprise vendors’ equipment that I’ve looked at ships with the lowest data rates enabled – the 1 and 2 Mbps rates from the original 802.11 standard and the 5.5 and 11 Mbps rates that the 802.11b amendment brought us. These rates came about 17 and 15 years ago, respectively, and yet vendors still ship equipment supporting these rates by default despite the devastation they cause.

The term ‘junk band’ is sometimes used to describe the 2.4 GHz band that these data rates operate in, and as a reason why Wi-Fi often performs poorly in this band. The huge irony here is that many Wi-Fi deployments are by far the biggest contributor of ‘junk’ in the band – the junk, of course, being these airtime-hogging frames. Yes, non-Wi-Fi interference does consume some airtime (in many cases, less than it did when Wi-Fi was much younger!) and yes, APs outside of the customer’s WLAN also consume a portion of airtime, but often it is these low rates supported by the customer’s own WLAN that consume by far the largest amount of airtime. In addition, most non-Wi-Fi interfering devices do not operate 24/7, so the interference is sporadic. Many low data rate frames are transmitted for as long as clients are using the WLAN (for example, 8 hours in the day) whilst others are sucking up airtime 24/7/365!
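To put a rough number on the ‘24/7/365’ frames, here is a minimal back-of-the-envelope sketch in Python. It uses beacons as the example and rests on several assumptions that are mine rather than measurements: a 300-byte beacon, the common 102.4 ms beacon interval, long-preamble DSSS/CCK timing for the 1 to 11 Mbps rates and ERP-OFDM timing for 6, 12 and 24 Mbps. It ignores contention overhead and counts only one AP on the channel, so treat the output as illustrative only.

```python
import math

BEACON_INTERVAL_US = 102_400  # 100 TU of 1024 us each; a common default (assumption)
BEACON_BYTES = 300            # illustrative beacon size (assumption)

DSSS_LONG_PREAMBLE_US = 192   # 144 us preamble + 48 us PLCP header
OFDM_BITS_PER_SYMBOL = {6: 24, 12: 48, 24: 96}  # data bits per 4 us symbol


def frame_airtime_us(rate_mbps, size_bytes=BEACON_BYTES):
    """Approximate time on air for one frame, excluding contention."""
    bits = size_bytes * 8
    if rate_mbps in OFDM_BITS_PER_SYMBOL:
        # ERP-OFDM: 16 SERVICE bits + payload + 6 tail bits, padded to whole symbols
        symbols = math.ceil((16 + bits + 6) / OFDM_BITS_PER_SYMBOL[rate_mbps])
        # 20 us preamble and SIGNAL field, 4 us per symbol, 6 us signal extension
        return 20 + 4 * symbols + 6
    # DSSS/CCK with long preamble: fixed header time plus payload at the data rate
    return DSSS_LONG_PREAMBLE_US + bits / rate_mbps


def beacon_airtime_percent(rate_mbps, ssids=1):
    """Share of airtime one AP spends beaconing for the given number of SSIDs."""
    return 100 * ssids * frame_airtime_us(rate_mbps) / BEACON_INTERVAL_US


if __name__ == "__main__":
    for rate in (1, 2, 5.5, 11, 6, 12, 24):
        print(f"{rate:>4} Mbps: {frame_airtime_us(rate):7.1f} us per beacon, "
              f"{beacon_airtime_percent(rate, ssids=4):5.2f}% airtime for 4 SSIDs")
```

Under these assumptions a single beacon at 1 Mbps occupies 2592 µs, or roughly 2.5% of the channel per SSID per AP, before a single client has sent anything. Four SSIDs on one AP take that to about 10%, and every co-channel AP within earshot adds its own share.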

You may be thinking that it isn’t the vendor’s job to design the WLAN for the customer and that the vendor stresses the importance of disabling these low rates through documentation, training and vendor seminars. Whilst this is all true, it clearly isn’t enough or this wouldn’t be such a massive issue. The complexity of Wi-Fi is only matched, inversely, by the degree to which it is poorly understood. It just isn’t fair to push all of the blame onto the customer. If the WLAN was deployed by a VAR then it may be fair to push some of the blame in their direction; however, once again, the reality is that most VARs, like customers, have minimal Wi-Fi knowledge.

These default rates also hurt the vendors. Numerous times the Wi-Fi vendors are blamed for a performance issue that is simply a result of a poor WLAN design. If these rates were disabled out of the box it would be one less (but significant) issue that uninformed customers could throw back at the vendor.

There are certainly other default, 'out of the box' pieces of configuration that I feel should be changed, so why single this one out? Simply put, no other default that I’ve come across causes anywhere near as many issues or on such a large scale. Not only does it affect the customer’s WLAN but also neighbouring businesses and home users.

Despite all vendors having qualified staff that realise this is an issue, why haven’t they made a change? Most likely because, yes, there are still some 802.11b (and even some problematic 802.11g/n) clients out there and by disabling these rates out of the box, these clients will be unable to associate. But so what? Out of the box, many things have to be configured to work and this will just be another. For example, there is a good chance that the out of the box WLAN you create will only support WPA2/AES by default. So if you have clients that only support WPA/TKIP, they’re not going to be able to associate. You’re going to have to change those defaults to support your legacy clients. How is this any different? In fact, it would be preferable if clients could NOT associate due to these issues. At least this way, the problem would be identified and fixed before the WLAN went into production. Most low data rate utilisation issues persist for a long time, often years, and many will never be fixed.

It doesn’t have to be a brute force approach either. I can see a number of options to ease customers into a life of low channel utilisation!
  • A setup wizard used to create the WLAN could ask whether the customer has any 802.11b clients that need to be supported (or problematic 802.11g/n clients that won’t associate with the rates disabled – ok, most customers won’t know this until they flip the switch!).
  • The setup wizard could ask what vertical the WLAN is being deployed in and, if one of the likely candidates (retail, warehousing and healthcare) is chosen, suggest that low data rates may need to be left enabled but that the customer should start at 11 Mbps and work backwards. If none of these verticals is selected, the low rates are disabled.
  • If customers do enable low data rates, the wizard might warn that this configuration could have a significant negative impact on their WLAN and recommend that, if they must support low rates, they minimise the number of WLANs advertised on each AP.
  • Back in the day, Cisco APs shipped with the default SSID of tsunami configured. Cisco obviously realised this was something of a security issue, so it removed the SSID from the default configuration, shipped the APs with the radios disabled and put a nice bright yellow sticker on the AP box informing the customer of the fact. Maybe APs could have such a sticker, or a slip could be put into the top of the AP or, where applicable, WLC box.
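
The wizard ideas above could be sketched roughly as follows. This is purely hypothetical: `recommend_rates`, its parameters and the warning text are names invented for illustration, not any vendor’s API, and treating ‘has 802.11b clients’ the same way as a low-rate vertical (start with 11 Mbps only) is my own simplification.

```python
# Hypothetical sketch of the setup wizard logic; not any vendor's real API.
LOW_RATES_MBPS = (1, 2, 5.5, 11)                   # original 802.11 and 802.11b rates
OFDM_RATES_MBPS = (6, 9, 12, 18, 24, 36, 48, 54)   # 802.11g rates
LOW_RATE_VERTICALS = {"retail", "warehousing", "healthcare"}


def recommend_rates(has_11b_clients, vertical):
    """Return (enabled_rates_mbps, warnings) for a new 2.4 GHz WLAN."""
    needs_low_rates = (
        has_11b_clients or vertical.strip().lower() in LOW_RATE_VERTICALS
    )
    if not needs_low_rates:
        # Default proposed in the post: low rates disabled out of the box.
        return list(OFDM_RATES_MBPS), []

    # Start at 11 Mbps only; the customer works backwards (5.5, 2, 1)
    # if legacy clients still fail to associate.
    enabled = [LOW_RATES_MBPS[-1], *OFDM_RATES_MBPS]
    warnings = [
        "Low data rates can have a significant negative impact on the WLAN.",
        "Minimise the number of WLANs (SSIDs) advertised on each AP.",
    ]
    return enabled, warnings


if __name__ == "__main__":
    for answers in ((False, "office"), (False, "Retail"), (True, "education")):
        rates, notes = recommend_rates(*answers)
        print(answers, "->", rates, notes)
```

The point of the sketch is only that the safe default costs the customer one or two extra questions, while anyone who genuinely needs the legacy rates gets them along with a warning.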

So now you’re thinking, “oh but the 5 GHz band will save us!” A recent piece of customer troubleshooting showed why this is short-sighted. The issue: severe performance problems. One of the first things I checked was the airtime utilisation reported by the APs; the highest I saw was a new record for me, an AP at 93% channel utilisation (beating my old record by 1%!). The rest of the APs weren’t much better. I couldn’t work it out at first though; 80% of the clients were associating to the 5 GHz band, where the utilisation was typically low, so why such big performance issues? A look at client association history showed the majority of clients were fluctuating between bands. Yes, a driver update would likely have helped somewhat, but even the latest clients with the latest drivers may still prefer the 2.4 GHz band – I saw this often with early dual-band 802.11n clients. Whilst it’s been 7 years since these initial 802.11n clients came about and more client vendors have started to prefer 5 GHz over 2.4 GHz, this is not a universal truth. I expect a large enough percentage of 802.11ac clients will still make significant use of the 2.4 GHz band and therefore the importance of ‘cleaning up’ the band remains. This troubleshooting experience was certainly not unique; I’ve seen this many times.

Yes, this proposed solution won’t help with all Wi-Fi airtime issues - non-Wi-Fi interference, external and internal ACI and CCI from SOHO rogues, non-overlapping ACI (AP co-location), sticky clients, clients probing at low rates, clients probing for every WLAN they’ve ever associated to, overly high AP density, overly high transmit power… the list goes on. It will however help with one significant Wi-Fi issue and one that has a very simple plug and play solution. Would this have been advisable 10 years ago? Of course not! Even 5 years ago? Perhaps not even then. But 17 years is a long time in technology circles – it’s time we moved on!

Lastly, I need to acknowledge the fantastic proposal from Cisco’s Brian Hart and Andrew Myles. In short, they’re proposing that the Wi-Fi Alliance start looking at making low data rates optional. Whilst I suspect that the impetus behind this proposal may have come from the issues seen in stadium Wi-Fi in recent years (1 Mbps probes + very high client density + very open space = chaos), it would obviously benefit all new Wi-Fi deployments where the equipment had this certification. But this leads to the next logical question, “how about cleaning up Cisco’s backyard first?” Obviously this isn’t a Cisco-specific issue, but even if this proposal sees the light of day, it will likely be several years before we see it bear fruit. Why wait? – take the lead!

Despite the marketing claims of 802.11ac, Wi-Fi in the 2.4 GHz band is going to be around for the foreseeable future and it’s about time the mess was cleaned up!
