Saturday 22 October 2011

WLAN Controller Discovery Best Practices

The methods that APs use to discover the WLAN controller(s) in a Cisco environment are relatively well known. Andrew vonNagy has documented them quite thoroughly. Take a look at his post if you are not familiar with these methods.

Whilst Andrew has covered the details of how these discovery methods work, I thought I would provide my best practices as to when you might use one method of controller discovery over another. As I have referenced Andrew's post, I will copy his list of discovery methods verbatim in order to maintain a blog post SOE ;).

WLAN controller discovery methods
1. Broadcast on the local subnet
2. Local NVRAM list of the previously joined controller, previous mobility group members, and administrator primed controller through the console
3. Over the Air Provisioning (OTAP) (subsequently removed in version 6.0.170.0 code)
4. DHCP Option 43 returned from the DHCP server
5. DNS lookup for "CISCO-CAPWAP-CONTROLLER.localdomain"

Broadcast
I never rely on this method, nor would I recommend it. I have encountered issues with its use; these may be bugs long since squashed, but nonetheless I prefer not to rely on it. The only time I would consider it is if the server team was not willing or able to modify DNS or DHCP in order to implement the DNS or DHCP discovery methods. Though you can bypass the server team and run DHCP on a router, this means you are likely not maintaining consistency with your SOE, and additionally Option 43 on a router is quite messy. If the controller is on a different subnet to the APs you will need to configure an 'ip helper-address'. This will result in the WLAN controller receiving other forwarded protocols as well (by default this includes TFTP, DNS, NetBIOS, DHCP and TACACS amongst a few others; these defaults can be disabled using 'no ip forward-protocol udp <protocol>').

Local NVRAM or Primed
Whilst the APs can store details in NVRAM of controllers they have previously associated to, they have to get this information to begin with so this method is not going to work as your only method of discovery. In other words, know that it exists but use another method.

APs can also be primed, which involves manually configuring them with an IP, Mask, Gateway and WLAN controller address. The AP's IP address could in fact come via DHCP and then you would just prime it with the controller's IP address. But if you have a DHCP server available, why not configure DHCP Option 43? Perhaps such a scenario exists but it wouldn't be common to do this in any enterprise WLAN deployment. But then, priming more than half a handful of APs in any enterprise deployment would likewise be a pain. The only time I prime an AP is when I have one or two APs to be deployed for use in a temporary office or when I have done WLAN training and needed to get an AP up and running before the server-side infrastructure was in place.

I find that sometimes a company will associate an AP with a local controller in order to download the appropriate code and then ship the AP to site for connection to the remote site's controller. Once on-site the AP is powered up and attempts to connect to a controller. The issue is that the AP has saved the local controller's details to NVRAM, so if the primary controller is not configured on the AP it may connect to the local controller instead of the remote site's own controller. I am unsure what the logic is here, as the code upgrade will be performed over the site's LAN so no time or WAN bandwidth is saved. Basically, don't bother as there is no benefit and it will just result in more trouble-shooting on your part. The number of times I have seen APs connect to the wrong controller due to a poorly thought out design...

Keep in mind that APs can also find a controller based on the mobility group of any controller they have already associated with. I was recently trouble-shooting an issue that involved an AP joining the wrong controller. After verifying DHCP, DNS and priming methods I checked the mobility group of both controllers. Sure enough they were in the same mobility group - the local AP had learned about the remote controller by means of the local controller passing the details on. A typo in the primary and secondary controller configuration meant that the AP disregarded these controllers and used the information it learnt via the mobility group configuration. The secondary issue here was a design issue - the controllers should not have been in the same mobility group.

Over the Air Provisioning (OTAP)
No longer available nor should it have been in my opinion - it was always a bad idea.

DHCP Option 43
OK, now we get to the first of the two recommended methods of controller discovery. Option 43 seems to be the most common method used for discovery. I recommend its use whenever you have a controller at each of your sites and therefore you want each of the 'Site XX APs' to connect to the 'Site XX WLAN controller'. In each of the DHCP scopes for each site, configure Option 43 for the particular models of APs at that site, pointing the APs to the appropriate site's controller.
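As a rough illustration of what actually goes into Option 43: for lightweight APs, Cisco documents the value as a type-length-value string, with type 0xf1, a length of four bytes per controller, and then each controller management IP in hex. The sketch below builds that hex string. The function name `option43_hex` and the addresses are made up for the example, and how you paste the result into your DHCP server will vary by platform.

```python
import ipaddress


def option43_hex(controller_ips):
    """Build the Option 43 hex string: type f1, length, then each IPv4 address.

    controller_ips is a list of dotted-quad strings (hypothetical example
    addresses below). The length byte is 4 * number of controllers.
    """
    if not controller_ips:
        raise ValueError("at least one controller IP is required")
    # IPv4Address.packed gives the 4 raw bytes of each address.
    packed = b"".join(ipaddress.IPv4Address(ip).packed for ip in controller_ips)
    if len(packed) > 255:
        raise ValueError("too many controllers for a one-byte length field")
    return (bytes([0xF1, len(packed)]) + packed).hex()


print(option43_hex(["192.168.10.5", "192.168.10.20"]))
# prints f108c0a80a05c0a80a14
```

Reading the output back: f1 is the type, 08 is the length (two controllers at four bytes each), and c0a80a05 / c0a80a14 are the two addresses.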

The problem with the DHCP discovery method can come when you need to configure Option 43 for each model of AP at each of your sites. Not an issue with one site and one model of AP. But take 30 sites, 30 DHCP scopes, potentially 30 DHCP servers and a few models of AP at each site. In addition, in 18 months when you have left the business, is someone going to know that some DHCP modifications are required when a new model of AP is rolled out? If you have a separate WLAN controller at each site however, then it really is the best method despite the extra admin overhead. On the other hand, if you have either a centralised controller or a single / pair of controllers at a single site, I would recommend using the DNS method.

Another time to look at the DNS method is if you happen to be hosting DHCP on one of your routers. The Option 43 router configuration is messy compared with the ease of configuration involved in getting DHCP Option 43 up and running on a Windows Server 2003 / 2008 box, so stick with DNS in this case.

DNS
The time to use DNS is when all of your APs connect back to the same WLAN controller(s). This could either be remote sites using a centralised controller or perhaps you just have a single site. Let’s say you have 30 sites. Instead of configuring 30 DHCP scopes or 30 DHCP servers with Option 43, just configure an A or CNAME record for CISCO-CAPWAP-CONTROLLER.localdomain pointed at your centralised controller(s). If any new model of AP is rolled out, you don't have to change anything regarding your discovery method, unlike with the DHCP method. This assumes of course that all of your sites fall under the same .localdomain. You can create a global scope option in Windows Server 2003 / 2008 but then Option 43 will apply to all your scopes - not just the AP scopes. This may or may not be of concern.
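Before relying on the DNS method it is worth checking that the record resolves the way you expect from a machine in the same domain as the APs. The sketch below is one way to do that in Python; `controller_fqdn`, `resolve_controllers` and the `example.com` domain are hypothetical names for the example, and the resolver is passed in as a parameter so the demonstration can use a stub instead of a live DNS query.

```python
import socket


def controller_fqdn(domain):
    """Return the name an AP would look up for the given domain suffix."""
    return "CISCO-CAPWAP-CONTROLLER." + domain.strip(".")


def resolve_controllers(domain, resolver=socket.gethostbyname_ex):
    """Resolve the controller record and return (name, list_of_ipv4_addresses).

    An empty list means the name did not resolve.
    """
    name = controller_fqdn(domain)
    try:
        _canonical, _aliases, addresses = resolver(name)
    except OSError:  # socket.gaierror and socket.herror are subclasses
        return name, []
    return name, list(addresses)


# Demonstration with a stub resolver (assumed addresses, no network needed).
def stub_resolver(name):
    return (name, [], ["10.0.0.5", "10.0.0.6"])


print(resolve_controllers("example.com", resolver=stub_resolver))
```

Calling `resolve_controllers("your.domain")` without the `resolver` argument performs a real lookup using the system resolver.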

If, however, your 30 sites each have their own controller, and assuming you DON'T have a separate sub-domain for each site, you are going to have problems with APs associating to the wrong controller. You would need to create 30 A or CNAME records for the 30 WLAN controllers, but how do you direct APs to the correct controller? You can't do it in any reasonable fashion. Yes, you can specify primary and secondary controllers but this occurs after the AP has associated. You could also configure the master controller option but this is messy. Someone will without a doubt forget to specify the primary controller at some point and you will end up with APs associated to the wrong controller. Inexperienced engineers trouble-shooting the issue will scratch their heads as to why the AP has not associated - not realising that it has but is associated to another controller.

Logging
Though not directly related to controller discovery, I will discuss a little-known DHCP option that can greatly aid in trouble-shooting AP join issues.

Out of the box, APs transmit syslog messages as limited broadcasts (255.255.255.255). When you have an AP that is receiving a DHCP address but is not associating to the controller you may need to console in to see where the issue lies - not very convenient as the AP is typically mounted or often at another site. You cannot telnet or ssh as this is disabled by default and you cannot enable it until after the AP has associated.

The solution - configure DHCP Option 7. This will cause the APs to transmit their logging messages to your syslog server. You can now grep your logs to try and establish why the AP cannot associate.
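If you don't have a syslog server handy, a throwaway UDP listener is enough to see what the APs are sending. The following is a minimal sketch using Python's standard `socketserver` module, not a replacement for a real syslog daemon. The names `decode_pri`, `SyslogHandler` and `serve` are made up for the example, and binding to port 514 usually requires elevated privileges.

```python
import socketserver


def decode_pri(message):
    """Split '<PRI>rest' into (facility, severity, rest).

    Syslog encodes PRI as facility * 8 + severity. If no valid PRI prefix
    is present, facility and severity are returned as None.
    """
    if message.startswith("<") and ">" in message[:5]:
        end = message.index(">")
        digits = message[1:end]
        if digits.isdigit():
            pri = int(digits)
            return pri // 8, pri % 8, message[end + 1:]
    return None, None, message


class SyslogHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair.
        data = self.request[0]
        text = data.decode("utf-8", errors="replace").strip()
        facility, severity, body = decode_pri(text)
        # Print the sender so messages can be filtered by AP address.
        print(self.client_address[0], facility, severity, body)


def serve(host="0.0.0.0", port=514):
    """Listen for syslog datagrams until interrupted."""
    with socketserver.UDPServer((host, port), SyslogHandler) as server:
        server.serve_forever()
```

Call `serve()` to start listening; each line of output begins with the sending AP's IP address, which makes it easy to grep for the AP that is failing to join.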

EOF... almost

Priming - use it when you need to temporarily set up a small number of APs
DHCP - use it when DHCP Option 43 can easily be configured and when you have multiple sites each with their own controller and a global domain used throughout all of your sites.
DNS - use it when you have a centralised controller, a single site or controllers at each site and a sub-domain for each of the sites. This is my preferred method where possible.
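
The three rules of thumb above can be written down as a small decision function. This is only a sketch of the summary, with made-up parameter names, and it ignores practical constraints such as whether the server team will let you touch DHCP or DNS.

```python
def recommend_discovery(temporary_small_deployment, controller_per_site,
                        subdomain_per_site):
    """Map the summary above onto a recommended discovery method.

    All three parameters are booleans describing the deployment
    (hypothetical names chosen for this example).
    """
    if temporary_small_deployment:
        # One or two APs for a temporary office or a training lab.
        return "priming"
    if controller_per_site and not subdomain_per_site:
        # A controller at each site but one global domain: DNS cannot
        # steer APs to the right controller, so use DHCP Option 43.
        return "dhcp-option-43"
    # Centralised controller, single site, or a sub-domain per site.
    return "dns"


print(recommend_discovery(False, True, False))
# prints dhcp-option-43
```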
