Anycast DNS - Using BGP
In this fifth article on Anycast DNS, we provide some examples of deploying Anycast using Border Gateway Protocol or BGP, the core routing protocol of the Internet.
While BGP is mostly used by Internet Service Providers (ISPs), it is also used in some of the larger enterprise environments that must interconnect networks that span geographical and/or administrative regions and boundaries. Since BGP is a very complex routing protocol, we will provide only a basic recipe using Cisco and Quagga host-based routing software. A detailed discussion of the BGP protocol is beyond the scope of this article.
BGP is an Exterior Gateway Protocol (EGP), which means that it exchanges routing information between Autonomous Systems (AS). This makes BGP quite different from Interior Gateway Protocols (IGPs) such as RIP and OSPF. BGP uses a path vector algorithm: each route carries a list of every AS that the path passes through.
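To make the path vector idea concrete, here is a minimal sketch (not taken from any router implementation) of how the AS list lets a BGP speaker reject looped routes: a route is discarded when the receiver's own AS number already appears in its path. The function name as_path_has_loop and the third AS number 64600 are hypothetical; 65500 and 64555 are the AS numbers used in this article.

```shell
# Illustrative only: returns 0 (true) if local_as already appears in the
# space-separated AS path, which is how a path vector protocol spots a loop.
as_path_has_loop() {
    local local_as="$1" path="$2" asn
    for asn in $path; do
        if [ "$asn" = "$local_as" ]; then
            return 0
        fi
    done
    return 1
}

# A router in AS 65500 receives our VIP with path "64555": no loop, usable.
if as_path_has_loop 65500 "64555"; then echo "reject"; else echo "accept"; fi

# The same route coming back via a hypothetical AS 64600 already contains
# 65500 in its path, so the router in AS 65500 would discard it.
if as_path_has_loop 65500 "64600 65500 64555"; then echo "reject"; else echo "accept"; fi
```

The first check prints "accept" and the second prints "reject".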
Our recipe will demonstrate how to configure Quagga to peer with a Cisco router using BGP. Suppose our Anycast design consists of two Autonomous Systems, AS 65500 and AS 64555. AS 64555 will contain our Anycast DNS servers, and we'll establish peering between the two as shown below:
The recipe calls for configuring each Anycast DNS server with two physical network connections on different subnets or VLANs. Two upstream routers are configured with BGP routing and will peer with our Anycast DNS server. The Anycast DNS servers will run the BGP routing protocol to originate our two Anycast VIPs, 192.168.0.1/32 and 192.168.1.1/32. The configuration is shown in the graphic below:
We could advertise both Anycast VIPs from within the same netblock, 192.168.0.0/24, such as 192.168.0.1/32 and 192.168.0.2/32. This would save address space, but for the purposes of this example we use VIPs from different netblocks.
Recipe - Multihomed Anycast DNS using BGP
Step 1 - Configure Anycast VIPs on "Server A"
Add two (2) Anycast VIPs to the host's loopback interface as virtual loopback devices or sub-interfaces. This is performed using the following commands:
    ifconfig lo:0 192.168.0.1 netmask 255.255.255.255
    ifconfig lo:1 192.168.1.1 netmask 255.255.255.255
NOTE: The commands above show the Linux syntax. The loopback devices are named slightly differently on Sun Solaris, where they are called lo0:0 and lo0:1 respectively.
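On many current Linux distributions the ip utility from iproute2 is used in place of ifconfig. As a hedged sketch, the helper below (the name vip_commands is hypothetical) only prints the equivalent "ip addr add" commands for the VIPs, so they can be reviewed before being run as root:

```shell
# Print (do not execute) iproute2 commands that add each VIP as a /32 on lo.
vip_commands() {
    local vip
    for vip in "$@"; do
        echo "ip addr add ${vip}/32 dev lo"
    done
}

vip_commands 192.168.0.1 192.168.1.1
```

Printing rather than executing keeps the sketch safe to try on any host; the output can be piped to a root shell once it has been checked.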
Step 2 - Configure Zebra (component of Quagga) on "Server A"
The typical location of the zebra configuration file is /etc/quagga/zebra.conf, unless you have built Quagga with non-default file locations. Create the /etc/quagga/zebra.conf file as follows:
    !
    ! Zebra configuration saved from vty
    !   2009/06/07 09:49:00
    !
    hostname server_a
    !
    password zebra
    enable password zebra
    !
    interface eth0
     ip address 10.0.1.10/24
    !
    interface eth1
     ip address 10.0.2.10/24
    !
    interface lo
    !
    line vty
    !
Once the zebra.conf file is built, start the zebra process and configure it to start automatically at boot time. With zebra running, we can access the running configuration interactively using the vty or vtysh. Please consult the Quagga online documentation at http://www.quagga.net for usage.
Step 3 - Configure BGP on "server_a"
In order to configure BGP routing on server_a, we need to configure the server to run the bgpd routing daemon. The Quagga BGP routing daemon is configured through the /etc/quagga/bgpd.conf file as follows:
    !
    ! bgpd configuration saved from vty
    !   2009/06/13 11:21:42
    !
    hostname server_a
    password zebra
    log stdout
    !
    router bgp 64555
     bgp router-id 10.0.3.10
     network 192.168.0.1/32
     network 192.168.1.1/32
     timers bgp 4 16
     neighbor 10.0.1.1 remote-as 65500
     neighbor 10.0.1.1 next-hop-self
     neighbor 10.0.1.1 prefix-list DEFAULT in
     neighbor 10.0.1.1 prefix-list ANYCAST out
     neighbor 10.0.2.1 remote-as 65500
     neighbor 10.0.2.1 next-hop-self
     neighbor 10.0.2.1 prefix-list DEFAULT in
     neighbor 10.0.2.1 prefix-list ANYCAST out
    !
    ip prefix-list ANYCAST seq 5 permit 192.168.0.1/32
    ip prefix-list ANYCAST seq 10 permit 192.168.1.1/32
    ip prefix-list DEFAULT seq 5 permit 0.0.0.0/0
    line vty
    !
Start the bgpd routing daemon and enable the service to start automatically at boot time. As with zebra, the BGP process can be maintained and configured using the vty or vtysh. The only interfaces in our configuration that actively participate in BGP are eth0 and eth1. Each will "peer" with its respective upstream BGP neighbor: the eth0 interface peers with router R1-A, and the eth1 interface peers with router R1-B.
In our configuration above, we used some of the more advanced BGP configuration directives. Here is a summary of what some of them do:
- "timers bgp 4 16" - this command adjusts the BGP keepalive and hold timers. A keepalive is sent every 4 seconds, and the router waits 16 seconds without receiving a keepalive before it declares the peer dead. On Cisco routers, these timers default to 60 and 180 seconds respectively
- "neighbor 10.0.1.1 next-hop-self" - this makes the server advertise its own address as the next hop for the routes it sends to this upstream neighbor
- "neighbor 10.0.1.1 prefix-list DEFAULT in" - this applies the ip prefix-list called "DEFAULT" to inbound updates, so that only the default route is accepted from this neighbor
- "neighbor 10.0.1.1 prefix-list ANYCAST out" - this applies the ip prefix-list called "ANYCAST" to outbound updates, so that only our two Anycast VIPs are advertised to our upstream peer
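A note on the timers: the hold time actually used on a session is negotiated. Under RFC 4271, each peer proposes its configured hold time when the session opens, and the smaller of the two values is used. The sketch below illustrates that rule with the values from this recipe; the function name negotiated_hold_time is hypothetical.

```shell
# RFC 4271: the session hold time is the smaller of the two proposed values.
negotiated_hold_time() {
    local ours="$1" theirs="$2"
    if [ "$ours" -le "$theirs" ]; then
        echo "$ours"
    else
        echo "$theirs"
    fi
}

# Both peers configured with "timers bgp 4 16":
negotiated_hold_time 16 16     # prints 16
# Quagga host at 16 seconds, a router left at the Cisco default of 180:
negotiated_hold_time 16 180    # prints 16
```

In both cases a silent peer is declared dead after about 16 seconds rather than 180, which is what makes fast Anycast failover possible.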
Step 4 - Configure "Server A" upstream router R1-A and R1-B with BGP
The following Cisco configuration was applied to the upstream router R1-A:
    interface FastEthernet0/0
     description link to BGP AS 65500
     ip address 192.168.2.31 255.255.255.0
    !
    interface FastEthernet0/1
     description link to BGP AS 64555
     ip address 10.0.1.1 255.255.255.0
    !
    router bgp 65500
     bgp log-neighbor-changes
     network 10.0.1.0 mask 255.255.255.0
     network 192.168.2.0
     network 0.0.0.0
     timers bgp 4 16
     neighbor 10.0.1.10 remote-as 64555
     neighbor 10.0.1.10 next-hop-self
     maximum-paths 4
Perform a similar configuration to router R1-B:
    interface FastEthernet0/0
     description link to BGP AS 65500
     ip address 192.168.2.32 255.255.255.0
    !
    interface FastEthernet0/1
     description link to BGP AS 64555
     ip address 10.0.2.1 255.255.255.0
    !
    router bgp 65500
     bgp log-neighbor-changes
     network 10.0.2.0 mask 255.255.255.0
     network 192.168.2.0
     network 0.0.0.0
     timers bgp 4 16
     neighbor 10.0.2.10 remote-as 64555
     neighbor 10.0.2.10 next-hop-self
     maximum-paths 4
At this point, BGP routing should be operational, and our Anycast VIPs should be advertised.
Step 5 - Create Failover Mechanism
In the event that the DNS server process on "Server A" or "Server B" fails, it is desirable to remove the Anycast VIPs from the global routing table. To do that, we must stop the routes from being advertised at their point of origination. A small script can accomplish this by performing cursory checks on the health of the DNS server and its ability to respond to queries. The script issues queries and, as soon as they fail, simply shuts down our routing daemon(s) or otherwise stops the routes from being advertised. The following is an example of what such a script might look like:
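    #!/bin/bash
    # Query the Anycast VIP for localhost.; a healthy server answers 127.0.0.1
    DNSUP=$(/usr/sbin/dig @192.168.0.1 localhost. A +short)
    if [ "$DNSUP" != "127.0.0.1" ]; then
        echo "Stopping Anycast...."
        # Stopping the routing daemons withdraws the VIP routes
        /etc/init.d/bgpd stop
        /etc/init.d/zebra stop
        /etc/init.d/named stop
    else
        echo "Everything's good... Do nothing..."
    fi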
The script should be scheduled to run at frequent intervals, using cron or at, to minimize downtime and provide quick failover. Note that cron cannot run a job more often than once per minute.
Step 6 - Repeat Steps 1-5 for all other Anycast Servers that are part of this Anycast Group.
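Where checks more frequent than a scheduler allows are wanted, one option is a small looping variant of the script above. The following is only a sketch under stated assumptions: the function names, the 5-second INTERVAL, the script name anycast_check.sh, and the DIG override variable are all hypothetical, and the dig options +time=1 and +tries=1 are added so that a hung server fails the check quickly.

```shell
#!/bin/bash
# Sketch of a looping health check; names and intervals are assumptions.
DIG="${DIG:-dig}"            # resolver client; can be overridden for testing
VIP="${VIP:-192.168.0.1}"    # Anycast VIP to query
INTERVAL="${INTERVAL:-5}"    # seconds between checks

# Succeeds only if the server answers the "localhost. A" query with 127.0.0.1.
dns_is_healthy() {
    local answer
    answer=$("$DIG" "@${VIP}" localhost. A +short +time=1 +tries=1 2>/dev/null)
    [ "$answer" = "127.0.0.1" ]
}

# Withdraw the VIPs by stopping the routing daemons, as in the script above.
stop_anycast() {
    /etc/init.d/bgpd stop
    /etc/init.d/zebra stop
}

# Poll until the first failed check, then withdraw the routes and return.
monitor() {
    while dns_is_healthy; do
        sleep "$INTERVAL"
    done
    echo "DNS check failed, stopping Anycast advertisement"
    stop_anycast
}

# Only start monitoring when explicitly asked, e.g. ./anycast_check.sh run
if [ "${1:-}" = "run" ]; then
    monitor
fi
```

Keeping the check in its own function means it can be exercised with a stub in place of dig before the script is trusted to stop routing daemons on a production server.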
Key BGP Troubleshooting Commands
BGP is a complex routing protocol to deploy and maintain, especially in larger enterprise network environments. A great amount of planning time is needed to achieve an efficient routing architecture that provides high availability and fast convergence. As you work with BGP, you will need to rely on a bevy of tools for troubleshooting and validating your BGP routed network. Here are some Cisco IOS commands used in configuring and/or troubleshooting BGP:
show ip bgp summary - shows BGP neighbors in summary mode
    R1-A# show ip bgp summary
    BGP router identifier 192.168.2.31, local AS number 65500
    BGP table version is 1, main routing table version 1
    6 network entries using 582 bytes of memory
    6 path entries using 216 bytes of memory
    2 BGP path attribute entries using 120 bytes of memory
    1 BGP AS-PATH entries using 24 bytes of memory
    0 BGP route-map cache entries using 0 bytes of memory
    0 BGP filter-list cache entries using 0 bytes of memory
    BGP using 942 total bytes of memory
    BGP activity 6/0 prefixes, 6/0 paths, scan interval 60 secs

    Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
    10.0.1.10       4 64555       4       3        0    0    0 00:00:02        2
    10.0.2.10       4 64555       3       3        0    0    0 00:00:00        0
The output above displays a lot of useful information, including the local router identifier for router R1-A (192.168.2.31), the local AS (65500), and the BGP table version (1). (An increasing version number indicates that network changes are occurring; if no changes occur, this number remains the same.) It also shows six network entries on R1-A, using 582 bytes of memory. Memory is important in BGP because in a large network, such as the Internet, memory can be a limiting factor. As more BGP entries populate the IP routing table, more memory is required. The output also displays two configured remote peers; both are EBGP peers, because their AS (64555) differs from the local AS (65500).
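When reading the summary, the last column (State/PfxRcd) is the quickest health indicator: a number means the session is Established and gives the count of prefixes received, while a state name such as Idle or Active means the session is not up. As a rough sketch, assuming the neighbor table layout of this command, the helper below (the name down_neighbors is hypothetical) scans saved summary text and lists peers that are not Established. The sample input is made up for illustration.

```shell
# Read "show ip bgp summary" text on stdin and print neighbors whose last
# column (State/PfxRcd) is not a number, i.e. sessions not yet Established.
down_neighbors() {
    awk '$1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && $NF !~ /^[0-9]+$/ { print $1, $NF }'
}

# Made-up sample: one healthy peer and one peer stuck in the Active state.
down_neighbors <<'EOF'
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.1.10       4 64555       4       3        0    0    0 00:00:02        2
10.0.2.10       4 64555       0       0        0    0    0 never    Active
EOF
```

For the sample above the helper prints "10.0.2.10 Active". Lines that do not begin with an IPv4 address, such as the header lines, are ignored.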
show ip bgp - displays the BGP topology table
    R1-A# show ip bgp
    BGP table version is 7, local router ID is 192.168.2.31
    Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
    Origin codes: i - IGP, e - EGP, ? - incomplete

       Network          Next Hop            Metric LocPrf Weight Path
    *> 0.0.0.0          192.168.2.1              0         32768 i
    *> 10.0.1.0/24      0.0.0.0                  0         32768 i
    *> 10.0.2.0/24      0.0.0.0                  0         32768 i
    *> 192.168.0.1/32   10.0.2.10                0             0 64555 i
    *                   10.0.1.10                0             0 64555 i
    *> 192.168.1.1/32   10.0.2.10                0             0 64555 i
    *                   10.0.1.10                0             0 64555 i
    *> 192.168.2.0      0.0.0.0                  0         32768 i
The BGP table version is displayed as 7 and the local router ID is 192.168.2.31. The various networks are listed along with the next hop address, metric (MED), local preference (LocPrf), weight, and the AS path. An i on the left side (part of the status codes) indicates a route learned via internal BGP; none of the routes in our example carry it. The i on the right side of each entry is the origin code, where i stands for IGP.
show ip bgp neighbors - displays BGP neighbors in detail
    R1-A# show ip bgp neighbors
    BGP neighbor is 10.0.1.10, remote AS 64555, external link
      BGP version 4, remote router ID 10.0.1.10
      BGP state = Established, up for 00:05:07
      Last read 00:00:02, hold time is 16, keepalive interval is 4 seconds
      Configured hold time is 16, keepalive interval is 4 seconds
      Neighbor capabilities:
        Route refresh: advertised and received(old & new)
        Address family IPv4 Unicast: advertised and received
      Message statistics:
        InQ depth is 0
        OutQ depth is 0
                             Sent       Rcvd
        Opens:                  1          1
        Notifications:          0          0
        Updates:                2          1
        Keepalives:            79         63
        Route Refresh:          0          0
        Total:                 82         65
      Default minimum time between advertisement runs is 30 seconds

     For address family: IPv4 Unicast
      BGP table version 7, neighbor version 7
      Index 3, Offset 0, Mask 0x8
                                     Sent       Rcvd
      Prefix activity:               ----       ----
        Prefixes Current:               6          2 (Consumes 72 bytes)
        Prefixes Total:                 6          2
        Implicit Withdraw:              0          0
        Explicit Withdraw:              0          0
        Used as bestpath:             n/a          0
        Used as multipath:            n/a          2

                                       Outbound    Inbound
      Local Policy Denied Prefixes:    --------    -------
        Total:                                0          0
      Number of NLRIs in the update sent: max 4, min 0

      Connections established 1; dropped 0
      Last reset never
    Connection state is ESTAB, I/O status: 1, unread input bytes: 0
    Local host: 10.0.1.1, Local port: 179
    Foreign host: 10.0.1.10, Foreign port: 48101

    Enqueued packets for retransmit: 0, input: 0 mis-ordered: 0 (0 bytes)

    Event Timers (current time is 0x5E1F8):
    Timer          Starts    Wakeups            Next
    Retrans            84          0             0x0
    TimeWait            0          0             0x0
    AckHold            67         64             0x0
    SendWnd             0          0             0x0
    KeepAlive           0          0             0x0
    GiveUp              0          0             0x0
    PmtuAger            0          0             0x0
    DeadWait            0          0             0x0

    iss:  915421219  snduna:  915422937  sndnxt:  915422937     sndwnd:   5840
    irs: 4113695520  rcvnxt: 4113696868  rcvwnd:      15037  delrcvwnd:   1347

    SRTT: 300 ms, RTTO: 303 ms, RTV: 3 ms, KRTT: 0 ms
    minRTT: 0 ms, maxRTT: 300 ms, ACK hold: 200 ms
    Flags: passive open, nagle, gen tcbs

    Datagrams (max data segment is 1460 bytes):
    Rcvd: 152 (out of order: 0), with data: 67, total data bytes: 1347
    Sent: 148 (retransmit: 0, fastretransmit: 0), with data: 83, total data bytes: 1717

    BGP neighbor is 10.0.2.10, remote AS 64555, external link
      BGP version 4, remote router ID 10.0.1.10
      BGP state = Established, up for 00:05:19
      Last read 00:00:04, hold time is 16, keepalive interval is 4 seconds
      Configured hold time is 16, keepalive interval is 4 seconds
      Neighbor capabilities:
        Route refresh: advertised and received(old & new)
        Address family IPv4 Unicast: advertised and received
      Message statistics:
        InQ depth is 0
        OutQ depth is 0
                             Sent       Rcvd
        Opens:                  1          1
        Notifications:          0          0
        Updates:                1          1
        Keepalives:            82         65
        Route Refresh:          0          0
        Total:                 84         67
      Default minimum time between advertisement runs is 30 seconds

     For address family: IPv4 Unicast
      BGP table version 7, neighbor version 7
      Index 4, Offset 0, Mask 0x10
                                     Sent       Rcvd
      Prefix activity:               ----       ----
        Prefixes Current:               4          2 (Consumes 72 bytes)
        Prefixes Total:                 4          2
        Implicit Withdraw:              0          0
        Explicit Withdraw:              0          0
        Used as bestpath:             n/a          2
        Used as multipath:            n/a          2

                                       Outbound    Inbound
      Local Policy Denied Prefixes:    --------    -------
        Bestpath from this peer:              2        n/a
        Total:                                2          0
      Number of NLRIs in the update sent: max 4, min 0

      Connections established 1; dropped 0
      Last reset never
    Connection state is ESTAB, I/O status: 1, unread input bytes: 0
    Local host: 10.0.2.1, Local port: 179
    Foreign host: 10.0.2.10, Foreign port: 39231

    Enqueued packets for retransmit: 0, input: 0 mis-ordered: 0 (0 bytes)

    Event Timers (current time is 0x60E88):
    Timer          Starts    Wakeups            Next
    Retrans            88          0             0x0
    TimeWait            0          0             0x0
    AckHold            69         51             0x0
    SendWnd             0          0             0x0
    KeepAlive           0          0             0x0
    GiveUp              0          0             0x0
    PmtuAger            0          0             0x0
    DeadWait            0          0             0x0

    iss: 2991828195  snduna: 2991829917  sndnxt: 2991829917     sndwnd:   5840
    irs: 4144867550  rcvnxt: 4144868936  rcvwnd:      14999  delrcvwnd:   1385

    SRTT: 300 ms, RTTO: 303 ms, RTV: 3 ms, KRTT: 0 ms
    minRTT: 0 ms, maxRTT: 300 ms, ACK hold: 200 ms
    Flags: passive open, nagle, gen tcbs

    Datagrams (max data segment is 1460 bytes):
    Rcvd: 157 (out of order: 0), with data: 69, total data bytes: 1385
    Sent: 139 (retransmit: 0, fastretransmit: 0), with data: 87, total data bytes: 1721
The output above shows the BGP neighbors in greater detail.
This concludes our high-level recipe on using BGP to configure Anycast DNS services. It also marks the final article in the Anycast DNS Recipe Series.