Tag Archive: MPLS


Advantages of MPLS: an example.

While MPLS is already explained on this blog, I often still get questions regarding the advantages over normal routing. A clear example I’ve also already discussed, but besides VRF awareness and routing of overlapping IP ranges, there’s also the advantage of reduced resources required (and thus scalability).

WANEdgeDesign

Given the above design: two routers connecting towards ISPs using eBGP sessions. These in turn connect to two enterprise routers, and those two enterprise routers connect towards two backend routers closer to (or in) the network core. All routers run a dynamic routing protocol (e.g. OSPF) and see each other and their loopbacks. However, the two middle routers in the design don’t have the resources to run a full BGP table so the WAN edge routers have iBGP sessions with the backend routers near the network core.

If you configure this as described and don’t add any additional configuration, this design will not work. The iBGP sessions will come up and exchange routes, but those routes will list the WAN edge router as the next hop. Since this next hop is not on a directly connected subnet to the backend routers, the received routes will not be installed in the routing table. The enterprise routers would not have any idea what to do with the packets anyway.

Update January 17th, 2014: the real reason a route will not be installed in the routing table is the iBGP synchronisation feature, which requires the IGP to have learned the BGP routes through redistribution before using the route. Still, synchronisation can be turned off and the two enterprise routers would drop the packets they receive.

There are a few workarounds to make this work:

  • Just propagating a default route of course, but since the WAN edge routers are not directly connected to each other and do not have an iBGP session, this makes the eBGP sessions useless. Some flows will go through one router, some through the other. This is not related to the best AS path, but to the internal (OSPF) routing.
  • Tunneling over the middle enterprise routers, e.g. GRE tunnels from the WAN edge routers towards the backend routers. Will work but requires multiple tunnels with little scalability and more complex troubleshooting.
  • Replacing the middle enterprise routers by switches so it becomes layer 2 and the WAN edge and backend routers have a directly connected subnet. Again this will work but requires design changes and introduces an extra level of troubleshooting (spanning tree).

So what if MPLS is added to the mix? By adding MPLS to these 6 routers (‘mpls ip’ on the interfaces and you’re set), LDP sessions will form… After which the backend routers will install all BGP routes in their routing tables!

The reason? LDP will advertise a label for each prefix in the internal network (check with ‘show mpls ldp bindings’) and a label will be learned for the interfaces (loopback) of the WAN edge routers… After which the backend routers know they just have to send the packets towards the enterprise routers with the corresponding MPLS label.

And the enterprise routers? They have MPLS enabled on all interfaces and no longer use the routing table or FIB (Forwarding Information Base) for forwarding, but the LFIB (Label Forwarding Information Base). Since all packets received from the backend routers have a label which corresponds to the loopback of one of the WAN edge routers, they will forward the packet based on the label.

Result: the middle enterprise routers do not need to learn any external routes. This design is scalable and flexible, just adding a new router and configuring OSPF and MPLS will make it work. Since a full BGP table these days is well over 450,000 routes, the enterprise routers do not need to check a huge routing table for each packet which decreases resource usage (memory, CPU) and decreases latency and jitter.

Advertisement

Another series of articles. So far in my blog, I’ve concentrated on how to get routed networks running with basic configuration. But at some point, you may want to refine the configuration to provide better security, better failover, less chance for unexpected issues, and if possible make things less CPU and memory intensive as well.

While I was recently designing and implementing a MPLS network, it got clear that using defaults everywhere wasn’t the best way to proceed. As visible in the MPLS-VPN article, several different protocols are used: BGP, OSPF and LDP. Each of these establishes a neighborship with the next-hop, all using different hello timers to detect issues: 60 seconds for BGP, 10 seconds for OSPF and 5 seconds for LDP.

First thing that comes to mind is synchronizing these timers, e.g. setting them all to 5 seconds and a 15 second dead-time. While this does improve failover, there’s three keepalives going over the link to check if the link works, and still several seconds of failover. It would be better to bind all these protocols to one common keepalive. UDLD comes to mind, but that’s to check fibers if they work in both directions, it needs seconds to detect a link failure, and only works between two adjacent layer 2 interfaces. The ideal solution would check layer 3 connectivity between two routing protocol neighbors, regardless of switched path in between. This would be useful for WAN links, where fiber signal (the laser) tends to stay active even if there’s a failure in the provider network.

BFD-Session

Turns out this is possible: Bidirectional Forwarding Detection (BFD) can do this. BFD is an open-vendor protocol (RFC 5880) that establishes a session between two layer 3 devices and periodically sends hello packets or keepalives. If the packets are no longer received, the connection is considered down. Configuration is fairly straightforward:

Router(config-if)#bfd interval 50 min_rx 50 multiplier 3

The values used above are all minimum values. The first 50 for ‘interval’ is how much time in milliseconds there is between hello packets. The ‘mix_rx’ is the expected receive rate for hello packets. Documentation isn’t clear on this and I was unable to see a difference in reaction in my tests if this parameter was changed. The ‘multiplier’ value is how many hello packets kan be missed before flagging the connection as down. The above configuration will trigger a connection issue after 150 ms. The configuration needs to be applied on the remote interface as well, but that will not yet activate BFD. It needs to be attached to a routing process on both sides before it starts to function. It will take neighbors from those routing processes to communicate with. Below I’m listing the commands for OSPF, EIGRP and BGP:

Router(config)#router ospf 1
Router(config-router)#bfd all-interfaces
Router(config-router)#exit
Router(config)#router eigrp 65000
Router(config-router)#bfd all-interfaces
Router(config-router)#exit
Router(config)#router bgp 65000
Router(config-router)#neighbor fall-over bfd
Router(config-router)#exit

This makes the routing protocols much more responsive to link failures. For MPLS, the LDP session cannot be coupled with BFD on a Cisco device, but on a Juniper it’s possible. This is not mandatory as the no frames will be sent on the link anymore as soon as the routing protocol neighborships break and the routing table (well, the FIB) is updated.

Result: fast failover, relying on a dedicated protocol rather than some out-of-date default timers:

Router#show bfd neighbor

NeighAddr                         LD/RD    RH/RS     State     Int
10.10.10.1                       1/1     Up        Up        Fa0/1

Jun 29 14:16:21.148: %OSPF-5-ADJCHG: Process 1, Nbr 10.10.10.1 on FastEthernet0/1 from FULL to DOWN, Neighbor Down: BFD node down

Not bad for a WAN line.

MPLS, part II: VRF-aware MPLS-VPN.

Where MPLS part I explains the basics of labeling packets, it’s not giving any advantage over normal routing, apart from faster table lookups. But extensions to MPLS allow for more. In this article I’ll explain MPLS-VPN, and more specifically a Virtual Private Routed Network (VPRN).

A VPRN is a routed (layer 3) network over an MPLS cloud, that is VRF-aware, or customer-aware. This means several different routing instances (VRFs, remember?) can share the same MPLS cloud. How is this achieved when there’s only a point-to-point link between routers with one IP address? After all, an interface can only be assigned to one VRF. Solution: by adding a second MPLS label to the data: the ‘outer’ label (the one closest to the layer 2 header) is used to specify the destination router, the ‘inner’ label (closest to the original layer 3 header) is used to specify to which VRF a packet belongs. The outer label workings are identical to standard MPLS: these are learned by LDP and matched with a prefix in the routing table. But for the inner label, a VRF-aware process needs to run on each router that can handle label information and propagate it to other routers. That process is Multiprotocol-BGP, or MP-BGP.

MPLS-VPN-Header

The outer label is used to route the packet through the MPLS cloud, and the last router(s) use the inner label to see to which VRF a packet belongs. Let’s look at the configuration to understand it more. First, the basic setup:

MPLS-VPN-VRF

Notice the router names, as these are often used in MPLS terminology.

  • The Customer Edge router a router that directly connects to a customer network. It’s usually the demarcation point, where the equipment governed by the MPLS provider begins. Contrary to the name, the CE itself is often managed by the provider as well.
  • The Provider Edge router is the ‘first’ router (seen from a customer site point of view) that has MPLS enabled interfaces. It’s where the labels are applied for the first time.
  • A Provider router is a router completely internal to the MPLS cloud, having only MPLS enabled interfaces.

The connection between CE and PE is a point-to-point so a /30 subnet is logical. Routing between CE and PE is done using a simple routing protocol, like RIP, OSPF, EIGRP or even static or standard BGP. The only notable part is that the PE router has to use a VRF to place the customer in. Below an example configuration:

Router-CE(config)#interface G0/1
Router-CE(config-if)#ip address 10.1.0.1 255.255.255.252
Router-CE(config-if)#exit
Router-CE(config)#router ospf 1
Router-CE(config-router)#network 10.1.0.0 0.0.0.3 area 0

Router-PE(config)#vrf definition VRF
Router-PE(config-vrf)#address-family ipv4
Router-PE(config-vrf-af)#exit
Router-PE(config-vrf)#exit
Router-PE(config)#interface G0/1
Router-PE(config-if)#vrf forwarding VRF
Router-PE(config-if)#ip address 10.1.0.2 255.255.255.252
Router-PE(config-if)#exit
Router-PE(config)#router ospf 2 vrf VRF
Router-PE(config-router)#network 10.1.0.0 0.0.0.3 area 0
*Mar  1 00:06:20.991: %OSPF-5-ADJCHG: Process 2, Nbr 10.1.0.1 on GigabitEthernet0/1 from LOADING to FULL, Loading Done

If you happen to run a router on an IOS before 15.0, the commands for the VRF change: it becomes ‘ip vrf VRF’ to define a VRF, without the need to specify an address family, as 12.x IOS versions aren’t VRF aware for IPv6. On the interface, the command is ‘ip vrf forwarding VRF’.

So far so good. Now on to activating MPLS between the PE and the P router, and making sure the routers learn the MPLS topology:

Router-PE(config)#interface G0/2
Router-PE(config-if)#mpls ip
Router-PE(config-if)#ip address 10.0.0.1 255.255.255.252
Router-PE(config-if)#ip ospf network point-to-point
Router-PE(config-if)#exit
Router-PE(config)#interface Loopback0
Router-PE(config-if)#ip address 10.0.1.1
Router-PE(config-if)#exit
Router-PE(config)#router ospf 1
Router-PE(config-router)#network 10.0.0.0 0.0.1.255 area 0

Router-P(config)#interface G0/1
Router-P(config-if)#mpls ip
Router-P(config-if)#ip address 10.0.0.2 255.255.255.252
Router-P(config-if)#ip ospf network point-to-point
Router-P(config-if)#exit
Router-P(config)#interface Loopback0
Router-P(config-if)#ip address 10.0.1.2 255.255.255.255
Router-P(config-if)#exit
Router-P(config)#router ospf 1
Router-P(config-router)#network 10.0.0.0 0.0.1.255 area 0
*Mar  1 00:02:29.023: %OSPF-5-ADJCHG: Process 1, Nbr 10.0.0.1 on GigabitEthernet0/2 from LOADING to FULL, Loading Done
*Mar  1 00:02:33.127: %LDP-5-NBRCHG: LDP Neighbor 10.0.0.1:0 (1) is UP

The ‘ip ospf network point-to-point’ is not really needed but used to reduce OSPF overhead. The loopbacks are needed for BGP later on.

Up until this point, we have MPLS in the default VRF and a separate VRF per customer for routing, but no routing of the VRFs over the MPLS. To exchange the inner labels needed to specify the VRF, MP-BGP between PE and P is configured:

Router-PE(config)#router bgp 65000
Router-PE(config-router)#neighbor 10.0.1.2 remote-as 65000
Router-PE(config-router)#neighbor 10.0.1.2 update-source Loopback0
Router-PE(config-router)#no address-family ipv4
Router-PE(config-router)#address-family vpnv4
Router-PE(config-router-af)#neighbor 10.0.1.2 activate
Router-PE(config-router-af)#neighbor 10.0.1.2 send-community extended

Router-P(config)#router bgp 65000
Router-P(config-router)#neighbor 10.0.1.1 remote-as 65000
Router-P(config-router)#neighbor 10.0.1.1 update-source Loopback0
Router-P(config-router)#no address-family ipv4
Router-P(config-router)#address-family vpnv4
Router-P(config-router-af)#neighbor 10.0.1.1 activate
Router-P(config-router-af)#neighbor 10.0.1.1 send-community extended
*Mar  1 00:11:31.879: %BGP-5-ADJCHANGE: neighbor 10.0.1.1 Up

Again some explanation. First off, BGP neighbors always need to be defined under the main process, after which they need to be activated for a specific address-family. The ‘no address-family ipv4’ command means that no conventional routing information for the default VRF will be exchanged (we already have OSPF for that). The ‘address-family vpnv4’ activates the VPRN capability, and the label exchange for VRFs. In this process, the neighbor is activated. The ‘send-community extended’ means BGP will exchange the community Path Attributes (PA), in which the label information is present. The loopbacks are used to connect to each other, not only for redundancy in case a physical interface should go down, but also because LDP does not exchange labels with another router on a connected subnet for that subnet. This means that if the directly connected interfaces are used as BGP neighbors, the BGP process can’t figure out the labeling properly.

So VRF-aware MPLS is running now, but on each PE router it needs to be specified which VRF needs to be injected in the MPLS cloud. MP-BGP does this using import and export of routing tables. For MP-BGP, a route needs to be uniquely identified, and explained to which VRF it belongs. This is done with a Route Distinguisher (RD) and Route Target (RT). Both are 64 bits, and usually in the format AS:nn, or the first 32 bits the AS number and the last 32 bits a unique chosen number.

  • RD uniquely identifies a route. Inside MP-BGP, a route is prepended with its RD, e.g. 65000:1:192.168.1.0/24. This way, if the route exists twice (in different VRFs), it’s still unique because the RD part of the prefix is different.
  • RT specifies to which VRF a route belongs. It handles the import and export of routes from a VRF to the BGP process. In its basic form, it’s the same number as the RD, and the same at all PE routers for a certain client.

Configuration of these parameters is done inside the VRF:

Router-PE(config)#vrf definition VRF
Router-PE(config-vrf)#rd 65000:1
Router-PE(config-vrf)#route-target both 65000:1

Now that the VRF can be used in the BGP process, it is imported in the process as following:

Router-PE(config)#router bgp 65000
Router-PE(config-router)#address-family ipv4 vrf VRF
Router-PE(config-router-af)#redistribute ospf 2

The redistribution, of course, needs to be mutual between OSPF and BGP, so a few more lines of configuration are needed to complete everything:

Router-PE(config)#router ospf 2 vrf VRF
Router-PE(config-router)#redistribute bgp 65000 subnets

MPLS-VPN-Forwarding

And now everything is completed: the PE router learns routes from the P router by OSPF, and redistributes it into BGP to propagate them over the MPLS cloud. From this point on, configuration is modular: on another PE router, the configuration is likewise. Adding a P router isn’t any different from the P router in this example, the processes and parameters are the same each time. Do remember that routers don’t advertise iBGP-learned routes to other iBGP peers, so the PE and P routers need to form a full mesh, unless you’re using route reflectors or confederations.

MPLS, part I: basic labeling

Multi Protocol Label Switching. Plenty of sites to explain it, yet I feel it’s something to cover on my blog, it just wouldn’t be complete without it. What is it? In the most simple form: it’s the use of labels, inserted on frames between layer 2 and 3, and use those labels to make forwarding decisions instead of the original layer 3 header.

Why is it done? Well, in the late 1990’s switching in hardware was possible, but routing (layer 3 forwarding) in hardware wasn’t. By using labels as a layer 3 IP header replacement, these labels could be used in more efficient forwarding tables, and eventually also in hardware. This meant lower latency in routed networks.

So that’s why up until this day, it is still discussed whether MPLS is a layer 2.5 or layer 3 protocol. I’m not going to make a judgement on that: by the standards it’s a layer 3 protocol (it’s above the layer 2 header, and ip-in-ip and gre over ip are not called layer 2.5 or 3.5), but on the other hand, the original use is to replace the use of the actual layer 3 header in forwarding, so 2.5 is logical in that regard.

MPLS-Header

The MPLS header is 4 bytes or 32 bits (just like 802.1q!). Routers need to agree on which labels they will use towards each other. For this, they need to set up a protocol that exchanges label information: Label Distribution Protocol (LDP). Cisco also has a proprietary Tag Distribution Protocol, but it’s no longer widely used. LDP sends Hello/Discover frames on UDP port 646 to 224.0.0.2 (all routers multicast). Once neighbor routers see each other, they build up a session on TCP port 646. The actual labels are automatically assigned and agreed on through these sessions. To enable MPLS and LDP on an interface, you’ll need one command:

Router(config-if)#mpls ip

But the labels by themselves aren’t enough. Each label represents a destination: in a basic environment, an IPv4 destination subnet. To know which label is needed for which destination, each MPLS router needs to know the possible destinations so it can assign a label to it: it needs a routing protocol. Generally speaking, it’s a better move to choose a routing protocol with awareness of the topology, so a link state protocol like OSPF or IS-IS.

MPLS-Switching

Once the routers know all routes and have set up TCP sessions for LDP, they can agree on which labels to use the reach a certain destination subnet. If the first router in the above picture receives a normal IPv4 packet, it sees in the forwarding table (the Forwarding Information Base, or FIB) that the next destination is out of a MPLS enabled interface, with a label value of 33 appended to it. If the second router receives a MPLS packet with label value 33, it can do a quick lookup using it’s Label Forwarding Information Base (LFIB) and find a next hop out of another interface with a new label value. This process continues until it reaches the last router of the MPLS region, where the LFIB next hop points to an interface without MPLS. The MPLS header is removed and the packet is forwarded as usual.

The above explains basic MPLS functionality. But it’s not everything: labels can stack, allowing for more complex configurations. In the next part, I’ll explain the use of multiple labels to create VRF-aware forwarding.

A little bit of everything.

Yes, a bit of everything, that’s what it has been lately. First, I’m upgrading my home lab switches with more recent IOS versions. The 3560 on my desk can now run EIGRP for IPv6. My 2970 gigabit switch will follow tomorrow, with a K9 IOS this time to make it accessible via SSH.

Second is that I’ve been fine tuning my knowledge of layer 2 security features, using my 3560 desk switch and a 3750 test switch as subjects. RA Guard works great, and so does DHCP Snooping. DHCP Snooping has revealed a third functionality to me, next to countering rogue DHCP servers and preventing DHCP flooding: it also detects when a MAC address that sends an INFORM cannot be present on that port according to the mac address-table. It will generate a ‘%DHCP_SNOOPING-5-DHCP_SNOOPING_MATCH_MAC_FAIL’ message and drop the frame. Seems to be a functionality related to ARP Inspection.
And ARP Inspection, on the other hand, requires some planning of your DHCP servers: if multiple are present and they all reply at the same time, the DHCP Snooping feature, on which ARP Inspection relies, sometimes picks the wrong packet to add to it’s binding table. The client device selects another packet of the ones it received to configure itself, and thus ARP Inspection thinks there’s spoofing going on. I’m still figuring out how to effectively counter that.

Third is that I’ve ordered the CCIE Routing and Switching Certification Guide, 4th Edition hardcover, so I have a lot of reading to be done soon. I have to admit that I don’t like to read ebooks on a big screen so far, and I’m reluctant to buy a reader.
Yesterday I also tried a MPLS lab for the first time, with BGP-MP in GNS3. It did take me several hours but I managed to get it running. Not bad for never having done anything MPLS related before. Still, it’s a huge topic and I’ll need to learn a lot more about that.

And last, I tested an Aruba Remote Access Point (RAP). I’ve already tested Instant Access Points. The RAP works different: once booted, it needs an internet connection. When connecting a computer (it has LAN interfaces, just like a consumer-grade router), it redirects to a setup page, where you have to enter the public IP address of a Wireless LAN Controller (WLC). It then tries to negotiate a tunnel through NAT-T over UDP port 4500 to that WLC. It works by encapsulating IPsec in a UDP header, bypassing any NAT devices that are incapable of keeping the NAT state of IPsec.
The RAP tries to authenticate itself at the WLC using his MAC address. After whitelisting it and configuring a wireless profile (which contains the list of SSIDs to send out), I had to reboot the RAP. I ended up rebooting it several times, thinking it didn’t work, but eventually it turned out my cable had broken due to all the times I plugged it in and out again. The RAP booted fine and started sending out the correct SSIDs. Initially, the wireless connection didn’t hand out an IP to me, but after five minutes, everything suddenly got an IP and started working as if there had never been a problem. Not sure why this happened, although I suspect my NAT router of dropping some of the UDP packets (which wouldn’t be the first time).

A little bit of everything indeed.