Tag Archive: Multi-Vendor


I’m going to take a risk here and challenge myself to a discussion that has been active for years among data center network engineers: eliminating Spanning-Tree Protocol from the data center.

To the readers with little data center experience: why would one want to get rid of STP? Well, a data center usually has a lot of layer 2 domains: many VLANs, many servers. That translates to many switches. Those switches require redundant uplinks to the core network, and maybe each other too. Using STP, you can achieve such redundancy, but at the cost of putting links in blocking state. Okay, MST and PVST+ allow you to assign root switches per VLAN or group of VLANs and do some load-balancing across the links, but it can still result in inefficient switching where a frame goes through more switches than it needs to. Port-channels use all links, but they are not perfect and can only be configured between two (logical) devices.

STP Network

If you’re still not convinced, take the above example. The red and blue lines are two connections. While the blue connection does take the shortest path, the red one does not. This is because STP puts some links in blocking state to prevent loops. You can change the root bridge, but choosing a root bridge that is not in the center of the network makes things worse.
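To put a number on that detour, here is a minimal Python sketch. The six-switch ring and its names are hypothetical (they are not the topology in the figure), and the tree is built with a plain breadth-first search from the root, which only approximates what STP does when all links have equal cost.

```python
from collections import deque

# Hypothetical topology: six switches in a ring, A-B-C-D-E-F-A.
LINKS = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("F", "A")]

def neighbours(links):
    """Build an adjacency map from a list of bidirectional links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def hops(links, src, dst):
    """Breadth-first search: number of links on the shortest path."""
    adj = neighbours(links)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for nxt in sorted(adj[node]):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None  # unreachable

def spanning_tree(links, root):
    """Keep only the links a BFS from the root uses; the others are 'blocked'."""
    adj = neighbours(links)
    seen = {root}
    queue = deque([root])
    tree = []
    while queue:
        node = queue.popleft()
        for nxt in sorted(adj[node]):
            if nxt not in seen:
                seen.add(nxt)
                tree.append((node, nxt))
                queue.append(nxt)
    return tree

tree = spanning_tree(LINKS, "A")
print(hops(LINKS, "D", "E"))  # 1: D and E are directly connected
print(hops(tree, "D", "E"))   # 5: the D-E link is blocked, so traffic goes via the root
```

In this toy ring the two neighbouring switches D and E end up five hops apart, because the only loop-free tree rooted at A has to block the link between them.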

Outside of the data center, there is less need for improved switching: smaller networks like SOHO or medium-sized companies don’t have large layer 2 topologies so STP is sufficient, and an ISP, while having a large infrastructure, often uses layer 3 protocols for redundancy because the number of hosts doesn’t matter, only routing data does.

There are several technologies being developed to improve switching and most are based on the proposed TRILL standard: TRansparent Interconnect of Lots of Links. Switches use IS-IS, a link-state protocol that can run directly on layer 2. In this implementation it carries MAC address locations rather than IPv4 or IPv6 routes.

TRILL header

Frames are encapsulated with a TRILL header (which has its own hop count to prevent switching loops) and are sent to the switch that is closest to the destination MAC address. If the entire layer 2 topology runs TRILL, that means the frame is sent to the switch that is directly connected to the destination MAC address. The TRILL header is then removed to allow normal transportation of frames again. The entire frame is transported as payload behind the TRILL header; even the 802.1q VLAN tag is left unchanged. Basically, it is like ‘routing frames’. For broadcasts and multicasts, a multicast tree is calculated to allow the copying of frames without causing loops. Reverse-Path Forwarding (RPF) checks are also included to further reduce potential switching loops. Also, equal-cost multipath is supported, so load-balancing over multiple links within the same VLAN is possible.
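The encapsulate, forward and decapsulate steps can be sketched as a toy model in Python. This is not the real wire format: the class, function names and switch names are hypothetical, switches are identified by plain strings, and the real header carries more fields than the three modelled here.

```python
from dataclasses import dataclass

@dataclass
class TrillFrame:
    ingress: str     # switch where the frame entered the TRILL network
    egress: str      # switch closest to the destination MAC address
    hop_count: int   # decremented by every transit switch
    payload: bytes   # original Ethernet frame, VLAN tag included, untouched

def encapsulate(frame, ingress, egress, hop_count=16):
    """Ingress switch: wrap the original frame in a TRILL-style header."""
    return TrillFrame(ingress, egress, hop_count, frame)

def forward(tf):
    """Transit switch: drop the frame once the hop count is used up."""
    if tf.hop_count == 0:
        return None  # loop protection: the frame dies here
    return TrillFrame(tf.ingress, tf.egress, tf.hop_count - 1, tf.payload)

def decapsulate(tf):
    """Egress switch: strip the header and hand back the original frame."""
    return tf.payload

original = b"\x00\x11\x22\x33\x44\x55 example frame"
tf = encapsulate(original, ingress="S1", egress="S4", hop_count=2)
tf = forward(tf)   # hop count 2 -> 1
tf = forward(tf)   # hop count 1 -> 0
assert decapsulate(tf) == original
assert forward(tf) is None  # a looping frame would be dropped at this point
```

The point of the sketch is that the payload is never modified in transit, while the hop count bounds how long a frame can circulate if something goes wrong.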

While TRILL is a proposed IETF standard, 802.1aq SPB or Shortest Path Bridging is an IEEE standard. It works very similarly and also uses IS-IS with multicast trees for broadcasts and unknown unicasts, as well as RPF checks. The major difference is in the encapsulation: either 802.1ah MAC-in-MAC or 802.1ad QinQ is used. MAC-in-MAC encapsulates the frame in another ‘frame’ with source and destination MAC addresses of switches, while QinQ manipulates VLAN tags to define an optimal path. The actual workings are quite complex but can be studied on Wikipedia.

Of course, new protocols in the networking world always come in vendor-specific flavours too, and naturally Cisco is part of it. Cisco uses FabricPath, which is a custom TRILL implementation that can already be configured on the Nexus 5500 and 7000 series. Cisco also claims that all switches capable of FabricPath today will be able to support TRILL too once it’s an official standard.

Brocade, originally a SAN vendor with lots of products in the data center network market now, has VCS Fabric Technology, which also is a custom TRILL implementation. While some suggest it does not scale that well compared to other products, it does come with some auto-configuration features that make it easy to deploy.

Juniper also has a vendor-specific protocol to move away from STP: QFabric. It’s not TRILL-based, making it less likely to allow multi-vendor support on existing hardware in the future. A QFabric works like one giant managed switch: it’s like a large chassis made up of multiple physical devices, in some aspects like the Nexus 2000 FEXes, which act as line cards for a parent 5000 or 7000. Multiple QFabrics can be connected and appear to each other as one switch, allowing it to scale out very well. For an in-depth explanation, see EtherealMind, who has explained it very well.

Which one is best? Time will tell: all these technologies are relatively new (TRILL isn’t even official yet) and not widely deployed. It also depends on what infrastructure you have in mind, the budget, and vendor preference.


Broadcast pings: do they work?

I’ve often seen discussions of ‘how to find devices in the network using pings’. A ping sweep is easiest, but some people claim that a simple ping to the subnet broadcast address will make all devices respond. I decided to test this out for myself and see what happened.

I created a subnet with as many different devices as I could get my hands on at the time. The devices were the following:
– Windows 7 physical machine
– Windows XP physical machine
– Windows Server 2008 R2 virtual machine
– Fedora 15 virtual machine
– Vyatta 6.1 virtual machine
– Cisco 2611 router
– Cisco VG200 Voice Gateway
– Two Cisco 3560 Layer 3 switches
– A Linksys router with DD-WRT firmware
– The ISP-provided gateway: a Motorola with NAT
– Cisco 7912 IP Phone
– And finally, one iPod, for a total of 13 devices having an IP address.

I ran Wireshark on the physical machines (Windows 7 and Windows XP) from which I was going to originate the pings. I also did pings from some of the Cisco devices. Unlike Windows, IOS will list all replies received when sending to a broadcast address. All devices received an IP address in the 192.168.0.0/24 range, and the pings were sent to 192.168.0.255.

I did several tests and also changed IP addresses several times between tests to ensure ARPs were sent around the network, which made it easier to follow the captures on Wireshark. The results showed a clear separation between network devices and end devices: the Cisco gear (with the exception of the IP Phone) would respond to broadcast pings, as well as the DD-WRT. All other devices wouldn’t respond to broadcast pings. The Vyatta and the ISP gateway are also network devices, but I have no control over the gateway, and the Vyatta is actually nothing more than a stripped-down Linux and thus may react as an end device in this regard. To be sure I didn’t make a mistake, I did unicast pings after this to the addresses that didn’t respond, and they all reacted fine. No firewall issues here.

There’s still a difference between a ping sweep and a broadcast ping, even if just done towards network devices: a ping sweep will trigger ARP requests for each address, to which devices will respond if they have the address, whether ICMP pings are blocked or not. So after a ping sweep, just doing ‘arp -a’ in the Windows command line reveals all managed network devices.
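To make the difference concrete: a sweep means one ping per usable host address, while a broadcast ping is a single packet to the last address of the subnet. The following is a minimal sketch using Python’s standard `ipaddress` module; the function name is my own, and it only computes the targets without sending anything.

```python
import ipaddress

def sweep_targets(cidr):
    """Return (list of usable host addresses, broadcast address) for a subnet."""
    net = ipaddress.ip_network(cidr)
    hosts = [str(host) for host in net.hosts()]
    return hosts, str(net.broadcast_address)

hosts, broadcast = sweep_targets("192.168.0.0/24")
print(len(hosts))            # 254 unicast pings for a sweep
print(hosts[0], hosts[-1])   # 192.168.0.1 192.168.0.254
print(broadcast)             # 192.168.0.255, the single broadcast ping target
```

Each address in `hosts` could then be handed to the system’s ping command; every one of those pings triggers its own ARP request, which is what fills the ARP table.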

That answers one question, but what about IPv6? Are things different there? A ping sweep is nearly impossible. A common /64 subnet holds 1.8×10^19 addresses. With EUI-64 (see a perfect explanation about EUI-64 on Packetlife.net) 16 of the 64 host bits are fixed, so you can exclude most addresses, leaving ‘just’ 2.8×10^14 possible combinations. Multicast pings would be the only feasible option to scan a subnet. Note that I say multicast, as IPv6 has no concept of broadcast. Since most of my devices do not have IPv6 support for the moment (I’m planning on upgrading them in the future), I’m left with the Vyatta, Windows 7, Windows Server, and Fedora for this test. But the test results are similar to IPv4: a ping to FF02::1 (‘all nodes’ multicast) does not give a single reply, but a ping to FF02::2 (‘all routers’ multicast) gives a reply from the Vyatta, which is indeed configured for IPv6 routing. So with limited testing I can conclude for now that it’s also just network devices that respond to multicast pings in IPv6.
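The two numbers above are simply 2^64 and 2^48: EUI-64 inserts the fixed bytes FF:FE in the middle of the 48-bit MAC address, so only 48 bits of the interface identifier can vary. Here is a small Python sketch of both the arithmetic and the EUI-64 derivation; the function name and the sample MAC address are arbitrary choices for illustration.

```python
def eui64_interface_id(mac):
    """Derive the modified EUI-64 interface identifier from a 48-bit MAC address."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError("expected a MAC address with six octets")
    octets[0] ^= 0x02                                 # flip the universal/local bit
    full = octets[:3] + [0xFF, 0xFE] + octets[3:]     # insert FF:FE in the middle
    groups = [(full[i] << 8) | full[i + 1] for i in range(0, 8, 2)]
    return ":".join(f"{group:04x}" for group in groups)

all_addresses = 2 ** 64     # every possible host part in a /64
eui64_addresses = 2 ** 48   # only the 48 MAC bits vary with EUI-64

print(f"{all_addresses:.1e}")     # 1.8e+19
print(f"{eui64_addresses:.1e}")   # 2.8e+14
print(eui64_interface_id("00:0c:29:fd:d5:23"))   # 020c:29ff:fefd:d523
```

Even the reduced space is far too large to sweep: at a million pings per second, 2^48 addresses would still take almost nine years.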

To return to the original claim that a broadcast ping will reveal all devices in a given subnet: the conclusion is that this only goes for network devices, and not for end devices. It’s probably not an effective way to quickly map a subnet in everyday life.

Layer 2 device discovery methods.

Readers pursuing a CCNA (or higher) certification are most likely familiar with CDP: Cisco Discovery Protocol. CDP runs directly on layer 2 (without the use of IP addresses) in the network and will map neighbouring devices.

CDP works great on a Cisco-only network to find out the network topology, and makes negotiating links with Cisco IP Phones easy. Despite the name, it is not limited to Cisco equipment in practice: tools exist for Linux (and if you insist on trying it, here’s a nice guide) and several of Cisco’s competitors have IP Phones supporting CDP.

There are many similar vendor-specific protocols, for example Extreme Discovery Protocol by Extreme Networks (of which I found something on the Wireshark wiki), but in a multi-vendor environment, the best solution is LLDP (802.1AB): Link Layer Discovery Protocol, the vendor-neutral device discovery protocol. Cisco switches have CDP enabled by default and LLDP disabled, but it is present. The commands for both are very similar:

Switch(config)#cdp run
Switch(config)#exit
Switch#show cdp neighbors

Switch(config)#lldp run
Switch(config)#exit
Switch#show lldp neighbors

Naturally, for LLDP there are also some Linux daemons, a list of which can be found here (thanks to Wikipedia for the link). But what about Windows? That seems to be a different story: while software does exist, it’s quite simple and a full version is not free. For CDP, the best thing to use is Tallsoft’s CDP client. The free version gives the company’s website as the device name in CDP, but it’s the only CDP software I got to work on Windows.

For LLDP, the only software I got to work eventually was the haneWIN LLDP client. It’s a 30 day trial but works nicely, as illustrated below.
LLDP

Note that I’m connected by Ethernet here, not wireless. Though my wireless gateway did forward CDP frames (as do all CDP unaware devices), it did not do so for LLDP frames, despite not advertising LLDP itself. Since the wireless gateway is provided by my ISP, I have no further control over it.
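One plausible explanation for this behaviour, which I could not verify on the gateway itself, lies in the destination MAC addresses. CDP frames go to 01:00:0c:cc:cc:cc, which a CDP-unaware bridge treats as ordinary multicast and floods. LLDP frames go to 01:80:c2:00:00:0e, which falls in the block 01:80:c2:00:00:00 through 01:80:c2:00:00:0f that standards-compliant bridges are not supposed to forward. The Python sketch below just checks which addresses fall in that block; the function name is my own.

```python
CDP_DST = "01:00:0c:cc:cc:cc"    # destination MAC address used by CDP
LLDP_DST = "01:80:c2:00:00:0e"   # 'nearest bridge' destination MAC used by LLDP

def is_bridge_filtered(mac):
    """True if the MAC is in the reserved block 01:80:c2:00:00:00 - 01:80:c2:00:00:0f."""
    value = int(mac.replace(":", "").replace("-", ""), 16)
    return 0x0180C2000000 <= value <= 0x0180C200000F

print(is_bridge_filtered(CDP_DST))    # False: flooded like any other multicast
print(is_bridge_filtered(LLDP_DST))   # True: should stay on the local link
```

If this is indeed what the gateway does, the result is expected rather than a bug: LLDP is designed to describe only the directly attached neighbour.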

Finally, there’s one more protocol that’s commonly used to map devices on layer 2: LLTD or Link Layer Topology Discovery, a Windows proprietary protocol. It’s activated by default on Windows Vista and Windows 7, and it’s responsible for giving the visual representation in the Windows Network & Sharing Center in the Control Panel. But since it’s only available on those two Windows versions, the image will not display any Linux or Apple computers, or older Windows versions like Windows XP. Also, it has no idea what the rest of the network looks like, so it makes assumptions based on what it believes the network should normally look like. In my case, my 3560 switch is nowhere to be found, but another switch is listed connected to a wireless access point, which is not there (though I suspect it is simply a separate representation of the built-in switchports in my wireless gateway).

So, the conclusion: a layer 2 device discovery protocol that works on all devices does exist. I was able to get CDP running on all devices, but LLDP is a better choice given the broader support in network devices. Not surprising, since this is why LLDP was created. What is surprising, though, is the generally poor support for these protocols in Windows. Just one piece of software per protocol, a small monopolistic market. It seems like there never was a need for large-scale deployments of these protocols on Windows, so the market never fully developed.

VRRP between Cisco and Vyatta.

I already mentioned in an earlier post that I was doing some experiments with Virtual Router Redundancy Protocol on routers from different vendors. For those of you not familiar with VRRP: it’s a protocol that allows multiple routers to share the same IP address, which then can be used as the default gateway for end devices. This gives you some redundancy in case a router goes down. VRRP is the IETF standard counterpart of an earlier protocol, HSRP, which is available on Cisco devices only. For more info, Wikipedia is your friend.

I already managed to get VRRP running on both GNS3 and real Cisco routers, but since it’s supposed to be a standard, why not try it in a multivendor environment? My favorite non-Cisco router is Vyatta: it’s a stripped-down Linux with nothing left but the kernel and network-related packages already installed. The command line feels somewhat like Cisco IOS. Since it can run on almost any x86 hardware, you can virtualize it too, so it’s an easy solution in my lab. The basic version is free.

I followed the guide I found on openmaniak.com and got it running in no time. I used the following configuration: 192.168.0.2 for the Vyatta, 192.168.0.3 for the Cisco router (a 2611 running 12.3 IOS), and 192.168.0.5 as the virtual IP address.

I started and configured my Vyatta first. Here you see it sending VRRP multicasts to 224.0.0.18, to announce to the other router(s) that it’s currently the master and will handle all packets sent to 192.168.0.5, the virtual address.

VRRP Vyatta

Next, I booted and configured the Cisco router. Note that both configurations used the ‘preempt’ command, which means that if a ‘better’ router is present in the subnet, it will immediately assume the master role, instead of waiting until the current master (the Vyatta) goes down. The ‘better’ router here means the one with the higher priority, or in case of a tie, the one with the highest IP address.
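The election rule is easy to express in a few lines of Python. This is only a sketch of the comparison, not of the protocol’s timers or state machine; the function name is my own, and the priority of 100 for both routers is an assumption (it is the VRRP default, and I did not list the configured values).

```python
import ipaddress

def elect_master(routers):
    """Pick the master from (name, priority, ip) tuples.

    Highest priority wins; on a tie, the highest IP address wins.
    """
    best = max(routers, key=lambda r: (r[1], int(ipaddress.ip_address(r[2]))))
    return best[0]

# Assumed priorities: both routers left at the default of 100.
lab = [("Vyatta", 100, "192.168.0.2"), ("Cisco", 100, "192.168.0.3")]
print(elect_master(lab))   # Cisco: priorities tie, 192.168.0.3 > 192.168.0.2

# Raising the Vyatta's priority would reverse the outcome.
print(elect_master([("Vyatta", 150, "192.168.0.2"), ("Cisco", 100, "192.168.0.3")]))   # Vyatta
```

Comparing the addresses as integers rather than strings matters: as text, "192.168.0.10" would sort below "192.168.0.9".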

Since the Cisco router has a higher IP address, it takes the master role after a few seconds:

After the Cisco router becomes the master, it will handle any packet destined for 192.168.0.5. Should the router fail, for example by me unplugging the ethernet cable, the Vyatta router will take the master role again and the address 192.168.0.5 will stay reachable.

So far so good. But I did come across a small problem in my tests. If you look at both images closely, you’ll notice that the Cisco router is using a source MAC address of 00:00:5e:00:01:01. This is correct, because this is the MAC address that must be used according to the RFC. (The last ’01’ is the VRRP group 1. Had I configured it with VRRP group 5, it would be ’05’.)
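The mapping from group number to virtual MAC address can be sketched as a one-line Python function (the function name is my own). It covers IPv4 VRRP only, where the fixed prefix is 00:00:5e:00:01 and the last octet is the group number in hexadecimal.

```python
def vrrp_virtual_mac(group):
    """Return the IPv4 VRRP virtual MAC address for a group number (1-255)."""
    if not 1 <= group <= 255:
        raise ValueError("VRRP group must be between 1 and 255")
    return f"00:00:5e:00:01:{group:02x}"

print(vrrp_virtual_mac(1))    # 00:00:5e:00:01:01
print(vrrp_virtual_mac(5))    # 00:00:5e:00:01:05
print(vrrp_virtual_mac(10))   # 00:00:5e:00:01:0a (the last octet is hexadecimal)
```

Because every router in the group derives the same address, end devices keep a valid ARP entry for the virtual IP address no matter which router is currently the master.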

The Vyatta router does not use this MAC address but instead uses its own (00:0c:29:fd:d5:23, a VMware virtual NIC). I’ve done some research around the web and could not find anything conclusive, but I’ve heard of Linux versions having trouble using multiple source MAC addresses, so this may be the cause. It does create a problem though, because if a router fails in this configuration, the end devices are left with the wrong ARP information, making the 192.168.0.5 address unreachable after all. It’s possible to solve this issue by sending a Gratuitous ARP packet in case of a failure, but I didn’t notice such a packet in my tests, and it would still make things more complex than they are supposed to be. At this moment I am uncertain if VRRP works well in Vyatta. But that aside, I did learn a lot today.