When dealing with large layer 2 broadcast domains, loop prevention becomes one of your main concerns. Loops cause broadcast storms, and broadcast storms bring down networks.

But loop prevention in a data center is not simple. For one, you’re not in control of all devices capable of forming a loop. You’re not even in control of every device capable of generating spanning-tree BPDUs. And then there are vSwitches, which are switches but don’t use spanning-tree (an in-depth discussion of this can be found on Ivan Pepelnjak’s blog, here and here). To finish it off, Nexus 2000 Fabric Extenders have BPDU Guard hardcoded, although I’ve been informed this might change.

Despite the above gaps in uniformity for spanning-tree, it’s still considered mandatory to run it on the network. Without it, things would be much worse and a proper spanning-tree design with an appropriate type (MST, RPVST) can prevent most issues. BPDU Guard towards (non-hypervisor) servers is a second rule of thumb.

But BPDU Guard only activates upon receiving a BPDU, and loops don’t always carry BPDUs. This is where storm-control comes into the picture: it counts the broadcast, multicast and unknown unicast frames received over a 1 second period. If the count reaches a configured threshold, any further offending frames are dropped for the rest of that 1 second interval. Optionally, the port can be err-disabled, and SNMP traps can be sent. Configuration is as follows:

Switch(config-if)#storm-control broadcast level pps 100
Switch(config-if)#storm-control multicast level 2.00
Switch(config-if)#storm-control unicast level bps 1m
Switch(config-if)#storm-control action shutdown
Switch(config-if)#storm-control action trap

This is just a sample config: broadcast, multicast and unknown unicast (the ‘unicast’ keyword) can be configured separately. The threshold can be expressed as a percentage of total bandwidth (2.00), in packets per second (pps 100) or in bits per second (bps 1m, or 1,000,000). There is no real guideline for good values, because it depends on your network. The better you know which traffic flows through it, the better you can set a good value. It’s not a perfect solution, but it’s a measure of last resort for when a loop does occur. The ‘shutdown’ action should only be configured on the same links where you would want BPDU Guard.
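To make the counting behaviour above concrete, here is a minimal Python sketch of a per-port storm-control counter. It is a toy model, not how any switch implements it: the names `StormControl`, `offer` and `percent_to_bps` are made up for illustration, and it assumes fixed 1 second buckets and a threshold in packets per second.

```python
from dataclasses import dataclass


def percent_to_bps(link_bps: int, percent: float) -> float:
    """Convert a percentage-of-bandwidth threshold to bits per second."""
    return link_bps * percent / 100.0


@dataclass
class StormControl:
    """Toy counter for one traffic class (e.g. broadcast) on one port."""
    threshold_pps: int          # frames allowed per 1 second interval
    shutdown: bool = False      # models 'storm-control action shutdown'
    err_disabled: bool = False  # set once the shutdown action has fired
    _interval: int = -1         # index of the current 1 second interval
    _count: int = 0             # frames seen in the current interval

    def offer(self, timestamp: float) -> bool:
        """Return True if the frame is forwarded, False if it is dropped."""
        if self.err_disabled:
            return False
        interval = int(timestamp)        # assumed: fixed 1 second buckets
        if interval != self._interval:   # new interval: reset the counter
            self._interval = interval
            self._count = 0
        self._count += 1
        if self._count > self.threshold_pps:
            if self.shutdown:
                self.err_disabled = True  # stays down until re-enabled
            return False
        return True


# 500 broadcast frames arriving within half a second, threshold 100 pps
port = StormControl(threshold_pps=100)
forwarded = sum(port.offer(0.001 * i) for i in range(500))
print(forwarded)                             # 100
print(port.offer(1.2))                       # True: next interval, counter reset
print(percent_to_bps(1_000_000_000, 2.00))   # 20000000.0 (2% of 1 Gbps)
```

Without the shutdown action the port recovers by itself in the next interval, which is why that action is best kept for edge ports.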

So are inter-switch links safe then? Unfortunately not, due to the architecture of switches that forward in hardware.

[Figure: data plane and control plane]

The hardware forwarding uses ASICs that only know how to forward. This is referred to as the data plane. The CPU does not do any forwarding (well, it can, but it shouldn’t), and is referred to as the control plane. It monitors the switch health, processes BPDUs and CDP frames, and on a layer 3 switch also handles the routing protocols and the routing table (but again, not the actual layer 3 forwarding). If the control plane takes a hit (bug, exploit, sometimes hardware), it will crash. But chances are the data plane will not. And without a control plane to limit it, the data plane will happily start forwarding any frame entering the switch out of every other port, regardless of VLAN, CAM table or layer 3 headers. Even ports configured as layer 3 will start flooding.
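The difference between normal forwarding and the failure mode described above can be sketched in a few lines of Python. This is purely a toy model of the behaviour as described in this post, not of any real ASIC; `ToySwitch`, `control_plane_up` and the port names are hypothetical.

```python
class ToySwitch:
    """Toy layer 2 switch: VLAN-aware while healthy, floods when not."""

    def __init__(self, port_vlans):
        self.port_vlans = dict(port_vlans)  # port name -> VLAN id
        self.cam = {}                       # (vlan, mac) -> port
        self.control_plane_up = True

    def forward(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        others = [p for p in self.port_vlans if p != in_port]
        if not self.control_plane_up:
            # Failure mode from the text: out of every other port,
            # regardless of VLAN or CAM table.
            return others
        vlan = self.port_vlans[in_port]
        self.cam[(vlan, src_mac)] = in_port       # learn the source
        out = self.cam.get((vlan, dst_mac))
        if out is not None:
            return [] if out == in_port else [out]  # known unicast
        # Unknown destination: flood, but only inside the VLAN
        return [p for p in others if self.port_vlans[p] == vlan]


sw = ToySwitch({"Gi0/1": 10, "Gi0/2": 10, "Gi0/3": 20})
print(sw.forward("Gi0/2", "bb", "aa"))  # ['Gi0/1']  flooded within VLAN 10
print(sw.forward("Gi0/1", "aa", "bb"))  # ['Gi0/2']  known unicast
sw.control_plane_up = False
print(sw.forward("Gi0/1", "aa", "bb"))  # ['Gi0/2', 'Gi0/3']  VLAN ignored
```

In the last call the frame leaks into VLAN 20, and with a second switch in the same state that is all a loop needs.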

You might notice ICMP replies mentioned in the picture. This can be an attack vector on a layer 3 switch: generating many frames with low TTL will cause it to generate ICMP TTL Expired messages. Combined with ping replies and incoming packets for non-existing hosts in connected subnets, which will generate ARP requests, this can burden a switch CPU beyond normal expectations.

When this happens, storm control with SNMP traps is one of your best options for quickly finding and limiting the problem on a large layer 2 domain. Another option, in the initial design, is using actual routers for layer 3 boundaries where possible: their interfaces are hardcoded layer 3 and will not start flooding when the control plane crashes.
