Category: Storage


If you’ve had little or no real-world experience inside a data center or large switched infrastructure, the Cisco Nexus series of switches is something you probably haven’t encountered so far. They are rather different from ‘standard’ Cisco Catalyst switches like the 3560/2960/3750 series, which are most commonly used these days in certification training and most business environments. Since I’ve been able to get my hands on them, I’ll share my experiences with you. I’ll be focusing on the 5000 and 2000 series, as these show a clear design difference from the Catalyst series.

Nexus

A Nexus 2000 is also called a fabric extender, or FEX. The idea is that they extend the switching fabric of a Nexus 5000 or 7000 (the 7000 is a chassis). A FEX has no management interface, but instead has to be connected to a Nexus 5000 or 7000, after which it becomes a logical part of that parent switch. A 32-port Nexus 5000 with ten 48-port Nexus 2000s attached will list a whopping 512 ports under ‘show ip interface brief’, not counting any VLAN interfaces. All interfaces will show as ‘ethernet’, no matter their link speed, so no guessing ‘was it f0/1 or g0/1’ here.

The connection from FEX to parent switch is done via an SFP module with fiber, or a Cisco twinax cable, which is an Ethernet-like copper cable with the SFPs already attached on both sides. Depending on the FEX model, two or four SFP uplinks are possible, just like most Catalyst switches.
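
To give a rough idea of what that association looks like on the parent switch, here’s a minimal sketch, assuming a FEX numbered 100 connected on port ethernet 1/17 (both numbers are just examples):

switch(config)#feature fex
switch(config)#interface ethernet 1/17
switch(config-if)#switchport mode fex-fabric
switch(config-if)#fex associate 100

Once the FEX comes online, its ports show up on the parent as ethernet 100/1/1 and so on: FEX number, module, port. That’s also the numbering you’ll see in the SPAN example further down, where e111/1/20 is port 20 on FEX 111.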

Twinax

The 5000 series has 32 to 96 1/10 Gbps SFP ports. These ports cannot negotiate any lower than 1 Gbps, so 10 or 100 Mbps is not an option. As the parent switch, it is supposed to provide uplinks to other parts of the network, or uplinks to the FEXes, so high bandwidth is needed. The actual links to the servers are meant to be patched on the FEXes, which have 24 to 48 100/1000 Mbps ports. 10 Mbps is not possible here. (Frankly, who still uses that?)

An interesting feature is that you can use two 5000s or 7000s together as one logical switch when setting up port aggregation, as long as they have a direct connection between themselves for control. By using an uplink to another switch or FEX on one Nexus and a second uplink on the second Nexus, you can create an EtherChannel without any of the links getting blocked by STP and without causing a loop. The link between the two Nexus switches keeps the information synchronized. This is called a virtual Port Channel, or vPC.
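
Here’s a minimal sketch of what that can look like on one of the two Nexus switches; the domain, keepalive addresses, port-channel and vPC numbers are made up for the example, and the second Nexus needs a mirrored configuration:

switch(config)#feature vpc
switch(config)#feature lacp
switch(config)#vpc domain 10
switch(config-vpc-domain)#peer-keepalive destination 10.0.0.2 source 10.0.0.1
switch(config-vpc-domain)#exit
switch(config)#interface port-channel 1
switch(config-if)#switchport mode trunk
switch(config-if)#vpc peer-link
switch(config-if)#exit
switch(config)#interface port-channel 20
switch(config-if)#switchport mode trunk
switch(config-if)#vpc 20

Port-channel 1 is the direct link between the two Nexus switches that carries the synchronization, while port-channel 20 is the actual vPC towards the downstream switch or FEX; the vPC number has to match on both peers. The peer-keepalive is typically run over the mgmt0 interfaces.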

Also, they don’t run the classic Cisco IOS, but NX-OS. While NX-OS originally evolved from SAN-OS, the operating system of Cisco’s MDS storage switches, the basic commands are the same as in IOS. Some things are somewhat different, e.g. SPAN or port mirroring requires additional commands. Just as a reminder, a SPAN port is configured on a Catalyst switch like this:

switch(config)#monitor session 1 source interface g0/4
switch(config)#monitor session 1 destination interface g0/5

The above will copy all traffic from interface g0/4 to g0/5. If you connect a capturing device to port g0/5 (e.g. a computer running Wireshark), you can see the traffic. A Nexus works differently:

switch(config)#monitor session 1 source interface e111/1/20
switch(config)#monitor session 1 destination interface e1/5
switch(config)#interface e1/5
switch(config-if)#switchport monitor
switch(config-if)#exit
switch(config)#no monitor session 1 shut

By explicitly configuring the switchport as a monitoring interface, there’s less confusion: on a Catalyst, the monitoring switchport can still hold an entirely different configuration, which simply stops taking effect as soon as the port becomes a SPAN destination. Also, the monitor session doesn’t start by default, hence the last command. Since you’re working in a multi-gigabit environment, this is an understandable choice.

Using NX-OS has another reason too, of course: the Nexus series can run FCoE natively (more on Fibre Channel and FCoE below). By combining this with servers that have converged network adapters (CNAs) and connecting the Nexus to a SAN, it’s possible to run both storage and IP-based communication over the same physical network.

These are the main reasons Cisco is having success with these lines of switches: they’re highly redundant (vPC, dual power supplies, dual fans,…), they provide both LAN and SAN functionality, and they have high throughput rates (1/10 Gbps, sub-millisecond switching from server through FEX to parent switch). They are mostly used in environments that need large layer 2 domains, like data centers. I’ve also heard of implementations as an access layer design towards many end users, which would work and provide great redundancy, but since these switches weren’t designed with that in mind, they lack the PoE capabilities often needed for IP phones and access points.

Ethernet (and TCP/IP) is the most widely used technology these days for communication between devices. For storage, however, the dominant technology in a data center is often Fibre Channel (abbreviated to FC).

Fibre Channel
FC is a network standard that allows hosts (servers) to communicate with storage devices. By itself, it’s completely separate from Ethernet: a storage network switch is not the same as an Ethernet network switch. There is one notable exception to this rule: the Cisco Nexus 5548UP and 5596UP have switchports that can run in either Ethernet mode or Fibre Channel mode, but not both at the same time. There’s also no communication possible between the two types of ports, as the protocols are incompatible.

One name you’ll hear when talking about storage networking is Brocade: the most prominent vendor of storage networking hardware. Also, a bit of information about the name Fibre Channel: originally, FC’s only transport medium was fiber, but these days twisted pair copper wire is also possible. That’s the opposite of Ethernet, which originally ran only on copper wires and now can be used on fiber as well.

Reliability
Another important difference to remember between Ethernet and FC is reliability: FC was designed with perfect reliability in mind. Not a single frame may be lost, and frames must be delivered in order, just like they would be from a locally attached storage device. FC switches even signal congestion to neighboring devices, so those devices stop sending frames instead of having frames dropped. This is in contrast to Ethernet, which will simply start dropping frames when congested, relying on upper layers (like TCP) to make sure everything keeps working.

SAN versus NAS
Some people think a Storage Area Network, or SAN, is similar to a Network Attached Storage device, or NAS. This is not true: a NAS provides access to files, while a SAN provides access to raw block storage. SAN storage also doesn’t show up as a network drive in the operating system but as a locally attached drive, and it is treated that way too.

Layers and command set
Wikipedia mentions that Fibre Channel does not follow the OSI model. That’s true, but not completely: FC can still be divided into layers of its own. The biggest difference is that layers 5 to 7 of the OSI model are missing, as FC is raw storage data transport and not tied to a particular application. I’ll quote the layers from Wikipedia:

  • FC4: Protocol Mapping layer for protocols such as SCSI.
  • FC3: Common Services layer, a thin layer for encryption or RAID redundancy algorithms.
  • FC2: Network layer, consists of the core of Fibre Channel, and defines the main protocols.
  • FC1: Data Link layer, which implements line coding of signals.
  • FC0: PHY, includes cabling, connectors etc.

On FC4, SCSI or Small Computer System Interface is commonly used. SCSI is a command set to communicate with storage devices. It’s the same command set used between a computer and a locally attached SCSI drive (like a SAS drive). FC2 is the network layer and somewhat relates to OSI layers 2 and 3. A SAN is one flat network, best compared to a layer 2 subnet. There are discussions about whether FC is switching or routing, but it’s a bit of both really. Personally, I use the term ‘Fibre Channel switching’ because it’s a flat network. On the other hand, FSPF, or Fibre Channel Shortest Path First, is commonly referred to as a routing protocol. Also, FC doesn’t use MAC addresses but World Wide Names (WWNs) to identify source and destination nodes, which are hexadecimal numbers just like MACs.

Bandwidth
FC speeds aren’t in multiples of 10 like Ethernet, but double with each generation: there’s 1GFC, 2GFC, 4GFC, 8GFC and 16GFC. The ‘G’ stands for gigabit, as you need high bandwidth for storage. An FC adapter (a host bus adapter, or HBA) is not like an Ethernet NIC: it doesn’t have an IP address, and it will not be treated as a NIC by the operating system, but more like a storage device (which it is).

Fibre Channel over Ethernet
When data centers started to grow, this created scalability issues when implementing redundancy: redundancy meant two Ethernet NICs, but also two FC adapters for storage, giving a total of four connections per server. For this reason, Fibre Channel over Ethernet (or FCoE) was developed. FCoE uses Ethernet frames (up to OSI layer 2) and puts FC on top of that (from FC2 upwards). The result is a converged network that can transport both device communication and storage blocks.

For this to work you’ll need a Converged Network Adapter (CNA) and switches capable of FCoE. It’s theoretically possible to use a normal NIC and let software calculate the FCoE frames, but few, if any, of these implementations exist. Also, I haven’t found any sources claiming a standard Ethernet switch will or will not work. Most likely they’ll work, but given the unreliable nature of Ethernet, you’ll run into serious problems once congestion occurs, as SCSI does not recover well from lost or out-of-order frames (most likely your operating system will crash or get corrupted). An FCoE-enabled switch, like the Cisco Nexus series, provides lossless Ethernet techniques to handle this, and can use FC signalling to prevent congestion.
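
To make that a bit more concrete, here’s a minimal sketch of how FC gets mapped onto an Ethernet port on a Nexus: a VSAN is mapped to a dedicated FCoE VLAN, and a virtual Fibre Channel (vfc) interface is bound to the Ethernet interface facing the CNA. The VLAN, VSAN and interface numbers are made up for the example, and the Ethernet interface still needs to trunk the FCoE VLAN:

switch(config)#feature fcoe
switch(config)#vsan database
switch(config-vsan-db)#vsan 100
switch(config-vsan-db)#exit
switch(config)#vlan 100
switch(config-vlan)#fcoe vsan 100
switch(config-vlan)#exit
switch(config)#interface vfc 1
switch(config-if)#bind interface ethernet 1/1
switch(config-if)#no shutdown
switch(config-if)#exit
switch(config)#vsan database
switch(config-vsan-db)#vsan 100 interface vfc 1

From the storage side, the vfc interface behaves like a regular FC port, while the underlying Ethernet interface keeps carrying the server’s normal IP traffic.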

Fibre Channel over IP
So that’s FCoE, but as it doesn’t use IP, it’s still a flat network. For WAN links there are other standards that can span multiple hops and don’t have the distance limitations of native FC: it’s possible to run FC on top of IP, using FCIP or iFCP (Internet Fibre Channel Protocol). Neither seems to be commonly used.

iSCSI
One of the more widely used techniques for converged storage networking is iSCSI, which runs SCSI on top of TCP (using ports 860 and 3260). This doesn’t involve any FC formatting in any part of the frame, so it carries less overhead than FCIP and iFCP, which also run on top of TCP but still require FC headers. TCP counters the unreliability of Ethernet, providing reliable frame delivery and sequence numbering to prevent out-of-order delivery. iSCSI also doesn’t require specialized networking gear, allowing normal Ethernet network equipment to be used. You can even implement QoS and basic firewalling matching on TCP port numbers.
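
As a small example of that last point, a basic ACL matching iSCSI traffic on the two ports mentioned above could look like this (the name is made up, and in practice you’d narrow it down to the storage subnets):

switch(config)#ip access-list ISCSI-TRAFFIC
switch(config-acl)#permit tcp any any eq 3260
switch(config-acl)#permit tcp any any eq 860

Such an ACL can then be referenced in a QoS class-map or applied to an interface as a basic filter, something you simply can’t do on a native FC fabric with regular IP tooling.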

Storage space
SCSI uses Logical Unit Numbers (or LUNs) to differentiate between different (virtual) partitions on a storage device. This means that you can have a large storage array with several TB of storage, divided into many different LUNs, one for each server. Servers then communicate using SCSI (over any of the above technologies), using LUNs to address their part of the storage. This way, servers do not interfere with each other’s storage. Most modern operating systems have support for iSCSI. VMware’s ESXi and vSphere even implement this at the hypervisor level, making the storage disks appear completely local to the virtual machines.

IP over Fibre Channel
Internet Protocol over Fibre Channel (IPFC) exists too, but it doesn’t seem to be used a lot. Good documentation and drivers are hard to find, so why go through all the trouble? Most companies already have a working Ethernet infrastructure, and Ethernet is usually less expensive. That’s another reason why iSCSI is popular: some claim that buying 10 Gigabit Ethernet switches and NICs is less expensive than buying 8GFC switches and adapters, and that the added overhead of TCP and iSCSI is smaller than the speed gain from 8 Gbps to 10 Gbps.

This was quite a write-up, but I hope it cleared up the basic differences and similarities between these two technologies. Anything to add? Let me know in the comments.