

I have to admit, this article will sound a bit like an advertisement. But Cisco has gotten plenty of attention on this blog already, so a different vendor can only bring some variation into the mix.

This is a short explanation of a series of products offered by F5 Networks. Why? If you’re a returning reader of this blog and work in the network industry, chances are you’ve either encountered one of these appliances already, or could use them (or another vendor’s equivalent, of course).

F5-LTM

The Local Traffic Manager’s main function is load balancing. This means it can divide incoming connections over multiple servers.
Why you would want this:
A typical web server will scale up to a few hundred or thousand connections, depending on the hardware and the services it runs. But a service may need to handle more connections than one server can. Load balancing allows for scalability.
Some extra goodies that come with it:

  • Load balancing methods: of course you can choose how to divide the connections: simple round-robin, weighted in favor of a stronger server that can handle more, always to the server with the fewest connections,…
  • SSL Offloading: the LTM can provide the encryption for HTTPS websites and forward the connections in plain HTTP to the web servers, so they don’t have to consume CPU time for encryption.
  • OneConnect: instead of simply forwarding each connection to the servers in the load balancing pool, the LTM can set up a TCP connection to each server and reuse it for every incoming connection, e.g. each external connection’s HTTP GET is sent over the same established server-side connection. Just like SSL offloading, this consumes fewer resources on the servers. (Not every website handles this well.)
  • Port translation: not really NAT but you can configure the LTM for listening on port 80 HTTP or 443 HTTPS while the servers have their webpage running on different ports.
  • Health checks: if one of the servers in the pool fails, the LTM can detect this and stop sending connections to it. The service or website will stay up, it will just have less capacity. You can even upgrade servers one by one without downtime for the website (but make sure to plan this properly).
  • IPv6 to IPv4 translation: your web servers and internal network do not have to be IPv6 capable. Only the network up to the LTM has to be.

F5-ASM

The Application Security Manager can be placed in front of servers (one server per external IP address) and functions as a web application firewall (WAF).
Why you would want this:
If you have a server reachable from the internet, it is vulnerable to attack. Simple as that. Even internal services can be attacked.
Some extra goodies that come with it:

  • SSL Offloading: the ASM can provide the encryption for HTTPS websites just like the LTM. The benefit here is that you can check for attack vectors inside the encrypted session.
  • Automated request recognition: scanning tools can be recognized and denied access to the website or service.
  • Geolocation blocks: it’s possible to block out entire countries with automatic lists of IP ranges. This way you can offer the service only where you want it, or stop certain untrusted regions from connecting.

F5-GTM
The Global Traffic Manager is an intelligent DNS service that can handle many requests at the same time, with some custom features.
Why you would want this:
This one isn’t useful unless the service you’re offering is spread out over multiple data centers in geographically different regions. If it is, the GTM helps redirect traffic to the nearest data center and provides some DDoS resistance too.
Some extra goodies that come with it:

  • DNSSEC: support for signed DNS records, which prevents spoofing.
  • Location-based DNS: by matching the DNS request with a list of geographical IP allocations, the DNS reply will contain an A record (or AAAA record) that points to the nearest data center.
  • Caching: the GTM also caches DNS requests to respond faster.
  • DDoS protection: automated DNS floods are detected and blocked.

F5-APM

The Access Policy Manager is a device that provides SSLVPN services.
Why you would want this:
The APM connects remote devices to the corporate network over an encrypted tunnel, with a lot of security options.
Some extra goodies that come with it:

  • SSLVPN: no technical knowledge required from the remote user, and it works over SSL (TCP 443), so there’s little chance of firewalls blocking it.
  • SSO: Single Sign On support. Log on to the VPN and credentials for other services (e.g. Remote Desktop) are automatically supplied.
  • AAA: lots of different authentication options: local, RADIUS, third-party,…
  • Application publishing: instead of opening a tunnel, the APM can publish applications after the login page (e.g. Remote Desktop, Citrix) that open directly.

So what benefit do you get from knowing this? More than you’d think: many times when a network or service is designed, no attention is given to these components. Yet they can help scale out a service without resorting to complex solutions.

On to a bigger platform: the 6500 series, Cisco’s flagship campus LAN switch. Unlike the 3560/3750 platform discussed in another part of this series, the 6500 uses the older Weighted Round Robin (WRR) queueing mechanism, and uses CoS internally to put packets in queues.

Queues and thresholds
The capabilities per port differ per line card, and unlike the 3560/3750 series, the 6500 uses multiple ASICs per line card.

Switch#show interfaces GigabitEthernet 1/2/1 capabilities | include tx|rx|ASIC
Flowcontrol:                  rx-(off,on,desired),tx-(off,on,desired)
QOS scheduling:           rx-(1q8t), tx-(1p3q8t)
QOS queueing mode:    rx-(cos), tx-(cos)
Ports-in-ASIC (Sub-port ASIC) : 1-24 (1-12)

The above output is of a WS-X6748-GE-TX line card. The ‘1p3q8t’ for egress (tx) means one fixed priority queue and three normal queues, each with eight thresholds. The fixed priority queue cannot be changed to a normal queue: if a packet is in the queue, it will be transmitted next.


The ingress ‘1q8t’ means there is one ingress queue with eight thresholds. Unlike the 3560/3750, there is some oversubscription on the line card. It has two ASICs, one per 24 ports (the line card has 48 ports in total). Each ASIC has a 20 Gbps connection to the 6500 backplane. If all 24 gigabit ports on that half of the line card start receiving more than 20 Gbps of traffic, the ASIC and its backplane connection cannot handle it all. Granted, this is a rare event: 24 Gbps maximum throughput on a 20 Gbps capable ASIC is an oversubscription of just 1.2 to 1. But in case it happens, the different thresholds can help decide which traffic to drop. Note that to drop on ingress, the decision must be made on existing markings: the ASIC does classification and remarking, and the ingress queue sits before the ASIC. This usually isn’t a problem, since classification and marking are best done at the access layer, and 6500s are best used in the distribution and core layers.

Switch#show interfaces TengigabitEthernet 1/9/1 capabilities | include tx|rx|ASIC
Flowcontrol:                  rx-(off,on),tx-(off,on)
QOS scheduling:           rx-(1p7q2t), tx-(1p7q4t)
QOS queueing mode:    rx-(cos,dscp), tx-(cos,dscp)
Ports-in-ASIC (Sub-port ASIC) : 1-8 (1-4)

The WS-X6716-10GE, a 10 GE line card, has different queues, especially for ingress. This line card has a high oversubscription of 4:1 and one ASIC per eight ports, for a total of two ASICs on the 16-port line card. This means that, while eight ports can deliver up to 80 Gbps, the ASIC and backplane connection behind them are still just 20 Gbps. The ASIC is much more likely to get saturated, so ingress queueing becomes important here. The fixed priority queue allows some traffic to be handled by the ASIC with low latency, even when it is saturated.

I’m only going to explain these two line cards; the rest is similar. A full list with details per line card can be found here. The logic is similar to the 3560/3750 platform: configure the buffers and the thresholds, but this time for both ingress and egress. First the ingress queue on the gigabit interface. There is no ingress buffer sizing command, as this line card has only one ingress queue.

Switch(config)#interface Gi1/2/1
Switch(config-if)#rcv-queue threshold 1 65 70 75 80 85 90 95 100
Warning: rcv thresholds will not be applied in hardware.
To modify rcv thresholds in hardware, all of the interfaces below
must be put into ‘trust cos’ state:
Gi1/2/1 Gi1/2/2 Gi1/2/3 Gi1/2/4 Gi1/2/5 Gi1/2/6 Gi1/2/7 Gi1/2/8 Gi1/2/9 Gi1/2/10 Gi1/2/11 Gi1/2/12
Switch(config-if)#

That configures the eight thresholds for the first and only queue: threshold 1 at 65%, threshold 2 at 70%, and so on. Note the warning: for ingress queueing, existing CoS markings have to be trusted. Also, remember that the 3560/3750 ingress and buffer allocation commands work switch-wide, because that platform has one ASIC per switch. The X6748 line card on a 6500 has two ASICs, which for QoS are subdivided into sub-ASICs of twelve ports each. A command that changes the QoS allocations on a sub-ASIC automatically applies to all twelve interfaces on it.
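If you do want the thresholds applied in hardware, all ports on the sub-ASIC have to trust incoming CoS markings, just as the warning says. A minimal sketch, using the interface range from the warning above:

Switch(config)#interface range Gi1/2/1 - 12
Switch(config-if-range)#mls qos trust cos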

Next, egress queueing. First configuring buffer allocations, next the thresholds for the first queue, similar to the ingress queue.

Switch(config-if)#wrr-queue queue-limit 70 20 10
Switch(config-if)#wrr-queue threshold 1 65 70 75 80 85 90 95 100
Switch(config-if)#exit

Again something special here: the buffer allocation command ‘wrr-queue queue-limit’ needs only three values despite there being four queues. This is because queue 4, the priority queue, is a strict priority queue: any packet entering it will be serviced next. This means that if a lot of traffic ends up in the priority queue, it can starve the other queues, because they will not be serviced anymore. The only way to counter this is to tightly control what ends up in that queue.

On to the 10 GE line card. First ingress, this time with a buffer command because there are multiple queues on ingress.

Switch(config)#interface Te1/9/1
Switch(config-if)#rcv-queue queue-limit 40 20 20 0 0 0 20
Warning: rcv queue-limit will not be applied in hardware.
To modify rcv queue-limit in hardware, all of the interfaces below
must be put into ‘trust cos’ state:
Te1/9/1 Te1/9/2 Te1/9/3 Te1/9/4
Switch(config-if)#
Switch(config-if)#rcv-queue threshold 2 80 100
Switch(config-if)#rcv-queue threshold 3 90 100
HW-QOS: Rx high threshold 2 is fixed at 100 percent

Propagating threshold configuration to:  Te1/9/1 Te1/9/2 Te1/9/3 Te1/9/4
Warning: rcv thresholds will not be applied in hardware.
To modify rcv thresholds in hardware, all of the interfaces below
must be put into ‘trust cos’ state:
Te1/9/1 Te1/9/2 Te1/9/3 Te1/9/4
Switch(config-if)#end

The 10 GE line card has one ASIC per eight ports and one sub-ASIC for QoS per four ports. It has two thresholds per queue. The rest isn’t any different from the previous configurations.

Queue mappings
As mentioned already, internally the 6500 platform uses CoS to determine in which queue a packet (and thus flow) ends up, although some newer line cards can work with both CoS and DSCP. The mappings are again similar to previous configurations:

Switch(config)#interface Gi1/2/1
Switch(config-if)#rcv-queue cos-map 1 4 5
Propagating cos-map configuration to:  Gi1/2/1 Gi1/2/2 Gi1/2/3 Gi1/2/4 Gi1/2/5 Gi1/2/6 Gi1/2/7 Gi1/2/8 Gi1/2/9 Gi1/2/10 Gi1/2/11 Gi1/2/12
Warning: rcv cosmap will not be applied in hardware.
To modify rcv cosmap in hardware, all of the interfaces below
must be put into ‘trust cos’ state:
Gi1/2/1 Gi1/2/2 Gi1/2/3 Gi1/2/4 Gi1/2/5 Gi1/2/6 Gi1/2/7 Gi1/2/8 Gi1/2/9 Gi1/2/10 Gi1/2/11 Gi1/2/12
Switch(config-if)#wrr-queue cos-map 2 3 6
Propagating cos-map configuration to:  Gi1/2/1 Gi1/2/2 Gi1/2/3 Gi1/2/4 Gi1/2/5 Gi1/2/6 Gi1/2/7 Gi1/2/8 Gi1/2/9 Gi1/2/10 Gi1/2/11 Gi1/2/12
Switch(config-if)#priority-queue cos-map 1 1 5
Propagating cos-map configuration to:  Gi1/2/1 Gi1/2/2 Gi1/2/3 Gi1/2/4 Gi1/2/5 Gi1/2/6 Gi1/2/7 Gi1/2/8 Gi1/2/9 Gi1/2/10 Gi1/2/11 Gi1/2/12

The following things are configured here: CoS 5 is mapped to ingress queue 1, threshold 4. Next, CoS 6 is mapped to egress queue 2, threshold 3. And the third command is the mapping of CoS 5 to the first (and only) priority queue, first threshold.

DSCP to CoS mapping
Mapping CoS to a queue is fine, but what if you’re using DSCP for marking? And what if you have access ports on the 6500? CoS is part of the 802.1Q header, so untagged traffic on an access port doesn’t carry it. For these cases you can do a DSCP to CoS mapping. For example, to map DSCP EF to CoS 5 and DSCP AF41 to CoS 3:

Switch(config)#mls qos map dscp-cos 46 to 5
Switch(config)#mls qos map dscp-cos 34 to 3

Now packets arriving with (or remarked on ingress to) DSCP EF will be treated as CoS 5 in the queueing.
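Of course, the port has to trust the incoming DSCP value in the first place for this to make sense. On a port that supports it, this is a one-liner (shown as a sketch; the right trust model depends on your design):

Switch(config-if)#mls qos trust dscp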

Bandwidth sharing & random-detect
There are no shaping commands on the 6500 platform, only sharing of bandwidth. Again, only three values are needed for four queues, as the priority queue simply takes the bandwidth it needs. You can use plain weights with the ‘wrr-queue bandwidth’ command, but it’s easier to add the ‘percent’ keyword and let the values total 100 for a clearer configuration:

Switch(config-if)#wrr-queue bandwidth percent 80 10 10

80% for the first queue, 10% for each of the other two.

The 6500 platform also supports random early detection (RED) in hardware, a function borrowed from routers. It can be activated for a non-priority queue, for example the second queue:

Switch(config-if)#wrr-queue random-detect 2

The thresholds for RED can be modified using the ‘wrr-queue random-detect min-threshold’ and ‘wrr-queue random-detect max-threshold’ commands. Per queue, they configure the thresholds (eight for the gigabit line card) with a minimum value at which RED starts dropping the occasional packet, and a maximum value at which RED drops every packet entering the queue.
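The percentages below are an illustration, not a recommendation: this would let RED drop packets in the second queue gradually from 50% fill onwards for traffic in the lowest threshold, while traffic mapped to the highest thresholds is only tail-dropped at 100%:

Switch(config-if)#wrr-queue random-detect min-threshold 2 50 60 70 80 90 100 100 100
Switch(config-if)#wrr-queue random-detect max-threshold 2 100 100 100 100 100 100 100 100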

Show command
So far I haven’t listed a ‘show’ command. This is because everything you need to know about a port is gathered in one command: ‘show queueing interface’. It’s a command with a very long output, showing the queue buffers, thresholds and drops for both ingress and egress.
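For example, for the gigabit port configured earlier:

Switch#show queueing interface GigabitEthernet 1/2/1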

The DSCP to CoS mapping is switch-wide, so this is still a separate command:

Switch#show mls qos maps dscp-cos
Dscp-cos map:                                  (dscp= d1d2)
d1:d2 0   1   2   3   4   5   6   7   8   9
——————————————————
0 :    00 00 00 00 00 00 00 00 01 01
1 :    01 01 01 01 01 01 02 02 02 02
2 :    02 02 02 02 03 03 03 03 03 03
3 :    03 03 04 04 03 04 04 04 04 04
4 :    05 05 05 05 05 05 05 05 06 06
5 :    06 06 06 06 06 06 07 07 07 07
6 :    07 07 07 07

Again, d1 is the first digit, d2 the second: for DSCP 46, d1 is 4, d2 is 6.

While in part IV a router used software queues, this is not the case on a switch:

Switch(config-if)#service-policy output PM-Optimize
Warning: Assigning a policy map to the output side of an interface not supported

Why not? Because a switch forwards frames with ASICs, not with the CPU, which means queueing is done in hardware too. And because the hardware contains a fixed number of queues, configuration is not done with a policy-map, but with commands that manipulate these queues directly.

There are both ingress and egress queues, but this article will only explain egress queues, as ingress queueing has little relevance on a 3560/3750 platform. Also, I will only talk about DSCP values and ignore CoS, as this platform can use DSCP end-to-end. This article’s intent is to get a basic understanding of QoS on this platform. For a more detailed approach, this document in the Cisco Support community has proven very useful for me.

Queues and thresholds
The number of egress queues can be checked on a per-port basis:

Switch#show interfaces FastEthernet 0/1 capabilities | include tx|rx
Flowcontrol:              rx-(off,on,desired),tx-(none)
QoS scheduling:        rx-(not configurable on per port basis),
                       tx-(4q3t) (3t: Two configurable values and one fixed.)

Notice the ‘4q3t’ value: this means the port supports four queues, each with three thresholds. Although the value can be checked on a per-port basis, the 3560/3750 series uses one ASIC for all its ports, so the number of queues and thresholds is the same on all ports.


The four queues are hard-coded: no more, no less. A queue can be left unused, but no extra queues can be allocated. The thresholds are used for tail drops (the dropping of a frame when the queue is full) and allow you to differentiate between traffic flows inside a queue.

An example: the third queue has thresholds at 80%, 90% and 100% (the third threshold is always 100% and can’t be changed). You put packets with DSCP values AF31, AF32 and AF33 in the third queue, but on different thresholds: AF31 on 3, AF32 on 2, AF33 on 1. The consequence is that packets with these DSCP values are put into the queue until the queue is 80% full (the first threshold). At that point, packets with DSCP AF33 are dropped, while the other two are still placed in the queue. If the queue reaches 90%, packets with AF32 are dropped as well, so the remaining 10% of the queue can only be filled with AF31-marked packets.
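In command form, that example would look like the sketch below. AF31, AF32 and AF33 correspond to DSCP values 26, 28 and 30, and the mapping uses the ‘mls qos srr-queue output dscp-map’ command covered in the queue mappings section further down:

Switch(config)#mls qos srr-queue output dscp-map queue 3 threshold 3 26
Switch(config)#mls qos srr-queue output dscp-map queue 3 threshold 2 28
Switch(config)#mls qos srr-queue output dscp-map queue 3 threshold 1 30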

Each queue also has a buffer: the buffer size determines the number of packets a queue can hold. The allocation of these buffers can be checked:

Wolfberry#show mls qos queue-set 1
Queueset: 1
Queue     :       1       2       3       4
———————————————-
buffers   :        25      25      25      25
threshold1:    100     200    100    100
threshold2:    100     200    100    100
reserved  :      50      50      50      50
maximum   :  400     400    400    400


A little explanation: everything here is a percentage. The ‘buffers’ line indicates how the buffer pool is allocated: by default 25% for each queue. The exact amount isn’t in the data sheets and supposedly depends on the exact type of switch.

The ‘reserved’ line shows how much of those buffers is actually guaranteed to the queue. By default that’s 50% of the 25%, so 12.5% of the buffer pool is reserved for each queue. The remaining 50% of the total buffer pool can be used by any of the four queues that needs it. If all queues are filled and need it, the allocation ends up at 25-25-25-25 again.

The other three lines are relative to the reserved value. The default first threshold of 100% means traffic in threshold 1 of queue 1 is dropped as soon as the queue is filled to its reserved value: 50% of the 25% of the pool allocated to it. The second threshold of queue 2, default 200%, means that queue can fill up to its entire allocated value: 100% of its 25% of the buffer pool. The maximum is the implicit third threshold, and is the maximum amount of buffer space the queue can use.

These values can all be changed with the ‘mls qos queue-set output’ command. For example, let’s allocate more buffers to the second queue, as the intention is to use it for TCP traffic later on. Let’s change the other parameters too: the queue will receive 50% of the buffer pool, 60% of that allocation will be reserved, and the thresholds will be at 60% (100% of the reserved value), 90% (150% of the reserved value) and 120% (200% of the reserved value). Queue 1 receives 26% of the buffer pool, queues 3 and 4 each 12%. The thresholds for queue 1 also change, to 80% (160% of the reserved value), 90% (180% of the reserved value) and 100% (200% of the reserved value).

Switch(config)#mls qos queue-set output 1 buffers 26 50 12 12
Switch(config)#mls qos queue-set output 1 threshold 2 100 150 60 200
Switch(config)#mls qos queue-set output 1 threshold 1 160 180 50 200
Switch(config)#exit
Switch#show mls qos queue-set 1
Queueset: 1
Queue     :       1       2       3       4
———————————————-
buffers   :         26      50      12      12
threshold1:     160     100    100    100
threshold2:     180     150    100    100
reserved  :       50      60      50      50
maximum   :   200     200    400    400


Queue mappings
Now that the queues have been properly defined, how do you put packets in them? Well, assuming you’ve marked them as explained in part III, all packets have a DSCP marking. The switch automatically puts packets with a certain marking into a certain queue, according to the DSCP-to-output-queue table:

Switch#show mls qos maps dscp-output-q
Dscp-outputq-threshold map:
d1 :d2    0       1        2         3         4         5         6         7         8         9
————————————————————
0 :    02-01 02-01 02-01 02-01 02-01 02-01 02-01 02-01 02-01 02-01
1 :    02-01 02-01 02-01 02-01 02-01 02-01 03-01 03-01 03-01 03-01
2 :    03-01 03-01 03-01 03-01 03-01 03-01 03-01 03-01 03-01 03-01
3 :    03-01 03-01 04-01 04-01 04-01 04-01 04-01 04-01 04-01 04-01
4 :    01-01 01-01 01-01 01-01 01-01 01-01 01-01 01-01 04-01 04-01
5 :    04-01 04-01 04-01 04-01 04-01 04-01 04-01 04-01 04-01 04-01
6 :    04-01 04-01 04-01 04-01

Again an explanation: this table shows which DSCP value maps to which queue and threshold. For example, DSCP 0 (first row, first column) maps to queue 2, threshold 1 (02-01). DSCP 46 (fifth row, seventh column) maps to queue 1, threshold 1 (01-01). To map DSCP values to a certain queue and threshold, use the ‘mls qos srr-queue output dscp-map’ command:

Switch(config)#mls qos srr-queue output dscp-map queue 2 threshold 2 34

This maps packets with DSCP value 34 to queue 2, threshold 2.

Bandwidth sharing and shaping
So the queues are properly configured and packets are put into the right queues according to their DSCP values… Just one more thing: what do the queues actually do? This is where the Shaped Round Robin (SRR) mechanism comes into play. It’s one of the few egress QoS configurations done on a per-port basis on the 3560/3750 platform. There are two commands: ‘srr-queue bandwidth shape’ and ‘srr-queue bandwidth share’, each followed by four values for the four queues.

The ‘shape’ command polices: it gives bandwidth to a queue, but at the same time limits the queue to that bandwidth. Ironically, it uses an inverted scale: 25, for example, means 1 in 25 packets, or 4%; 5 means 1 in 5 packets, or 20% of bandwidth. If a zero is used, that queue is not shaped. The ‘share’ command does not limit bandwidth and uses a relative scale: if the values for the queues total 20 and queue 1 has value 5, it gets 25% of the bandwidth. If the total is 50 and queue 1 has value 5, it gets 10%.

Switch(config-if)#srr-queue bandwidth share 10 150 30 20
Switch(config-if)#srr-queue bandwidth shape 20 0 0 0

The above gives 5% of bandwidth to the first queue (one in 20 packets). The other three queues receive 75%, 15% and 10% respectively: 150+30+20 is 200 (the 10 is not counted, because the first queue is already shaped), and 150 of 200 is 75%, 30 of 200 is 15%, 20 of 200 is 10%. How shaped and shared queues are counted together is not clear; after all, 105% of the bandwidth is now allocated. But it would require all queues to be filled at the same time to reach that situation.

Low latency queuing
And finally, the 3560/3750 allows for one priority egress queue. If a packet is placed in this queue, it is sent out next, regardless of what is in the other queues, until the queue reaches its maximum allowed bandwidth. This makes it ideal for voice and other low-latency traffic. By design, the priority queue has to be the first queue, so the command doesn’t take a queue number:

Switch(config-if)#priority-queue out
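To verify the queueing on a port, including the SRR weights and whether the priority queue is enabled, there’s a per-interface show command:

Switch#show mls qos interface FastEthernet 0/1 queueing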

Sounds logical, right? You’re correct: absolutely not. Unfortunately, Cisco uses a different value system for each QoS command, and the only way to get a feel for it is to try it out. I hope this helps in understanding the workings of QoS in hardware. In the next article, we’ll review another platform.

Assuming you’ve marked packets on ingress as detailed in part III, it’s now time to continue to the actual prioritization. First a router, e.g. a 2800 platform: it forwards packets using the CPU and uses software queues for prioritization. This means packets are stored in RAM while they are queued, and the router configuration defines how many queues are used and which ones get priority.


Queueing in RAM means the number of queues can be customized. By default, there is only one queue, using the simple first-in, first-out (FIFO) method, but if different traffic classes need different treatment, new queues can be allocated, each with its own parameters. While a policy-map allows a large array of commands, for basic QoS on Ethernet, three commands will do: ‘bandwidth’, ‘police’ and ‘priority’.

Bandwidth
The bandwidth parameter defines the amount of bandwidth a queue is guaranteed. It is configured in kbps. It does not set a limit: if the interface is not congested, the queue will receive all the bandwidth it needs. But in case of congestion, the queue’s bandwidth will not drop below the configured value.

Router(config)#class-map CM-FTP
Router(config-cmap)#match dscp af12
Router(config-cmap)#exit
Router(config)#policy-map PM-Optimize
Router(config-pmap)#class CM-FTP
Router(config-pmap-c)#bandwidth 10000
Router(config-pmap-c)#exit
Router(config-pmap)#exit
Router(config)#interface Eth0/0
Router(config-if)#service-policy output PM-Optimize
I/f FastEthernet0/0 class CM-FTP requested bandwidth 10000 (kbps), available only 7500 (kbps)

The configuration and error message above show a weak point: it’s easy to misjudge the amount of available bandwidth. The ‘bandwidth percent’ command makes this easier. Also, while it’s a 10 Mbps interface, it shows only 7.5 Mbps of available bandwidth. The reason is that only 75% of the interface bandwidth is used for QoS calculations; the rest is reserved for control traffic (OSPF, CDP,…). The ‘max-reserved bandwidth’ command on the interface can change this, and a modern high-speed interface will have enough with a few percent for control traffic.

Router(config)#policy-map PM-Optimize
Router(config-pmap)#class CM-FTP
Router(config-pmap-c)#bandwidth percent 50
Router(config-pmap-c)#exit
Router(config-pmap)#exit
Router(config)#interface Eth0/0
Router(config-if)#max-reserved bandwidth 90

The above guarantees a bandwidth of 4.5 Mbps for the class CM-FTP: 90% of the 10 Mbps interface is 9 Mbps, and 50% of that is 4.5 Mbps.

Police
The bandwidth guarantee of the ‘police’ command is the same as with the ‘bandwidth’ command. The difference is that it is also a maximum: even if there is no congestion on the link, bandwidth for the queue is still limited. It is configured in bps (not kbps), in increments of 8000: configuring ‘police 16200’ will actually configure ‘police 16000’. This can be useful: if there is no congestion, available bandwidth is divided evenly over the queues, except the ones that use policing.

Router(config)#class-map CM-Fixed
Router(config-cmap)#match dscp af13
Router(config-cmap)#exit
Router(config)#policy-map PM-Optimize
Router(config-pmap)#class CM-Fixed
Router(config-pmap-c)#police 32000

Priority
The ‘priority’ command is nearly identical to the ‘bandwidth’ command: also measured in kbps, also a minimum bandwidth guarantee. The difference is that this queue is always serviced first, resulting in low-latency queueing. Even if packets are dropped due to congestion, the ones going through will have spent the least possible time in a queue.

Router(config)#class-map CM-Voice
Router(config-cmap)#match dscp ef
Router(config-cmap)#exit
Router(config)#policy-map PM-Optimize
Router(config-pmap)#class CM-Voice
Router(config-pmap-c)#priority 32000

TCP optimization
So far, mainly latency-sensitive traffic like UDP voice has been given priority. But that doesn’t mean optimizations for TCP aren’t possible: FTP, or any other TCP protocol that uses windowing, behaves in a typical pattern on a congested link: windowing up until the point of congestion, losing packets and rewindowing to a smaller value, after which the process starts again.


If multiple similar TCP connections share a link, they tend to converge: when congestion occurs, the queue fills up, packets are eventually dropped, and many TCP connections rewindow to a lower value at the same time. The consequence is that the link is suddenly only partially used. It would be better if the rewindowing of each flow happened at a different time, so there are no sudden drops in total bandwidth usage. This can be achieved with Random Early Detection (RED): by dropping some packets before the queues are full, some flows rewindow before the link is 100% full, avoiding further problems. RED starts working after the queue has filled beyond a certain percentage, and will only drop one in every so many packets. A complete explanation of RED would take another article, but a simple and effective starting point is the following configuration:

Router(config)#policy-map PM-Optimize
Router(config-pmap)#class CM-TCP
Router(config-pmap-c)#random-detect dscp-based

The ‘dscp-based’ parameter is optional, but makes the router follow the DSCP markings as explained in part II: AF11 has a lower drop probability than AF13, so packets with value AF13 will be dropped more often than those with AF11.


The result is a more even distribution of bandwidth, and better overall throughput.

One last command can also help TCP: ‘queue-limit’. While the queue length of a priority queue is best kept low, TCP traffic is usually tolerant of latency. It’s better to have it waiting in a queue than to have it dropped.

Router(config)#policy-map PM-Optimize
Router(config-pmap)#class CM-TCP
Router(config-pmap-c)#queue-limit 100

A larger queue in combination with RED allows for good throughput even during congestion. The default queue length is 64.

So that’s the basics for QoS on a router. Up next: different switch platforms, which all have their own different QoS mechanisms.

This article doesn’t cover anything needed in day-to-day networking, but I’ve found it useful for gaining insight into the workings of wireless. Also, although the article is named ‘Wireless signal modulation’, the mechanics are the same for every analog medium that has to transport digital signals, such as coax and DSL.

How does digital data travel through the air? To start with, it needs a carrier signal: an electromagnetic wave at a certain frequency that provides a base clock and carries the data through the air.

[Image: an unmodulated carrier wave]

This is a basic carrier wave that does not contain any data. Transporting digital data (bits) over it can be done in different ways:

Frequency Modulation
This form of modulation changes the frequency of the carrier wave: a ‘0’ bit at the normal frequency, a ‘1’ bit at a higher frequency.

[Image: frequency modulation of the carrier]

So the above would be read as ‘0100’. I’m exaggerating in the picture by displaying the ‘1’ at double the frequency of a ‘0’. In reality, it would be just slightly higher.

Amplitude Modulation
Changing the amplitude, or intensity of the wave, is another form of modulation. A higher energy wave means a ‘1’, so the same ‘0100’ from FM would look like this in AM:

[Image: amplitude modulation of the carrier]

Phase Modulation
This one might be a bit harder to see: each ‘up and down’ of the carrier wave is one phase. By changing the starting point of each phase, it’s possible to encode information. Again the same ‘0100’, where a ‘1’ means the phase starts halfway:

[Image: phase modulation of the carrier]

Quadrature Amplitude Modulation
QAM is what modern wireless uses. It’s a combination of the above: in most cases AM and PM, or AM and FM. The point of combining modulations is that you can encode twice as many bits per phase.

[Image: quadrature amplitude modulation]

The above is ‘00110000’ in QAM, using AM and PM combined. But so far it’s only one bit per modulation per phase. This can be extended: by using four different wave intensities (amplitudes), two bits can be transmitted at a time (00, 01, 10 and 11 are four combinations). Multiple frequency steps allow for more than one bit as well, and by starting the phase not only halfway, but also at 1/4 or even 1/8, more bits can be carried. For example, QAM-16 looks like this:

[Image: a QAM-16 stream]

Each phase here carries four bits (16 possible combinations) at a time, and is called a ‘symbol’, because there’s a unique waveform for each bit pattern. The first two bits in this example are the amplitude part: ‘00’ for a normal wave, ‘11’ for the second wave, which has four times the amplitude, and so on. The remaining two bits come from the phase: ‘00’ for a normal phase, ‘01’ for a phase starting at 1/4.
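A quick worked example of why more combinations matter: each QAM-16 symbol carries log2(16) = 4 bits, so at, say, one million symbols per second, the raw bit rate is 4 Mbps. QAM-64 carries log2(64) = 6 bits per symbol, good for 6 Mbps at that same symbol rate, at the cost of symbols that are harder to tell apart in noisy conditions.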

Modulator and antenna sensitivity
Now, unlike what most people think, a higher quality (and higher priced) access point doesn’t transmit more signal and thus get better coverage. In most cases it transmits just as much as a cheap access point, because of government regulations: in Europe, an access point is not allowed to send out more than 100 mW at 2.4 GHz. The reason higher quality access points work better lies in their ability to generate the carrier wave more precisely, and in their higher sensitivity for correctly picking up the signal from other devices.

In reality, the carrier wave isn’t perfect: there are always small deviations from the ideal wave. Within certain thresholds, the receiving station can still correct for this. A higher quality access point generates waves that deviate less from the ideal wave, making it easier for other devices (laptops,…) to pick up the signal correctly, even over greater distances. In turn, these access points have more sensitive antennas that can pick up weakened signals a lower quality access point could not.

Frequency
The base frequency of most standards is fixed: 802.11b and 802.11g use the 2.4 GHz band, 802.11a and the new 802.11ac standard use the 5.0 GHz band. The only exception is 802.11n, the current mainstream standard, which can use both bands. This has some consequences: while 2.4 GHz does a better job of penetrating most materials than the 5.0 GHz band, it works at the same frequency as a microwave oven. Turning on a microwave oven near a laptop does decrease throughput (note: a microwave leaks a small amount of radiation, see this link for study results). My own tests gave a 40% decrease in throughput on 2.4 GHz wireless at 3 meters from a microwave oven.

On the other hand, switching to 5.0 GHz often results in slightly less coverage (except outdoors, of course). I had about 10% less throughput in my own house after switching to the 5.0 GHz band. But 10% is not that much, and it also depends on what nearby access points already use. Still, it makes the point: if you’re using the 2.4 GHz band right now, there’s no real guarantee that your brand new 802.11ac router in a year or two will give the same coverage everywhere in the house.

OSI Layer 1, part II: fiber

Part I covered copper; fiber standards differ in several ways. For one, fiber always uses a dedicated strand for each direction, so it’s always full-duplex. The official fiber standards for Ethernet (using a small ‘x’ as a wildcard):

802.3j – 10BASE-F
One letter up from 10BASE-T, this is the standard for 10 Mbps over fiber. It was never widely adopted, most likely because fiber was (and is) more expensive than existing (telephone) copper wiring, so nobody invested in new cabling just to get the same speed.

802.3u – 100BASE-FX
Yes, the same standard number as copper: they were defined together. Note that 100BASE-SX products were also made by many vendors, but it was never made an official standard, despite being significantly cheaper than 100BASE-FX. 100BASE-FX uses lasers and can go up to 2 km on multi-mode fiber, while 100BASE-SX often used cheaper LEDs but only went up to 550 m.

802.3z – 1000BASE-X
The gigabit standard for fiber was defined before the copper one. The standard defines multiple cable types and wavelengths, but generally speaking it allows multi-mode fiber up to 550 m and single-mode fiber up to 5 km. Longer distances are possible using higher quality fibers.

802.3ae – 10GBASE-xx
This standard defines multiple modes of operation. In multi-mode, the most used standards are 10GBASE-SR (400 m) and 10GBASE-LRM (802.3aq, 220 m). Single-mode has 10GBASE-LR (10 km) and 10GBASE-ER (40 km).

802.3ba – 40GBASE-xR4, 100GBASE-xR4 & 100GBASE-xR10
One standard defining two different speeds. For 40 GE, the -xR4 means four physical lanes in each direction are used. These cables have eight or twelve smaller fiber strands inside (in the case of twelve, four are currently unused), each running at 10 Gbps. Data is spread across these fibers in a sort of ‘layer 1 port-channel’ fashion.
There’s not much information on 100 GE cable types yet. It seems either 10 fiber strands are used in each direction, at 10 Gbps, or 4 fibers at 25 Gbps each.
The distance is the same for both: 100-125 m over multi-mode fiber (depending on the quality: OM3 or OM4) and 10 km over single-mode fiber.

Cable types
There are three different types of cables: multi-mode step index, multi-mode graded index, and single mode fiber.

[Image: fiber cable types]

Source: Wikipedia

Multi-mode step index is widely used: typically usable up to a few hundred meters, and relatively cheap. Graded index is similar, except that due to the different (graded) densities of the glass inside the cable there’s no single reflection surface, but rather a ‘bending’ of the laser inside. This gives less attenuation (weakening) of the signal.
Single mode uses a very small fiber core, so the laser follows a nearly straight path towards the next device. This results in much less attenuation and allows the laser to cross distances of multiple kilometers.

Propagation speed
A widespread belief is that fiber is faster than copper, because light propagates at 300,000 km/s and electrical signals at about 200,000 km/s.
However, in a recent session about ultra low latency designs, Lucien Avramov proved this to be a misconception: a typical multi-mode fiber has a refractive index of 1.5, because the laser bounces (or bends) off the internal surface of the fiber, making the signal propagate at about… 300,000 / 1.5 = 200,000 km/s. Copper and fiber are the same in this regard, with signals travelling at 5 ns (nanoseconds) per meter: 1 m at 200,000,000 m/s takes 5 ns. The fastest cable? Twinax, at 4.3 ns per meter, thanks to the higher quality metal inside, which allows faster propagation. However, twinax is limited to 5 meters in passive mode and 10 meters in active mode. Taking into account that a typical copper SFP connector and active twinax connector introduce more latency than a fiber SFP, fiber is still the best option for ultra low latency environments where you need more than 5 meters of cable.

Connectors
That covers speeds and cable types, but what about the connectors? Fiber connectors often aren’t fixed to a networking device; instead, plug-in modules are used, in most cases hot-swappable. For 100 and 1000 Mbps on older switches, GBIC modules are used:

[Image: a GBIC module]

These are very wide and take up a lot of space. For this reason, Small Form-factor Pluggable (SFP) modules were made, on which most gigabit fibers (and copper cables too) terminate these days:

[Image: an SFP module]

For 10 Gbps, SFP+ modules are used, which look nearly identical to SFP modules; an SFP module also fits in an SFP+ slot. These SFP and SFP+ interfaces are the same size as typical RJ-45 interfaces, so switches with 24 SFP ports are not uncommon.
40 Gbps has no single dominant module format yet, but these are often used:

[Image: a 40 GE breakout cable]

This is a 40 GE cable with the modules attached to it. It’s a thick cable, as there are 8 or 12 smaller strands inside.
Finding an image of a 100 GE cable proves impossible for now, but for comparison, here’s the 100 GE module of a Nexus 7000:

[Image: 100 GE module of a Nexus 7000]

These are just two ports, yet they cover most of the front panel. Smaller form factors will most likely be introduced in the future.

So, theory, starting from the bottom up. In this part, I’ll cover the wired Ethernet over UTP standards. The official standards to date are:

802.3i – 10BASE-T
The first widespread standard. Defaults to half-duplex and uses one copper pair for transmitting, and one for receiving. This leaves two of the four copper pairs in a Cat 5 UTP cable unused.
Requires a Cat 3 cable or higher.

802.3u – 100BASE-TX
Second widespread standard: same default of half-duplex, same two copper pairs.
Requires a Cat 5 cable or higher, despite using only two pairs. Ironically, this standard introduced duplex autonegotiation, to which 10 Mbps support was added later on.

802.3ab – 1000BASE-T
Third standard, 1 Gbps. Uses all four copper pairs in the cable and assumes full duplex. Depending on the implementation, it may try to fall back to half duplex if it detects that one of the cable pairs is damaged. Some implementations instead fall back to 100 Mbps, which doesn’t need all pairs, or just don’t bring the link up at all.
Requires a Cat 5 cable or higher, with Cat 5e recommended (Cat 5e is the same as Cat 5, but the technical requirements are enforced more strictly).

802.3an – 10GBASE-T
This standard only does full duplex: half duplex is not an option, so CSMA/CD (Collision Detection) is no longer present. Unlike previous standards, where the required cable could go up to 100 meters, 10GBASE-T has two cable options: Cat 6, with a maximum of 55 meters in a low-interference environment (37 meters recommended in a high-interference environment), and Cat 6a, which goes up to the usual 100 meters.

These are the speed standards over UTP. But how do interfaces negotiate link speed and duplex?

Duplex and speed are determined using fast link pulses (FLPs). Despite the name ‘autonegotiation’, it’s not really a negotiation process. Each interface on a link sends out a series of FLPs: 17 pulses, 125 microseconds apart. Between these pulses (at 62.5 microseconds after each FLP), an additional pulse may be present: if present, it’s a ‘1’, if not, it’s a ‘0’. This way, 16 boolean values are sent over the cable, listing the supported interface modes (10 Mbps, 100 Mbps, half/full duplex). The last bit, when set to ‘1’, means another page will follow: again a series of FLPs. While 100 Mbps interfaces ignore this, gigabit-capable interfaces will check the following pages too, because these list gigabit and 10-gigabit support. Both sides of the link then compare capabilities, and the highest common capability is chosen.

If one of the two sides is set to a static configuration and doesn’t send out FLPs, the interface will try to sense the carrier signal and adapt (10 and 100 Mbps use different carrier signals). Duplex can’t be sensed and defaults to half duplex. For gigabit it’s slightly different: the standard requires autonegotiation. I haven’t found confirmation in any documentation, but it seems setting speed and duplex there only changes the advertised FLP values.
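On a Cisco IOS switch, for example, fixing both values comes down to this (a sketch; whether FLPs are still sent afterwards differs per platform, as noted above):

Switch(config)#interface FastEthernet 0/1
Switch(config-if)#speed 100
Switch(config-if)#duplex full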

I originally read some papers about fast link pulses, but I can’t find the source URLs anymore. Wikipedia, as so often, provided many details consistent with what I’ve read.

Up next in part II: some more details about fiber!