Tuesday 28 June 2016

Diagnosing Cisco switch performance issues

Cisco switches use ASIC hardware to switch both packets and frames quickly. For example, ACL and QoS rules are cached in hardware in the form of tables, and the switch queries those tables when it needs to make a decision, such as evaluating whether a source IP matches a specific ACL entry.

Unlike RAM, CAM (Content Addressable Memory) uses the data itself to retrieve the location where that data is stored. RAM does the opposite: it retrieves data from a given memory address.

The CAM table is used primarily for layer 2 switching and returns either 'true' (0) or 'false' (1). For example, if a frame is being sent to another host on the switch, the destination MAC address is looked up in the CAM table and the result is either true (the lookup returns the switch port) or false (the frame is flooded out of the other switch ports).
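The lookup described above can be modelled in a few lines of Python. This is only a sketch of the hit/miss behaviour, not of how the ASIC implements it; the MAC addresses, port names and the forward function are made up for illustration.

```python
# Minimal model of a layer 2 CAM lookup: the key is the content (a MAC
# address) and the result is where it lives (a switch port).
# All names and values here are hypothetical.

def forward(cam_table, dst_mac, ingress_port, all_ports):
    """Return the list of ports a frame should be sent out of."""
    port = cam_table.get(dst_mac)
    if port is not None:
        # Hit ("true"): the destination is known, use that one port.
        return [port]
    # Miss ("false"): flood out of every port except the ingress port.
    return [p for p in all_ports if p != ingress_port]


cam_table = {"aa:bb:cc:00:00:01": "Gi0/1", "aa:bb:cc:00:00:02": "Gi0/2"}
all_ports = ["Gi0/1", "Gi0/2", "Gi0/3"]

known = forward(cam_table, "aa:bb:cc:00:00:02", "Gi0/1", all_ports)
unknown = forward(cam_table, "aa:bb:cc:00:00:99", "Gi0/1", all_ports)
print(known)
print(unknown)
```

The first call hits the table and returns a single port, while the second misses and returns every port other than the one the frame arrived on.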

TCAM (Ternary Content Addressable Memory) is like CAM, but each bit can hold a third 'don't care' state that matches any value. This makes it useful for layer 3 operations such as routing, ACLs and QoS, where a lookup has to match on a prefix or mask rather than on an exact value.
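One way to picture the third state is as a value plus a mask, where a mask bit of 0 means the bit can be anything. The sketch below is a sequential model under that assumption. Real TCAM compares all entries in parallel in hardware, and the entries, results and function names here are hypothetical.

```python
import ipaddress


def ip(text):
    """Convert a dotted IPv4 string to an integer."""
    return int(ipaddress.IPv4Address(text))


def ternary_match(key, value, mask):
    """True if key equals value on every bit where mask is 1.
    Bits where mask is 0 are treated as 'any value'."""
    return (key & mask) == (value & mask)


def lookup(tcam, key):
    """Return the result of the first matching entry, or None."""
    for value, mask, result in tcam:
        if ternary_match(key, value, mask):
            return result
    return None


# Hypothetical ACL-style entries in priority order: (value, mask, result)
tcam = [
    (ip("10.1.1.0"), ip("255.255.255.0"), "deny"),
    (ip("10.0.0.0"), ip("255.0.0.0"), "permit"),
    (ip("0.0.0.0"), ip("0.0.0.0"), "deny"),  # every bit is 'any value'
]

r1 = lookup(tcam, ip("10.1.1.7"))
r2 = lookup(tcam, ip("10.2.3.4"))
r3 = lookup(tcam, ip("192.168.1.1"))
print(r1, r2, r3)
```

The last entry has an all-zero mask, so it matches any address and acts as a catch-all.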

When the TCAM table is full, further entries cannot be programmed into hardware and the affected traffic is 'punted' to the CPU.

Check overall process CPU usage and sort by CPU utilization:
show proc cpu sorted

Example output:

CPU utilization for five seconds: 44%/30%; one minute: 42%; five minutes: 32%
 PID Runtime(ms)     Invoked      uSecs   5Sec   1Min   5Min TTY Process
  96          88      147299          0  1.11%  1.04%  0.92%   0 Ethernet Msec Ti
 117          40       36582          1  0.15%  0.19%  0.17%   0 IPAM Manager  
 240          28       36535          0  0.15%  0.14%  0.12%   0 MMON MENG

** Note: The first percentage figure (44%) shows the overall CPU utilization and the second figure (30%) shows the part of that total spent at interrupt level, which is mainly caused by handling traffic. **
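To pull those figures out of saved command output, something like the following Python sketch can be used. It assumes the summary line has exactly the format shown in the example output above; parse_cpu_line and the dictionary keys are illustrative names, not part of any Cisco tooling.

```python
import re

# Matches the summary line of "show proc cpu sorted" as shown above.
CPU_LINE = re.compile(
    r"five seconds: (\d+)%/(\d+)%; one minute: (\d+)%; five minutes: (\d+)%"
)


def parse_cpu_line(line):
    """Split the summary line into total, interrupt and process figures."""
    m = CPU_LINE.search(line)
    if m is None:
        raise ValueError("not a CPU utilization line")
    total, interrupt, one_min, five_min = (int(g) for g in m.groups())
    return {
        "total": total,            # overall CPU over the last 5 seconds
        "interrupt": interrupt,    # part of the total spent at interrupt level
        "process": total - interrupt,  # remainder spent in processes
        "one_min": one_min,
        "five_min": five_min,
    }


line = ("CPU utilization for five seconds: 44%/30%; "
        "one minute: 42%; five minutes: 32%")
stats = parse_cpu_line(line)
print(stats)
```

For the example line this leaves 14% for processes, so most of the load in that sample comes from interrupt-level work.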

The following processes handle packets punted to the CPU, so high usage by any of them can indicate that punted packets are causing CPU performance issues:

- HLFM address lea
- Check heaps
- Virtual exec
- RedEarth Tx Mana
- HPM counter proc
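As a rough sketch, saved output from show proc cpu sorted can be scanned for these process names in Python. The row layout is assumed to match the example output earlier in this post, the 5% threshold is an arbitrary choice, and the HLFM row in the sample is invented purely to show a match.

```python
import re

PUNT_PROCESSES = {
    "HLFM address lea", "Check heaps", "Virtual exec",
    "RedEarth Tx Mana", "HPM counter proc",
}

# PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+([\d.]+)%\s+([\d.]+)%\s+([\d.]+)%\s+(\d+)\s+(.+?)\s*$"
)


def punt_rows(output, threshold=5.0):
    """Return (process, 5Sec %) for punt-related rows at or above threshold."""
    hits = []
    for line in output.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header, summary line or blank line
        name, five_sec = m.group(9), float(m.group(5))
        if name in PUNT_PROCESSES and five_sec >= threshold:
            hits.append((name, five_sec))
    return hits


# The first row is hypothetical; the other two come from the example output.
sample = """\
 186        5000      200000         25 12.50% 10.00%  9.00%   0 HLFM address lea
  96          88      147299          0  1.11%  1.04%  0.92%   0 Ethernet Msec Ti
 117          40       36582          1  0.15%  0.19%  0.17%   0 IPAM Manager
"""

hits = punt_rows(sample)
print(hits)
```

Only the invented HLFM row is reported, since the other two processes are not in the list.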

Some common causes for traffic being punted to the CPU are:

- ACL logging
- Broadcast storms
- TCAM table space used up

To generate a smaller report of the overall CPU usage I like to use:

show proc cpu extended

Check CPU usage on IOS threads (does not work on the 2960 series):
show proc cpu detailed process iosd sorted

We can show packet counts for all CPU receive queues with:

show controllers cpu-interface

We can issue the following to get an overview of the TCAM utilization:

show platform tcam utilization

Checking the device memory:

show processes memory sorted

show processes memory detailed process iosd sorted

Sources:
http://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst3750/software/troubleshooting/cpu_util.html#pgfId-1028395
https://supportforums.cisco.com/document/60831/cam-content-addressable-memory-vs-tcam-ternary-content-addressable-memory
