Tuesday 20 June 2017

Packet Capture: Fine-tuning Linux for 10Gb NICs / busy networks

Below I have outlined some of the more important tweaks that can be applied to a Linux system in order to optimise packet-capture performance with 10Gb NICs on busy, high-throughput networks.

As a forenote: when capturing packets with 10Gb cards you should also ensure that you have sufficient CPU and available IOPS - I'd recommend an SSD for best performance.
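
If you want to sanity-check whether the disk can keep up during a capture, iostat (part of the sysstat package) will report per-device throughput and utilisation - for example:

sudo yum install sysstat
iostat -xm 5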

Hardware

While libpcap will work with pretty much any NIC, if you want to use PF_RING (which is strongly recommended due to its performance benefits) you will need an Intel 82599-based NIC and a Linux kernel above 2.6.31 (which should cover pretty much every mainstream distribution these days).

There are also other specialist NICs that are supported and can additionally perform hardware packet filtering - however, for the purposes of this tutorial we will be sticking with an Intel-based chip.
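
As a quick sanity check of both requirements, you can list the NIC chipset and the running kernel version with:

lspci | grep -i ethernet
uname -r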

Tuning

Firstly, we should run a network performance tool such as iperf to benchmark throughput:

sudo yum install iperf

and on the server side issue:

iperf -s

and on the client side (where -w 64k sets the TCP window size and -t 60 runs the test for 60 seconds):

iperf -c server.ip.address -w64k -t60

You'll also want to monitor the CPU during this period, e.g.:

mpstat 5
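
Note that mpstat averages across all cores by default - a single core saturated by NIC interrupts can hide behind a healthy-looking average, so a per-core view is often more revealing:

mpstat -P ALL 5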

This will also provide us with a baseline to contrast performance against once we have finished applying the tweaks.

RX Descriptor Sizes

The descriptors do not hold any packet data - rather, they contain information about where the packet data resides in memory. These values are often not set to their maximum - to verify your current descriptor levels you can run:

ethtool -g eth0

Example output:

Ring parameters for eth0:
Pre-set maximums:
RX: 4096
RX Mini: 0
RX Jumbo: 2048
TX: 4096
Current hardware settings:
RX: 256
RX Mini: 0
RX Jumbo: 128
TX: 512

We can then increase the descriptors as follows:

ethtool -G eth0 rx 4096 tx 4096
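
Bear in mind that ethtool settings do not persist across reboots - one simple (if blunt) approach is to reapply them at boot, e.g. from /etc/rc.local:

ethtool -G eth0 rx 4096 tx 4096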

Jumbo Frames

One of the obvious considerations is enabling jumbo frames on the interface - although this presumes that the application(s) and the rest of the network path support them! We can enable this on a per-interface level with:

vi /etc/sysconfig/network-scripts/ifcfg-eth0

and append / change:

MTU=9000

sudo service network restart
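
If you'd rather test before restarting the network service, the MTU can also be changed on the fly and verified with a non-fragmenting ping (8972 bytes of payload plus 28 bytes of IP/ICMP headers = 9000):

ip link set dev eth0 mtu 9000
ping -M do -s 8972 server.ip.address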


RX and TX Checksum Offload

Each time a packet is received or sent, the CPU calculates a checksum - enabling this feature makes the NIC perform the calculation instead, hence freeing up CPU.

This can be enabled on a per-interface level with:

ethtool --offload eth0 tx on rx on

* Note: The CPU saving from TX checksum offload is dependent on frame size - larger packets equate to a greater saving.
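
You can confirm that the offload settings have taken effect with:

ethtool --show-offload eth0 | grep checksumming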

Kernel Tweaking

Disabling TCP time-stamping is another way to reduce CPU load - however you (obviously) lose the round-trip time measurement for each segment:

sysctl -w net.ipv4.tcp_timestamps=0

And increasing the SYN backlog and network driver backlog with:

net.ipv4.tcp_max_syn_backlog = 4096
net.core.netdev_max_backlog = 2500

and tcp read, write limits:

net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216

and socket buffer space limits:

net.core.rmem_max = 16777216
net.core.wmem_max = 16777216

and the maximum listen backlog for accepted sockets with (the kernel default is 128):

net.core.somaxconn = 1024
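
Remember that values set with sysctl -w are lost on reboot - to make the above settings persistent, append them to /etc/sysctl.conf and reload with:

sysctl -p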

Sources

https://www.kernel.org/doc/ols/2009/ols2009-pages-169-184.pdf
