
Troubleshoot packet loss of Linux network cards

An introduction to how ethtool works, and ideas for troubleshooting packet loss on network cards.

Preface

1. Understand the process of receiving packets

  • Transfer packets received by the NIC to host memory (NIC interacts with the driver)
  • Notify the kernel to process the packet (driver interacts with the Linux kernel)

2. ifconfig explanation

3. How the NIC works

  • How the NIC sends a packet
  • How the NIC receives a packet
  • NIC interrupt handlers
  • Buffer access

4. Packet loss troubleshooting ideas

  • Check the hardware first
  • overruns and buffer size
  • Red Hat’s official solution

 

A previous article documented packet loss caused by a heavy soft interrupt load when traffic on an LVS network card is too high, along with performance tuning practice for RPS and RFS. Under the low pressure that most people deal with, that situation rarely comes up.

The topic this time is a more common one: packet loss on a server network card and how to troubleshoot it. Point-to-point packet loss covers a wider range of causes; for that, you may wish to refer to the previous article How to use MTR to diagnose network problems. On Linux, the usual tool for analyzing network card packet loss is ethtool.

ethtool is used to view and modify driver parameters and hardware settings of network devices, especially wired Ethernet devices.

You can change the parameters of the Ethernet card as needed, including auto-negotiation, speed, duplex, and Wake-on-LAN.

With the card configured correctly, your computer can communicate efficiently over the network.

The tool also reports a great deal of information about the Ethernet devices attached to your Linux system.

1. Understand the process of receiving packets

Receiving a packet is a complex process that involves many low-level details, but it roughly takes the following steps:

  • The NIC receives the packet.
  • The packet is transferred from the NIC hardware buffer to server memory.
  • The kernel is notified to process it.
  • The packet is processed layer by layer through the TCP/IP protocol stack.
  • The application reads the data from the socket buffer via read().

Transfer packets received by the NIC to host memory (NIC interacts with the driver)

After the NIC receives a packet, it first has to get the data to the kernel, and the bridge between the two is the rx ring buffer.

The rx ring buffer is an area shared by the NIC and the driver. It does not store the packet data itself, but descriptors that point to where the data is actually stored. The process is as follows:

  • The driver allocates a buffer in memory to receive a packet, called an sk_buff.
  • The driver adds the address and size of that buffer (that is, the receive descriptor) to the rx ring buffer. The buffer address in the descriptor is the physical address used by DMA.
  • The driver notifies the NIC that a new descriptor is available.
  • The NIC fetches the descriptor from the rx ring buffer to learn the address and size of the buffer.
  • The NIC receives a new packet.
  • The NIC writes the new packet directly into the sk_buff via DMA.

When the driver cannot keep up with the rate at which the NIC receives packets, it does not have time to allocate buffers, so packets received by the NIC cannot be written to an sk_buff in time and start to pile up. Once the NIC’s internal buffer is full, part of the data is discarded, which causes packet loss.
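The effect of a slow consumer can be illustrated with a toy model. The sketch below is only an illustration, not how a real NIC or driver behaves: the function name simulate_ring and its parameters (buffer size, packets arriving per tick, packets drained per tick, number of ticks) are all hypothetical.

```sh
# simulate_ring SIZE ARRIVE DRAIN TICKS
# Toy model: each tick, ARRIVE packets arrive and the driver drains up to
# DRAIN packets. Packets that do not fit into the buffer are counted as
# dropped. Prints the total number of dropped packets.
simulate_ring() {
    local size="$1" arrive="$2" drain="$3" ticks="$4"
    local used=0 dropped=0 t=0 free
    while [ "$t" -lt "$ticks" ]; do
        free=$((size - used))
        if [ "$arrive" -gt "$free" ]; then
            # Buffer is full: the excess packets are lost
            dropped=$((dropped + arrive - free))
            used=$size
        else
            used=$((used + arrive))
        fi
        # The driver consumes what it can in this tick
        if [ "$drain" -gt "$used" ]; then
            used=0
        else
            used=$((used - drain))
        fi
        t=$((t + 1))
    done
    echo "$dropped"
}
```

With a 4-slot buffer, 3 arrivals and 2 drains per tick, `simulate_ring 4 3 2 5` reports 3 dropped packets, while the same burst against a 64-slot buffer (`simulate_ring 64 3 2 5`) reports 0. A larger buffer absorbs a short burst, but it cannot help if the consumer is permanently slower than the producer.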

This kind of packet loss is counted as rx_fifo_errors. It shows up as growth of the fifo field in /proc/net/dev and of the overruns counter in ifconfig.
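As a minimal sketch of reading that counter, the helper below extracts the receive-side fifo column for one interface from a file in /proc/net/dev format. The function name rx_fifo is hypothetical, and the column position assumes the usual layout in which the receive columns are bytes, packets, errs, drop, fifo, and so on.

```sh
# rx_fifo FILE IFACE
# Print the receive-side "fifo" counter for IFACE.
# FILE must be in /proc/net/dev format (pass /proc/net/dev on a live system).
rx_fifo() {
    local file="$1" iface="$2"
    awk -v ifc="$iface" '
        NR > 2 {                      # skip the two header lines
            sub(/:/, " ")             # "eth0:123" and "eth0: 123" both split cleanly
            if ($1 == ifc) print $6   # fields: name bytes packets errs drop fifo ...
        }' "$file"
}
```

On a live machine this would be called as `rx_fifo /proc/net/dev eth0`; running it twice a few seconds apart shows whether the counter is still growing.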

Notify the kernel to process the packet (driver interacts with the Linux kernel)

At this point, the packet has been transferred to the sk_buff.

As mentioned earlier, this is a buffer allocated by the driver in memory and written via DMA. DMA writes data directly to memory without involving the CPU, which means the kernel does not know that new data has arrived in memory.

So how does the kernel find out that new data has come in? The answer is an interrupt, which tells the kernel that new data has arrived and needs to be processed.

Interrupts come in two kinds, hard interrupts and soft interrupts, and it helps to briefly understand the difference first:

Hard interrupt: generated by the hardware itself and asynchronous, so it can arrive at any time. When the CPU receives a hard interrupt, it runs the corresponding interrupt handler. The handler only does the critical work that can be finished quickly; the remaining, time-consuming work is deferred until after the interrupt and completed by a soft interrupt. Hard interrupts are also known as the top half.

Soft interrupt: raised by the interrupt handler of the corresponding hard interrupt. It is triggered at points fixed in the code in advance, so it is not random. Soft interrupts are also known as the bottom half. (There are also application-triggered soft interrupts, which are unrelated to the NIC packet reception discussed in this article.)

When the NIC has copied a packet into the kernel buffer sk_buff via DMA, it raises a hardware interrupt.

When the CPU receives it, it first enters the top half: the interrupt handler for the NIC interrupt, which is part of the NIC driver. The handler then raises a soft interrupt, and the bottom half begins to consume the data in the sk_buff and hand it to the kernel protocol stack for processing.

Interrupts allow a fast and timely response to data arriving at the NIC. But when the volume of data is large, a large number of interrupt requests are generated, and the CPU spends most of its time handling interrupts, which is inefficient.

To solve this problem, current kernels and drivers process data with a mechanism called NAPI (New API). In simple terms it combines interrupts with polling: when the volume of data is large, one interrupt is followed by polling that receives a batch of packets before returning, which avoids taking an interrupt for every packet.
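One way to see how the soft interrupt side is coping is /proc/net/softnet_stat, which is not covered in the original text. The sketch below assumes the common layout of that file: one row per CPU, hexadecimal columns, where the first column is packets processed, the second is packets dropped because the backlog queue was full, and the third (time_squeeze) counts polling rounds that ran out of budget while work remained. The function name softnet_summary is hypothetical, and the row index is only assumed to correspond to the CPU number.

```sh
# softnet_summary FILE
# Decode the first three hexadecimal columns of each row of a file in
# /proc/net/softnet_stat format and print them as decimal numbers.
softnet_summary() {
    local row=0 processed dropped squeezed rest
    while read -r processed dropped squeezed rest; do
        # printf accepts 0x-prefixed arguments for %d, so no bc/awk is needed
        printf 'row%d processed=%d dropped=%d time_squeeze=%d\n' \
            "$row" "0x$processed" "0x$dropped" "0x$squeezed"
        row=$((row + 1))
    done < "$1"
}
```

On a live system the call would be `softnet_summary /proc/net/softnet_stat`. A time_squeeze value that keeps climbing suggests the polling side regularly has more packets waiting than it is allowed to handle in one round.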

2. ifconfig explanation

[root@localhost ~]# ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 192.168.1.135 netmask 255.255.255.0 broadcast 192.168.1.255
        inet6 fe80::20c:29ff:fe9b:52d3 prefixlen 64 scopeid 0x20<link>
        ether 00:0c:29:9b:52:d3 txqueuelen 1000 (Ethernet)
        RX packets 833 bytes 61846 (60.3 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 122 bytes 9028 (8.8 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

 

  • RX errors

The total number of receive errors, including too-long-frame errors, ring buffer overflow errors, CRC errors, frame synchronization errors, FIFO overruns, and missed packets, among others.

  • RX dropped

The packet made it into the ring buffer but was dropped while being copied to memory, for system reasons such as insufficient memory.

  • RX overruns

The number of FIFO overruns. They occur when more I/O arrives at the ring buffer (also known as the driver queue) than the kernel can handle; the ring buffer here refers to the buffer that is filled before an IRQ is raised.

A growing overruns counter means that packets are being dropped by the NIC at the physical layer, before they ever reach the ring buffer. A CPU that cannot service interrupts in time is one of the reasons the ring buffer fills up. On the problem machine in this case, interrupts were distributed unevenly (all of them landed on core0) because no interrupt affinity had been configured, and that caused the packet loss.

  • RX frame

The number of misaligned frames.
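The same kind of counters can also be read without parsing ifconfig output, from the per-interface statistics files under /sys/class/net. The sketch below is an illustration only: the function name rx_counters and its optional second argument (an alternative root directory, useful for testing) are hypothetical, and the four files chosen are the ones most closely related to the four ifconfig fields above. The mapping is approximate, since ifconfig may fold several kernel counters into one field.

```sh
# rx_counters IFACE [ROOT]
# Print selected receive counters for IFACE as name=value lines.
# ROOT defaults to /sys/class/net.
rx_counters() {
    local iface="$1" root="${2:-/sys/class/net}" name
    for name in rx_errors rx_dropped rx_fifo_errors rx_frame_errors; do
        printf '%s=%s\n' "$name" "$(cat "$root/$iface/statistics/$name")"
    done
}
```

On a live system, `rx_counters eth0` prints the current values; comparing two runs taken some seconds apart shows which counter is actually moving.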

3. How the NIC works

How the NIC sends a packet

The NIC driver prepends a 14-byte MAC header to the IP packet to form a frame (without the CRC). The frame contains the MAC addresses of the sender and the receiver. Because the driver builds the MAC header, the source address can be filled in arbitrarily, which also makes it possible to masquerade as another host.

The driver copies the frame (still without the CRC) to a buffer inside the NIC chip, and the NIC takes over from there.

The NIC chip wraps the incomplete frame (the one lacking the CRC) into a packet that can be sent: it adds the header synchronization information and the CRC, and then puts the packet on the wire. This completes the sending of one IP packet, and every NIC attached to that wire can see it.

How the NIC receives a packet

A packet on the wire is first picked up by the NIC, which verifies the CRC to ensure integrity and then strips the packet header to obtain the frame.

The NIC checks the destination MAC address of the frame and discards the frame if the address differs from its own (except in promiscuous mode).

The NIC copies the frame to the FIFO buffer inside the NIC and triggers a hardware interrupt.

(On NICs with a ring buffer, it seems the frame can first be stored in the ring buffer before a soft interrupt is triggered; the next article will explain the path of frames through Linux in detail. The ring buffer is shared by the NIC and the driver and is visible to the operating system. NIC drivers in the Linux kernel source use kcalloc to allocate space for it, so a ring buffer generally has an upper limit, and its size should be read as the number of frames it can hold, not a size in bytes.

In addition, on some systems the ethtool command cannot change the ring parameters to set the ring buffer size. The likely reason is that the driver does not support it.)

The NIC driver builds an sk_buff in the hard interrupt handler, copies the frame from the NIC FIFO into that skb in memory, and then hands it to the kernel for processing.

(NICs that support NAPI should place frames directly in the ring buffer without triggering a hard interrupt for each one. A soft interrupt then copies the data out of the ring buffer and passes it straight to the upper layer, and each NIC can process up to weight frames in one soft interrupt run.)

Throughout this process, the NIC chip filters frames by MAC address to reduce the load on the system (except in promiscuous mode).
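The filtering decision can be sketched as a small function. This is only a model of the rule described above, not driver or firmware code: the name accept_frame and its arguments are hypothetical, and accepting the broadcast address ff:ff:ff:ff:ff:ff is an addition that the text does not spell out. Multicast handling is left out entirely.

```sh
# accept_frame DST_MAC OWN_MAC [PROMISC]
# Returns 0 (accept) or 1 (discard). PROMISC is 1 when the NIC is in
# promiscuous mode and defaults to 0.
accept_frame() {
    local dst own promisc
    # Compare case-insensitively by normalizing hex digits to lower case
    dst=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    own=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    promisc="${3:-0}"
    [ "$promisc" = 1 ] && return 0                 # promiscuous mode keeps everything
    [ "$dst" = "$own" ] && return 0                # addressed to this NIC
    [ "$dst" = "ff:ff:ff:ff:ff:ff" ] && return 0   # broadcast (assumption, see above)
    return 1
}
```

For example, `accept_frame 00:11:22:33:44:55 00:0c:29:9b:52:d3 0` returns 1 (discard), while the same call with a final argument of 1 returns 0 because promiscuous mode disables the filter.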

NIC interrupt handlers

Every device that generates interrupts has a corresponding interrupt handler, which is part of the device driver. Each NIC has one: it acknowledges to the NIC that the interrupt has been received and copies the packets from the NIC buffer to memory.

When the NIC receives a packet from the network, it needs to notify the kernel that the packet has arrived, so it raises an interrupt immediately.

The kernel responds by running the interrupt handler registered for the NIC. The handler starts executing, acknowledges the hardware, copies the latest network packets to memory, and then reads more packets from the NIC.

These tasks are important, urgent, and hardware-related. The kernel usually needs to copy network packets to system memory quickly, because the buffer for received packets on the NIC is fixed in size and much smaller than system memory.

If that copy is delayed, the NIC’s FIFO buffer will overflow: incoming packets fill the NIC’s buffer, and subsequent packets can only be discarded. This is most likely the source of the overruns counter in ifconfig.

Once the network packets have been copied to system memory, the interrupt handler’s task is complete, and control returns to the program that was running before the interrupt.
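Since every interrupt is handled on some CPU, it can be useful to see how a NIC's interrupts are spread across cores. The sketch below sums the per-CPU counts of all rows in a /proc/interrupts style file whose text matches a pattern. The function name irq_per_cpu is hypothetical, and the assumption that NIC interrupt rows contain the interface name (for example eth0-rx-0) depends on the driver.

```sh
# irq_per_cpu FILE PATTERN
# Sum the per-CPU interrupt counts of every row matching PATTERN.
# FILE must be in /proc/interrupts format.
irq_per_cpu() {
    local file="$1" pattern="$2"
    awk -v pat="$pattern" '
        NR == 1 { ncpu = NF; next }      # header row: one column per CPU
        $0 ~ pat {
            # $1 is the IRQ label such as "24:", counts start at $2
            for (i = 1; i <= ncpu; i++) sum[i] += $(i + 1)
        }
        END {
            for (i = 1; i <= ncpu; i++) printf "CPU%d %d\n", i - 1, sum[i]
        }' "$file"
}
```

On a live system the call would be `irq_per_cpu /proc/interrupts eth0`. If nearly all of the count sits on one CPU, the interrupts are not being distributed, which is the kind of imbalance that interrupt affinity settings are meant to correct.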

Buffer access

The NIC’s kernel buffer is in host memory and is controlled by the kernel, while the NIC itself has a FIFO buffer or a ring buffer; the two should be kept distinct. The FIFO is relatively small, and whenever it holds data, the NIC tries to move that data into the kernel buffer.

The buffer in the NIC is in neither kernel space nor user space. It is a hardware buffer that sits between the NIC and the operating system.

Kernel buffers are in kernel space, in memory. Kernel code uses them to hold data read from or written to the hardware.

User buffers are in user space, in memory. User programs use them to hold data being read or written.

In addition, to speed up data exchange, kernel buffers can be mapped into user space so that both the kernel and user programs can access the same region.

For NICs with ring buffers, the ring buffer is shared by the driver and the NIC, so the kernel can access it directly. The kernel generally copies each frame into its own kernel space for processing and delivers it to the upper protocol layers; from then on the skb is passed along by pointer until the user obtains the data. So for ring buffer NICs, most of the copying happens when frames move from the ring buffer into kernel-controlled memory.

4. Packet loss troubleshooting ideas

The NIC works at the data link layer, where it performs some verification and encapsulates data into frames.

So we can first look for verification errors to determine whether there is a transmission problem, and then check at the software level whether packets are being lost because a buffer is too small.

Check the hardware first

When a machine frequently raises packet loss alarms, first check whether there is a problem at the lowest level:

1. Check whether the working mode is normal

 

[root@localhost ~]# ethtool eth0 | grep -E 'Speed|Duplex'
Speed: 1000Mb/s
Duplex: Full

 

2. Check whether the CRC check is normal

 

[root@localhost ~]# ethtool -S eth0 | grep crc
rx_crc_errors: 0

If Speed, Duplex, CRC, and the like are all fine, physical interference can basically be ruled out.
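Rather than grepping for one counter at a time, the whole `ethtool -S` listing can be scanned for error-like counters that are not zero. The sketch below is a rough filter, not a definitive check: the function name nonzero_errors is hypothetical, counter names differ from driver to driver, and matching on the substrings err, crc, drop, and fifo is only a heuristic.

```sh
# nonzero_errors
# Read "ethtool -S" style output on stdin and print, as name=value,
# every counter whose name looks error-related and whose value is > 0.
nonzero_errors() {
    awk -F': *' '
        /err|crc|drop|fifo/ && $2 + 0 > 0 {
            gsub(/^[ \t]+/, "", $1)    # strip the leading indentation
            print $1 "=" $2
        }'
}
```

Typical use on a live system would be `ethtool -S eth0 | nonzero_errors`. An empty result supports the conclusion that the physical side is healthy.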

overruns and buffer size

Watch the overruns counter for a while to see whether it keeps growing:

for i in $(seq 1 100); do ifconfig eth2 | grep RX | grep overruns; sleep 1; done

RX packets:346547657 errors:0 dropped:0 overruns:35345 frame:0
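To turn that output into a number that can be compared between samples, the overruns value has to be pulled out of the line. The sketch below handles both ifconfig layouts that appear in this article (overruns:35345 and overruns 0). The function name rx_overruns is hypothetical.

```sh
# rx_overruns
# Read ifconfig output on stdin and print the RX overruns value.
# Handles both the "overruns:N" and the "overruns N" layouts.
rx_overruns() {
    awk '/RX/ && /overruns/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^overruns:/) { sub(/^overruns:/, "", $i); print $i; exit }
            if ($i == "overruns")  { print $(i + 1); exit }
        }
    }'
}

# Possible use on a live system (not run here):
#   prev=$(ifconfig eth2 | rx_overruns); sleep 1
#   now=$(ifconfig eth2 | rx_overruns)
#   echo "overruns in the last second: $((now - prev))"
```

A difference that stays at zero means the counter is historical; a difference that keeps coming back positive means packets are still being lost.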

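If the overruns counter keeps growing, check the ring buffer size. ethtool provides two options for this:

-g, --show-ring: Queries the specified Ethernet device for rx/tx ring parameter information.
-G, --set-ring: Changes the rx/tx ring parameters of the specified Ethernet device.

[root@localhost ~]# ethtool -g eth0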
Ring parameters for eth0:
Pre-set maximums:
RX: 4096
RX Mini: 0
RX Jumbo: 0
TX: 4096
Current hardware settings:
RX: 256
RX Mini: 0
RX Jumbo: 0
TX: 256

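The current RX and TX ring sizes (256) are far below the pre-set maximums (4096), so enlarge them and query the settings again:

[root@localhost ~]# ethtool -G eth0 rx 2048
[root@localhost ~]# ethtool -G eth0 tx 2048
[root@localhost ~]# ethtool -g eth0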
Ring parameters for eth0:
Pre-set maximums:
RX: 4096
RX Mini: 0
RX Jumbo: 0
TX: 4096
Current hardware settings:
RX: 2048
RX Mini: 0
RX Jumbo: 0
TX: 2048
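Comparing the current ring sizes with the maximums can be automated. The sketch below parses `ethtool -g` style output, assuming the two section headings shown above ("Pre-set maximums:" and "Current hardware settings:"). The function name ring_headroom is hypothetical.

```sh
# ring_headroom
# Read "ethtool -g" style output on stdin and print current/maximum
# ring sizes for RX and TX.
ring_headroom() {
    awk '
        /^Pre-set maximums:/          { section = "max" }
        /^Current hardware settings:/ { section = "cur" }
        $1 == "RX:" { rx[section] = $2 }   # "RX Mini:" and "RX Jumbo:" do not match
        $1 == "TX:" { tx[section] = $2 }
        END {
            printf "RX %d/%d\n", rx["cur"], rx["max"]
            printf "TX %d/%d\n", tx["cur"], tx["max"]
        }'
}
```

Used as `ethtool -g eth0 | ring_headroom`, it prints lines such as RX 256/4096, which makes it easy to see at a glance how much room is left before the hardware limit.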

 
