
Why is the XDP_TX packet dropped by the kernel? #5

@someonebw

Hi Jamesits, it's me again :)

Test environment:
debian11

iproute2/oldstable,now 5.10.0-4 amd64 [installed,automatic]

root@debian11:~# cat /boot/config-5.10.0-25-cloud-amd64 |grep BTF
CONFIG_DEBUG_INFO_BTF=y

root@debian11:~# ip -V
ip utility, iproute2-5.9.0, libbpf 0.3.0

GRE tunnel config:
local ip: 172.19.92.248
remote ip: 172.19.100.254

1. Output of bpftool prog tracelog (tested with libbpf 0.0.8 and 1.2.0; the behavior is the same).
The program handles the incoming packet, strips the head, and then returns XDP_TX.

      <idle>-0       [001] d.s. 69839.759732: bpf_trace_printk: New packet

      <idle>-0       [001] dNs. 69839.759767: bpf_trace_printk: Packet header dump:

      <idle>-0       [001] dNs. 69839.759770: bpf_trace_printk: #0: 45

      <idle>-0       [001] dNs. 69839.759771: bpf_trace_printk: #1: c0

      <idle>-0       [001] dNs. 69839.759772: bpf_trace_printk: #2: 0

      <idle>-0       [001] dNs. 69839.759773: bpf_trace_printk: #3: 30

      <idle>-0       [001] dNs. 69839.759774: bpf_trace_printk: #4: a

      <idle>-0       [001] dNs. 69839.759775: bpf_trace_printk: #5: 1f

      <idle>-0       [001] dNs. 69839.759775: bpf_trace_printk: #6: 0

      <idle>-0       [001] dNs. 69839.759776: bpf_trace_printk: #7: 0

      <idle>-0       [001] dNs. 69839.759777: bpf_trace_printk: #8: ff

      <idle>-0       [001] dNs. 69839.759778: bpf_trace_printk: #9: 2f

      <idle>-0       [001] dNs. 69839.759779: bpf_trace_printk: #10: 96

      <idle>-0       [001] dNs. 69839.759780: bpf_trace_printk: #11: a2

      <idle>-0       [001] dNs. 69839.759781: bpf_trace_printk: #12: ac

      <idle>-0       [001] dNs. 69839.759782: bpf_trace_printk: #13: 13

      <idle>-0       [001] dNs. 69839.759782: bpf_trace_printk: #14: 64

      <idle>-0       [001] dNs. 69839.759783: bpf_trace_printk: #15: fe

      <idle>-0       [001] dNs. 69839.759784: bpf_trace_printk: #16: ac

      <idle>-0       [001] dNs. 69839.759785: bpf_trace_printk: #17: 13

      <idle>-0       [001] dNs. 69839.759786: bpf_trace_printk: #18: 5c

      <idle>-0       [001] dNs. 69839.759786: bpf_trace_printk: #19: f8

      <idle>-0       [001] dNs. 69839.759787: bpf_trace_printk: #20: 0

      <idle>-0       [001] dNs. 69839.759788: bpf_trace_printk: #21: 0

      <idle>-0       [001] dNs. 69839.759789: bpf_trace_printk: #22: 8

      <idle>-0       [001] dNs. 69839.759789: bpf_trace_printk: #23: 0

      <idle>-0       [001] dNs. 69839.759790: bpf_trace_printk: #24: 45

      <idle>-0       [001] dNs. 69839.759791: bpf_trace_printk: #25: c0

      <idle>-0       [001] dNs. 69839.759792: bpf_trace_printk: #26: 0

      <idle>-0       [001] dNs. 69839.759793: bpf_trace_printk: #27: 18

      <idle>-0       [001] dNs. 69839.759793: bpf_trace_printk: #28: a

      <idle>-0       [001] dNs. 69839.759794: bpf_trace_printk: #29: 1e

      <idle>-0       [001] dNs. 69839.759795: bpf_trace_printk: #30: 0

      <idle>-0       [001] dNs. 69839.759796: bpf_trace_printk: #31: 0

      <idle>-0       [001] dNs. 69839.759797: bpf_trace_printk: Outer GRE flags=0x0 proto=8

      <idle>-0       [001] dNs. 69839.759799: bpf_trace_printk: IPv4 packet_size=0x14, proto=0x2f

      <idle>-0       [001] dNs. 69839.759800: bpf_trace_printk: Inner is GRE4, proto=0

      <idle>-0       [001] dNs. 69839.759801: bpf_trace_printk: GRE4 keepalive received!

However, the packet sent with XDP_TX is malformed, which makes the TX dropped counter on the gre2 interface keep increasing.

gre2: flags=209<UP,POINTOPOINT,RUNNING,NOARP> mtu 1476
inet 6.6.6.2 netmask 255.255.255.0 destination 6.6.6.2
inet6 fe80::5efe:ac13:5cf8 prefixlen 64 scopeid 0x20
unspec AC-13-5C-F8-00-00-00-55-00-00-00-00-00-00-00-00 txqueuelen 1000 (UNSPEC)
RX packets 13839 bytes 332136 (324.3 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 36 bytes 2240 (2.1 KiB)
TX errors 0 dropped 9457 overruns 0 carrier 0 collisions 0

3. Analysis of the kernel packet drop with nettrace:

//[69535.293726] [enqueue_to_backlog ] ether protocol: 44051

***************** ffff88fec0aeb900 ***************
[69535.293595] [__netif_receive_skb_core] GRE: 172.19.100.254 -> 172.19.92.248
[69535.293618] [ip_rcv_core ] GRE: 172.19.100.254 -> 172.19.92.248
[69535.293631] [ip_route_input_slow ] GRE: 172.19.100.254 -> 172.19.92.248
[69535.293661] [fib_validate_source ] GRE: 172.19.100.254 -> 172.19.92.248
[69535.293674] [ip_local_deliver ] GRE: 172.19.100.254 -> 172.19.92.248
[69535.293684] [ip_local_deliver_finish] GRE: 172.19.100.254 -> 172.19.92.248
[69535.293726] [enqueue_to_backlog ] ether protocol: 44051
[69535.293740] [__netif_receive_skb_core] ether protocol: 44051
[69535.293754] [netif_receive_generic_xdp] ether protocol: 44051
[69535.293904] [kfree_skb ] ether protocol: 44051

***************** ffff88fec0aeb900 ***************

Packet captured with xdpdump:

Is the packet sent by XDP_TX malformed?

Here the type field is ac 13 (44051 in decimal).

0000 45 c0 00 30 aa f4 00 00 ff 2f f5 cc ac 13 64 fe
0010 ac 13 5c f8 00 00 08 00 45 c0 00 18 aa f3 00 00
0020 ff 2f f5 e5 ac 13 5c f8 ac 13 64 fe 00 00 00 00

5. Question: why is the packet malformed after the head is stripped with
if (bpf_xdp_adjust_head(ctx, (int)(cutoff_pos - data_start))) return -1;
action = XDP_TX;
The kernel then treats it as an invalid packet and drops it.
