fix: 3 c6in.metal bench regressions (AF_XDP fall-back, DPDK ring size, EAL argv split)#5
Conversation
Bug C from anygpt-52 c6in.metal bench (PR 65 issuecomment-4339242358):
the EAL argv splitter previously copied argv[0] (scanner binary path)
into eal_argv[0]. rte_eal_init may rewrite eal_argv slots as it
consumes EAL flags, and the scanner's failure-path log printed
eal_argv after that rewrite — so
scanner --io-engine=dpdk … -- --file-prefix=foo --socket-mem 1024
surfaced in the EAL log as
EAL argv was: scanner --file-prefix=foo --socket-mem scanner
— the trailing 1024 token was clobbered by the scanner program path
threading through rte_eal_init's argv mutation.
Fix:
- Extract split_argv_on_dash_dash from src/main.c into its own TU
(src/eal-argv-split.c + include/eal-argv-split.h) so the splitter
can be unit-tested without dragging the rest of the scanner in.
- eal_argv[0] is now a synthesized stable literal "anyscan-dpdk" —
same shape the existing default-argv fallback in dpdk_eal_bringup
uses, so the EAL prog-name slot is identical between the two
paths and independent of argv lifetime.
- Token order between separators is preserved strictly:
eal_argv[1..N] are exactly the tokens between `--` and end-of-argv,
in order, no replacement.
Test (tests/test_eal_argv_split.c, run via make unit-tests):
- test_no_dash_dash: no `--` → eal_argc=0, eal_argv=NULL.
- test_socket_mem_round_trip: reproduces the bench regression and
asserts eal_argv = [anyscan-dpdk, --file-prefix=foo, --socket-mem,
1024, NULL] with argv[0] not appearing anywhere in eal_argv.
- test_dash_dash_with_no_eal_args: synthesized prog-name still
present even with no user EAL tokens.
- test_realistic_eal_argv_shape: -l 0-7 --socket-mem 1024
--file-prefix=run42 --no-shconf round-trips intact.
Verified RED: temporarily reverting eal_argv[0] = argv[0] makes 4
assertions fail (`expected 'anyscan-dpdk', got 'scanner'`); restoring
the fix returns all 4 to GREEN.
Refs: AnyVM-Tech/AnyGPT#65 issuecomment-4339242358
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bug B from anygpt-52 c6in.metal bench (PR 65 issuecomment-4339242358):
src/dpdk-eal.c hardcoded nb_tx_desc=ANYSCAN_DPDK_TX_RING_SIZE=1024
when calling rte_eth_tx_queue_setup. AWS ENA reports
tx_desc_lim.nb_max=512, so the call returned
Invalid value for nb_tx_desc(=1024), should be: <= 512, >= 128, ...
rte_eth_tx_queue_setup(port=0, q=0) failed: Invalid argument
before any TX traffic could leave the box. Same shape on the RX side.
Fix:
- New pure helper dpdk_clamp_ring_size in src/dpdk-ring-clamp.c that
clamps a requested ring size against the three constraints in
struct rte_eth_desc_lim: nb_max, nb_min, nb_align (any field
reported as 0 means the PMD did not advertise that constraint and
is skipped).
- dpdk_eal_bringup now calls the helper using
dev_info.{tx,rx}_desc_lim from the existing rte_eth_dev_info_get
call before each rte_eth_*queue_setup, and logs the clamp at
info level so the operator sees the active ring size.
- include/dpdk-defs.h documents the per-PMD cap (ENA: nb_max=512;
i40e/ixgbe/mlx5: 4096..8192) so the constant choice is self-
explanatory.
Test (tests/test_dpdk_clamp.c, run via make unit-tests):
- test_ena_{tx,rx}_clamps_1024_to_512: the bench regression repro.
- test_in_range_no_clamp: 512/256/128 against ENA limits → no change.
- test_below_min_bumps_to_min: 64 → 128.
- test_no_limits_passthrough: PMD reports all-zero limits → no clamp.
- test_alignment_rounds_down: align=64 with max=600 → 576.
- test_alignment_rounds_up_when_needed: align=64 with min=200 →
rounds 200 up to 256 (the only valid value).
Verified RED: pre-source the test fails to link with implicit-function-
declaration and missing src/dpdk-ring-clamp.c. Adding the source +
header declaration drives 7/7 to GREEN.
Build verification on a host with libdpdk-dev installed:
make USE_DPDK=1 test → 11/11 io_engine_dispatch + 8/8 dpdk_dispatch
+ 2/2 unit tests pass.
Refs: AnyVM-Tech/AnyGPT#65 issuecomment-4339242358
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bug A from anygpt-52 c6in.metal bench (PR 65 issuecomment-4339242358):
on AWS ENA the AF_XDP fall-back ladder
(drv+zerocopy → drv+copy → skb) segfaults on the second attempt.
The first xsk_socket__create returns -EOPNOTSUPP from bind() *after*
libxdp has already attached its xsks_map redirect program in DRV mode;
the next attempt's xsk_socket__create then segfaults inside libxdp's
program-management layer trying to manage a stale program slot. This
made --io-engine=af_xdp untestable on c6in.metal.
Repro:
scanner -i ens1 --io-engine=af_xdp -T 4 -R 4 198.18.0.0/24 1-1024
Fix: every failed attempt fully tears down what libxdp may have set up
before returning, so the next attempt sees a fully clean interface.
- New helper afxdp_full_teardown_after_failed_bind that releases the
four primitives required for clean state:
1. xsk_socket__delete (release partially-bound socket)
2. xsk_umem__delete (release UMEM rings)
3. bpf_xdp_detach (mode=0) (clear interface XDP program slot,
regardless of DRV vs SKB attach)
4. zero out s->{xsk, umem, xsk_fd, free_*, bound_mode, rings}
plus free of the per-thread heap (free_stack, umem_area).
- afxdp_try_bind now owns UMEM lifetime per attempt: allocates fresh
via afxdp_alloc_umem at the top, calls the teardown helper on
xsk_socket__create failure, returns -1.
- afxdp_tx_init_per_thread no longer pre-allocates UMEM outside the
bind ladder — the bind helper handles it. The redundant teardown
in the "ladder exhausted" branch is gone (already torn down by the
last failed attempt; only the outer struct itself remains to free).
Includes pulled in: <bpf/libbpf.h> (bpf_xdp_detach), <net/if.h>
(if_nametoindex). The libxdp/libbpf -dev packages already provide both;
no Makefile change needed since the AF_XDP build links libbpf via
pkg-config.
Test (tests/test_afxdp_teardown.sh, run via make unit-tests):
Structural assertions on src/send-afxdp.c — libxdp behaviour requires
a NIC + root to exercise live, so this verifies that the source
contains the teardown primitives in the right places. A future
regression that removes any of them fails this test without needing
to reproduce the segfault on hardware. 10 assertions across 4
categories: helper exists, helper calls all four primitives + state
zero, afxdp_try_bind dispatches into the helper, single
afxdp_alloc_umem call site (proving the UMEM allocation moved
inside try_bind).
Verified RED: running the structural test against the pre-fix
src/send-afxdp.c (HEAD~2) fails 9 of the 10 assertions; restoring the
fix returns all 10 to GREEN.
Build verification on a host with libxdp+libbpf installed:
make USE_AF_XDP=1 USE_DPDK=1 test → 11/11 io_engine_dispatch +
8/8 dpdk_dispatch + 3/3 unit tests pass. No new compile warnings.
Refs: AnyVM-Tech/AnyGPT#65 issuecomment-4339242358
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f04c313171
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| * Best-effort: the rc is intentionally ignored — if no | ||
| * program is attached, bpf_xdp_detach returns -ENOENT which | ||
| * is the desired no-op. */ | ||
| (void)bpf_xdp_detach(ifindex, 0, NULL); |
There was a problem hiding this comment.
Avoid detaching shared XDP program on bind retry failure
Calling bpf_xdp_detach(ifindex, 0, NULL) in the retry teardown removes whatever XDP program is attached to the interface, not just state from the failed attempt. In this codebase, run_scan starts sender threads as it iterates (src/engine.c), so a later sender’s failed zerocopy attempt can detach the program that earlier senders are already using, causing reply drops or broken AF_XDP receive routing mid-initialization; the same path also removes any pre-existing non-scanner XDP program on that interface.
Useful? React with 👍 / 👎.
Summary
Three engine-side regressions surfaced by the anygpt-52 c6in.metal AWS bench (see AnyVM-Tech/AnyGPT#65 (comment)) — both
--io-engine=af_xdpand--io-engine=dpdkwere untestable on c6in.metal until these landed. Fixes are split across three focused commits, each with unit / structural test coverage.Bug A — AF_XDP fall-back segfault (
fix(afxdp): full per-attempt teardown in bind-mode ladder)src/send-afxdp.c::afxdp_try_binddid not tear down libxdp state between bind-mode attempts. On AWS ENA the ZEROCOPY attempt returns-EOPNOTSUPPfrombind()after libxdp has already attached its xsks_map redirect program in DRV mode; the next attempt'sxsk_socket__createthen segfaults inside libxdp's program-management layer.Repro:
scanner -i ens1 --io-engine=af_xdp -T 4 -R 4 198.18.0.0/24 1-1024— segfault on every run.Fix: new
afxdp_full_teardown_after_failed_bindhelper that releasesxsk_socket__delete+xsk_umem__delete+bpf_xdp_detach(mode=0 to clear regardless of DRV/SKB attach) + zeros all per-thread state. UMEM allocation moved insideafxdp_try_bindso each attempt fully reconstructs UMEM + socket + program attachment.Coverage:
tests/test_afxdp_teardown.sh— 10 structural assertions onsrc/send-afxdp.c. Verified RED onHEAD~3:src/send-afxdp.c(9 of 10 assertions fail), GREEN on the fix.Bug B — DPDK
nb_tx_desc=1024exceeds ENAnb_max=512(fix(dpdk): clamp TX/RX descriptor ring size against PMD nb_max)src/dpdk-eal.chardcodedANYSCAN_DPDK_TX_RING_SIZE=1024when callingrte_eth_tx_queue_setup. AWS ENA reportstx_desc_lim.nb_max=512:Fix: new pure helper
dpdk_clamp_ring_size(requested, dev_max, dev_min, dev_align)insrc/dpdk-ring-clamp.cthat honors all three constraints fromstruct rte_eth_desc_lim.dpdk_eal_bringupcalls it ondev_info.{tx,rx}_desc_limfrom the existingrte_eth_dev_info_getand logs the clamp at info level. The cap is documented ininclude/dpdk-defs.h.Coverage:
tests/test_dpdk_clamp.c— 7 unit tests covering the bench repro (1024→512), in-range no-clamp, below-min bump-up, no-limits passthrough, alignment round-down, and alignment round-up-when-needed.Bug C — EAL argv splitter mangles
--socket-mem 1024(fix(dpdk): synthesize stable EAL prog-name; never inject argv[0])src/main.cpreviously copiedargv[0](scanner binary path) intoeal_argv[0].rte_eal_initmay rewriteeal_argvslots as it consumes EAL flags, and the failure-path log printedeal_argvafter that rewrite — so the trailing--socket-mem 1024value token surfaced as--socket-mem scannerin the EAL log.Fix: extract
split_argv_on_dash_dashinto its own TU (src/eal-argv-split.c+include/eal-argv-split.h) so it is unit-testable.eal_argv[0]is now the synthesized stable literal\"anyscan-dpdk\"(matching the existing default-argv fallback indpdk_eal_bringup). Tokens between--and end-of-argv are preserved strictly, in order.Coverage:
tests/test_eal_argv_split.c— 4 unit tests including the--socket-mem 1024round-trip repro that assertsargv[0]does not appear anywhere ineal_argv.What's NOT in this PR
No live c6in.metal bench rerun — that lives in the AnyGPT/anyscan side of the repo (anygpt-52). Once these merge and the worker bundle picks up the new engine commit, the bench can re-run with all three io_engines.
Test plan
make(default build) → builds clean, no new warnings.make USE_AF_XDP=1→ builds clean.make USE_DPDK=1→ builds clean.make USE_AF_XDP=1 USE_DPDK=1→ builds clean.make USE_AF_XDP=1 USE_DPDK=1 test→ 11/11 io_engine_dispatch + 8/8 dpdk_dispatch + 3/3 unit tests pass.Refs: AnyVM-Tech/AnyGPT#65 (comment) AnyVM-Tech/AnyGPT#65 (comment)
🤖 Generated with Claude Code