Skip to content

fix: 3 c6in.metal bench regressions (AF_XDP fall-back, DPDK ring size, EAL argv split)#5

Merged
skullcrushercmd merged 3 commits into
mainfrom
fix/c6in-metal-bench-regressions
Apr 28, 2026
Merged

fix: 3 c6in.metal bench regressions (AF_XDP fall-back, DPDK ring size, EAL argv split)#5
skullcrushercmd merged 3 commits into
mainfrom
fix/c6in-metal-bench-regressions

Conversation

@skullcrushercmd
Copy link
Copy Markdown

Summary

Three engine-side regressions surfaced by the anygpt-52 c6in.metal AWS bench (see AnyVM-Tech/AnyGPT#65 (comment)) — both --io-engine=af_xdp and --io-engine=dpdk were untestable on c6in.metal until these landed. Fixes are split across three focused commits, each with unit / structural test coverage.

Bug A — AF_XDP fall-back segfault (fix(afxdp): full per-attempt teardown in bind-mode ladder)

src/send-afxdp.c::afxdp_try_bind did not tear down libxdp state between bind-mode attempts. On AWS ENA the ZEROCOPY attempt returns -EOPNOTSUPP from bind() after libxdp has already attached its xsks_map redirect program in DRV mode; the next attempt's xsk_socket__create then segfaults inside libxdp's program-management layer.

Repro: scanner -i ens1 --io-engine=af_xdp -T 4 -R 4 198.18.0.0/24 1-1024 — segfault on every run.

Fix: new afxdp_full_teardown_after_failed_bind helper that releases xsk_socket__delete + xsk_umem__delete + bpf_xdp_detach (mode=0 to clear regardless of DRV/SKB attach) + zeros all per-thread state. UMEM allocation moved inside afxdp_try_bind so each attempt fully reconstructs UMEM + socket + program attachment.

Coverage: tests/test_afxdp_teardown.sh — 10 structural assertions on src/send-afxdp.c. Verified RED on HEAD~3:src/send-afxdp.c (9 of 10 assertions fail), GREEN on the fix.

Bug B — DPDK nb_tx_desc=1024 exceeds ENA nb_max=512 (fix(dpdk): clamp TX/RX descriptor ring size against PMD nb_max)

src/dpdk-eal.c hardcoded ANYSCAN_DPDK_TX_RING_SIZE=1024 when calling rte_eth_tx_queue_setup. AWS ENA reports tx_desc_lim.nb_max=512:

Invalid value for nb_tx_desc(=1024), should be: <= 512, >= 128, ...
rte_eth_tx_queue_setup(port=0, q=0) failed: Invalid argument

Fix: new pure helper dpdk_clamp_ring_size(requested, dev_max, dev_min, dev_align) in src/dpdk-ring-clamp.c that honors all three constraints from struct rte_eth_desc_lim. dpdk_eal_bringup calls it on dev_info.{tx,rx}_desc_lim from the existing rte_eth_dev_info_get and logs the clamp at info level. The cap is documented in include/dpdk-defs.h.

Coverage: tests/test_dpdk_clamp.c — 7 unit tests covering the bench repro (1024→512), in-range no-clamp, below-min bump-up, no-limits passthrough, alignment round-down, and alignment round-up-when-needed.

Bug C — EAL argv splitter mangles --socket-mem 1024 (fix(dpdk): synthesize stable EAL prog-name; never inject argv[0])

src/main.c previously copied argv[0] (scanner binary path) into eal_argv[0]. rte_eal_init may rewrite eal_argv slots as it consumes EAL flags, and the failure-path log printed eal_argv after that rewrite — so the trailing --socket-mem 1024 value token surfaced as --socket-mem scanner in the EAL log.

Fix: extract split_argv_on_dash_dash into its own TU (src/eal-argv-split.c + include/eal-argv-split.h) so it is unit-testable. eal_argv[0] is now the synthesized stable literal \"anyscan-dpdk\" (matching the existing default-argv fallback in dpdk_eal_bringup). Tokens between -- and end-of-argv are preserved strictly, in order.

Coverage: tests/test_eal_argv_split.c — 4 unit tests including the --socket-mem 1024 round-trip repro that asserts argv[0] does not appear anywhere in eal_argv.

What's NOT in this PR

No live c6in.metal bench rerun — that lives in the AnyGPT/anyscan side of the repo (anygpt-52). Once these merge and the worker bundle picks up the new engine commit, the bench can re-run with all three io_engines.

Test plan

  • make (default build) → builds clean, no new warnings.
  • make USE_AF_XDP=1 → builds clean.
  • make USE_DPDK=1 → builds clean.
  • make USE_AF_XDP=1 USE_DPDK=1 → builds clean.
  • make USE_AF_XDP=1 USE_DPDK=1 test → 11/11 io_engine_dispatch + 8/8 dpdk_dispatch + 3/3 unit tests pass.
  • Each fix's test verified RED on the pre-fix code path before being driven to GREEN by the patch (per-commit notes in messages).
  • Live c6in.metal bench (separate worker; tracked in the AnyGPT-side anygpt-52 follow-up).

Refs: AnyVM-Tech/AnyGPT#65 (comment) AnyVM-Tech/AnyGPT#65 (comment)

🤖 Generated with Claude Code

skullcmd and others added 3 commits April 28, 2026 22:37
Bug C from anygpt-52 c6in.metal bench (PR 65 issuecomment-4339242358):
the EAL argv splitter previously copied argv[0] (scanner binary path)
into eal_argv[0]. rte_eal_init may rewrite eal_argv slots as it
consumes EAL flags, and the scanner's failure-path log printed
eal_argv after that rewrite — so

    scanner --io-engine=dpdk … -- --file-prefix=foo --socket-mem 1024

surfaced in the EAL log as

    EAL argv was: scanner --file-prefix=foo --socket-mem scanner

— the trailing 1024 token was clobbered by the scanner program path
threading through rte_eal_init's argv mutation.

Fix:
  - Extract split_argv_on_dash_dash from src/main.c into its own TU
    (src/eal-argv-split.c + include/eal-argv-split.h) so the splitter
    can be unit-tested without dragging the rest of the scanner in.
  - eal_argv[0] is now a synthesized stable literal "anyscan-dpdk" —
    same shape the existing default-argv fallback in dpdk_eal_bringup
    uses, so the EAL prog-name slot is identical between the two
    paths and independent of argv lifetime.
  - Token order between separators is preserved strictly:
    eal_argv[1..N] are exactly the tokens between `--` and end-of-argv,
    in order, no replacement.

Test (tests/test_eal_argv_split.c, run via make unit-tests):
  - test_no_dash_dash: no `--` → eal_argc=0, eal_argv=NULL.
  - test_socket_mem_round_trip: reproduces the bench regression and
    asserts eal_argv = [anyscan-dpdk, --file-prefix=foo, --socket-mem,
    1024, NULL] with argv[0] not appearing anywhere in eal_argv.
  - test_dash_dash_with_no_eal_args: synthesized prog-name still
    present even with no user EAL tokens.
  - test_realistic_eal_argv_shape: -l 0-7 --socket-mem 1024
    --file-prefix=run42 --no-shconf round-trips intact.

Verified RED: temporarily reverting eal_argv[0] = argv[0] makes 4
assertions fail (`expected 'anyscan-dpdk', got 'scanner'`); restoring
the fix returns all 4 to GREEN.

Refs: AnyVM-Tech/AnyGPT#65 issuecomment-4339242358

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bug B from anygpt-52 c6in.metal bench (PR 65 issuecomment-4339242358):
src/dpdk-eal.c hardcoded nb_tx_desc=ANYSCAN_DPDK_TX_RING_SIZE=1024
when calling rte_eth_tx_queue_setup. AWS ENA reports
tx_desc_lim.nb_max=512, so the call returned

    Invalid value for nb_tx_desc(=1024), should be: <= 512, >= 128, ...
    rte_eth_tx_queue_setup(port=0, q=0) failed: Invalid argument

before any TX traffic could leave the box. Same shape on the RX side.

Fix:
  - New pure helper dpdk_clamp_ring_size in src/dpdk-ring-clamp.c that
    clamps a requested ring size against the three constraints in
    struct rte_eth_desc_lim: nb_max, nb_min, nb_align (any field
    reported as 0 means the PMD did not advertise that constraint and
    is skipped).
  - dpdk_eal_bringup now calls the helper using
    dev_info.{tx,rx}_desc_lim from the existing rte_eth_dev_info_get
    call before each rte_eth_*queue_setup, and logs the clamp at
    info level so the operator sees the active ring size.
  - include/dpdk-defs.h documents the per-PMD cap (ENA: nb_max=512;
    i40e/ixgbe/mlx5: 4096..8192) so the constant choice is self-
    explanatory.

Test (tests/test_dpdk_clamp.c, run via make unit-tests):
  - test_ena_{tx,rx}_clamps_1024_to_512: the bench regression repro.
  - test_in_range_no_clamp: 512/256/128 against ENA limits → no change.
  - test_below_min_bumps_to_min: 64 → 128.
  - test_no_limits_passthrough: PMD reports all-zero limits → no clamp.
  - test_alignment_rounds_down: align=64 with max=600 → 576.
  - test_alignment_rounds_up_when_needed: align=64 with min=200 →
    rounds 200 up to 256 (the only valid value).

Verified RED: pre-source the test fails to link with implicit-function-
declaration and missing src/dpdk-ring-clamp.c. Adding the source +
header declaration drives 7/7 to GREEN.

Build verification on a host with libdpdk-dev installed:
  make USE_DPDK=1 test → 11/11 io_engine_dispatch + 8/8 dpdk_dispatch
  + 2/2 unit tests pass.

Refs: AnyVM-Tech/AnyGPT#65 issuecomment-4339242358

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bug A from anygpt-52 c6in.metal bench (PR 65 issuecomment-4339242358):
on AWS ENA the AF_XDP fall-back ladder
(drv+zerocopy → drv+copy → skb) segfaults on the second attempt.
The first xsk_socket__create returns -EOPNOTSUPP from bind() *after*
libxdp has already attached its xsks_map redirect program in DRV mode;
the next attempt's xsk_socket__create then segfaults inside libxdp's
program-management layer trying to manage a stale program slot. This
made --io-engine=af_xdp untestable on c6in.metal.

Repro:
  scanner -i ens1 --io-engine=af_xdp -T 4 -R 4 198.18.0.0/24 1-1024

Fix: every failed attempt fully tears down what libxdp may have set up
before returning, so the next attempt sees a fully clean interface.

  - New helper afxdp_full_teardown_after_failed_bind that releases the
    four primitives required for clean state:
      1. xsk_socket__delete       (release partially-bound socket)
      2. xsk_umem__delete         (release UMEM rings)
      3. bpf_xdp_detach (mode=0)  (clear interface XDP program slot,
                                   regardless of DRV vs SKB attach)
      4. zero out s->{xsk, umem, xsk_fd, free_*, bound_mode, rings}
    plus free of the per-thread heap (free_stack, umem_area).
  - afxdp_try_bind now owns UMEM lifetime per attempt: allocates fresh
    via afxdp_alloc_umem at the top, calls the teardown helper on
    xsk_socket__create failure, returns -1.
  - afxdp_tx_init_per_thread no longer pre-allocates UMEM outside the
    bind ladder — the bind helper handles it. The redundant teardown
    in the "ladder exhausted" branch is gone (already torn down by the
    last failed attempt; only the outer struct itself remains to free).

Includes pulled in: <bpf/libbpf.h> (bpf_xdp_detach), <net/if.h>
(if_nametoindex). The libxdp/libbpf -dev packages already provide both;
no Makefile change needed since the AF_XDP build links libbpf via
pkg-config.

Test (tests/test_afxdp_teardown.sh, run via make unit-tests):
  Structural assertions on src/send-afxdp.c — libxdp behaviour requires
  a NIC + root to exercise live, so this verifies that the source
  contains the teardown primitives in the right places. A future
  regression that removes any of them fails this test without needing
  to reproduce the segfault on hardware. 10 assertions across 4
  categories: helper exists, helper calls all four primitives + state
  zero, afxdp_try_bind dispatches into the helper, single
  afxdp_alloc_umem call site (proving the UMEM allocation moved
  inside try_bind).

Verified RED: running the structural test against the pre-fix
src/send-afxdp.c (HEAD~2) fails 9 of the 10 assertions; restoring the
fix returns all 10 to GREEN.

Build verification on a host with libxdp+libbpf installed:
  make USE_AF_XDP=1 USE_DPDK=1 test → 11/11 io_engine_dispatch +
  8/8 dpdk_dispatch + 3/3 unit tests pass. No new compile warnings.

Refs: AnyVM-Tech/AnyGPT#65 issuecomment-4339242358

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@skullcrushercmd skullcrushercmd merged commit a22c8f2 into main Apr 28, 2026
@skullcrushercmd skullcrushercmd deleted the fix/c6in-metal-bench-regressions branch April 28, 2026 22:56
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f04c313171

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/send-afxdp.c
* Best-effort: the rc is intentionally ignored — if no
* program is attached, bpf_xdp_detach returns -ENOENT which
* is the desired no-op. */
(void)bpf_xdp_detach(ifindex, 0, NULL);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid detaching shared XDP program on bind retry failure

Calling bpf_xdp_detach(ifindex, 0, NULL) in the retry teardown removes whatever XDP program is attached to the interface, not just state from the failed attempt. In this codebase, run_scan starts sender threads as it iterates (src/engine.c), so a later sender’s failed zerocopy attempt can detach the program that earlier senders are already using, causing reply drops or broken AF_XDP receive routing mid-initialization; the same path also removes any pre-existing non-scanner XDP program on that interface.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant