docs: refresh perf numbers (LV-subset, apples-to-apples) by pinetops · Pull Request #19 · u2i/wallabidi

pinetops · 2026-05-03T20:49:33Z

Summary

Reruns the full perf matrix with every driver running the same 124-test LV-runnable subset, so per-test cost is directly comparable across rows.
Flips `navigation_test.exs` from `async: false` to `async: true` — those two read-only `visit page_1.html` tests didn't need to be serial. Drops LV mc16 from 25s to 18s.
Regenerates `priv/perf-matrix.svg` with the new numbers.
Updates the Drivers table to use per-test (peak concurrency) numbers and adds a clarifying paragraph about apples-to-apples vs. per-driver workload.

New numbers (16-thread M-series Mac, 124 tests)

Driver	mc1	mc2	mc4	mc8	mc16
BiDi	362s	256s	239s	—	—
CDP	161s	93s	64s	47s	48s
Lightpanda	128s	75s	50s	29s	29s
LiveView	112s	63s	32s	21s	18s

Per-test cost at each driver's recommended max-cases: LV ~145ms, LP ~234ms, CDP ~379ms, BiDi ~2.07s.
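The per-test figures can be re-derived from the table by dividing wall-clock time by the 124 tests. The sketch below is illustrative only and is not part of the perf_bench harness. The `quoted_at` mapping is an assumption about which column each quoted figure was taken from, inferred from the arithmetic (the LiveView figure matches the mc16 column, 18s / 124).

```python
# Wall-clock seconds from the table above, keyed by driver and --max-cases.
TESTS = 124

wall_clock_s = {
    "BiDi": {1: 362, 2: 256, 4: 239},
    "CDP": {1: 161, 2: 93, 4: 64, 8: 47, 16: 48},
    "Lightpanda": {1: 128, 2: 75, 4: 50, 8: 29, 16: 29},
    "LiveView": {1: 112, 2: 63, 4: 32, 8: 21, 16: 18},
}

# Assumption: the column each quoted per-test figure corresponds to.
quoted_at = {"LiveView": 16, "Lightpanda": 8, "CDP": 8, "BiDi": 2}


def per_test_ms(driver: str, max_cases: int) -> float:
    """Average wall-clock cost per test, in milliseconds."""
    return wall_clock_s[driver][max_cases] * 1000 / TESTS


for driver, mc in quoted_at.items():
    print(f"{driver} mc{mc}: ~{per_test_ms(driver, mc):.0f} ms/test")
```

Because every row uses the same 124-test denominator, the per-test figures rank the drivers exactly as the wall-clock times do, which is the point of the apples-to-apples rerun.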

Other changes worth noting

Recommended max-cases for CDP and Lightpanda bumped from 4/16 to 8/8 — cost stops dropping past mc8 for both.
BiDi recommended stays at 2 (mc4 only buys ~7%).
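The recommendations above follow from the marginal gain of each step up in max-cases. A minimal sketch of that arithmetic, using the wall-clock numbers from the table (the helper name `gain_pct` is hypothetical):

```python
def gain_pct(before_s: float, after_s: float) -> float:
    """Percentage of wall-clock time saved going from before_s to after_s."""
    return (before_s - after_s) / before_s * 100


# BiDi: mc2 -> mc4
bidi_gain = gain_pct(256, 239)
# CDP and Lightpanda: mc8 -> mc16
cdp_gain = gain_pct(47, 48)
lightpanda_gain = gain_pct(29, 29)

print(f"BiDi mc2->mc4: {bidi_gain:.1f}%")
print(f"CDP mc8->mc16: {cdp_gain:.1f}%")
print(f"Lightpanda mc8->mc16: {lightpanda_gain:.1f}%")
```

A zero or negative gain means the extra concurrency bought nothing on this machine, which is why mc8 is the suggested ceiling for CDP and Lightpanda.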

Supersedes

Closes #13 (docs: correct LiveView/Lightpanda per-test numbers).

Test plan

CI green
Render the SVG and confirm it looks right

🤖 Generated with Claude Code

The previous table compared per-driver suites of different sizes (LV 124, LP 153, CDP 289, BiDi 285), so per-test cost couldn't be directly compared across rows. Now every driver runs the same LV-runnable 124-test subset (--exclude browser --exclude headless --exclude cdp_only), so per-test cost is directly comparable across drivers.

Also flips integration_test/cases/browser/navigation_test.exs from async: false to async: true. Those two visit-page-1 tests are read-only and don't need to be serial. This is what makes the LV mc16 number drop from 25s to 18s.

New numbers (16-thread M-series Mac, 124 tests):

Driver	mc1	mc2	mc4	mc8	mc16
BiDi	362s	256s	239s	—	—
CDP	161s	93s	64s	47s	48s
Lightpanda	128s	75s	50s	29s	29s
LiveView	112s	63s	32s	21s	18s

Per-test cost at each driver's recommended max-cases: LV mc16: ~145ms, LP mc8: ~234ms, CDP mc8: ~379ms, BiDi mc2: ~2.07s

Updates the priv/perf-matrix.svg accordingly.

Supersedes #13.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

pinetops · 2026-06-01T22:44:04Z

Status check: the 0.4.0 docs on main already (a) corrected the per-test speed claims (LiveView ~30ms, not 0ms) and (b) rewrote the --max-cases guidance to recommend the default for LiveView/Lightpanda/CDP and a cap only for BiDi.

What's not on main and remains the unique value of this PR: rerunning the matrix on a consistent LV-runnable subset (apples-to-apples per-test cost across rows) and regenerating priv/perf-matrix.svg, plus flipping navigation_test.exs to async: true. That's a methodology improvement, not just a numbers refresh.

I can't re-run the perf_bench harness from here to validate the new SVG. Leaving this open for a maintainer decision: rebase on main and keep just the apples-to-apples rerun + SVG + the async navigation_test change? If the subset rerun isn't worth maintaining, it can be closed.

pinetops wants to merge 1 commit into main from docs/perf-numbers-v2
