Skip to content

fix: green up CI (lint, LiveView-driver tags, flaky sandbox test) + Wallaby attribution#33

Open
pinetops wants to merge 7 commits into
mainfrom
fix/ci-lint-and-driver-tags
Open

fix: green up CI (lint, LiveView-driver tags, flaky sandbox test) + Wallaby attribution#33
pinetops wants to merge 7 commits into
mainfrom
fix/ci-lint-and-driver-tags

Conversation

@pinetops

@pinetops pinetops commented Jun 2, 2026

Copy link
Copy Markdown
Member

CI went red on the 0.4.0 release commit. This fixes all three failing jobs and folds in the Wallaby attribution correction Mitchell Hanberg raised on the forum.

CI fixes

Linting (mix format --check-formatted, mix credo --strict)

  • Formatting: browser_paths_test.exs and live_view.ex had over-long lines.
  • credo --strict: Browser.click_deferred/2 used a cond with a single real condition plus true (credo flags this; should be if) and a 101-char line just under it. Rewritten as if/else — both clear. (Pre-existing since rc.3.)

LiveView Driver

  • event_capture_test.exs carried no capability tag, so it ran on the in-process LiveView driver and failed with execute_script/3 is not supported. It's a DOM-event-propagation parity test that genuinely needs a JS-capable browser — tagged :headless (runs on Lightpanda + both Chrome drivers, excluded on LV, matching the other execute_script tests).

Unit Tests (1.19)

  • Intermittent Chrome.SharedConnection ... time out (passed on a re-run of the same SHA → a flake, not a code bug). sandbox_integration_test.exs is browser-driven (pinned :chrome_cdp) but ran async: true, contending Chrome's shared connection under CI load. Set async: false — 8 tests, negligible serial cost, removes the contention.

Also included

  • docs: Wallaby attribution — credit Mitchell Hanberg as Wallaby's current maintainer, not creator (per his forum note); attribute original authorship to Wallaby's author + contributors generically.

Verified locally

mix format --check-formatted, mix credo --strict (no issues), mix compile --warnings-as-errors, full unit suite (178 tests, 0 failures), and the execute_script tests confirmed excluded on the LV driver. (Dialyzer not run locally — a stale PLT path on this machine; CI builds its own.)

pinetops and others added 7 commits June 1, 2026 17:20
Mitchell Hanberg noted he is Wallaby's current maintainer, not its
creator. Credit Wallaby's original author + contributors generically
(Wallaby's own README names no creator) and Mitchell as the current
maintainer.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
CI was red on 0.4.0:

- Linting (mix format / credo --strict):
  - format: browser_paths_test.exs and live_view.ex had over-long lines.
  - credo --strict: click_deferred/2 used a cond with a single real
    condition (+ a 101-char line). Rewrite as if/else; both clear.
- LiveView Driver: event_capture_test.exs had no capability tag, so it
  ran on the in-process LV driver and failed with "execute_script is not
  supported". It needs a JS-capable browser — tag it :headless (runs on
  Lightpanda + both Chrome drivers, excluded on LV).
- Unit Tests (1.19): intermittent Chrome SharedConnection timeout — the
  sandbox_integration suite is browser-driven (pinned :chrome_cdp) but
  ran async: true, contending Chrome's shared connection under CI load.
  Make it async: false (8 tests, negligible cost).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two more CI gates from the fresh-PLT run on PR #33:

- dialyzer: bidi_server_dir/0 had an unreachable `_ ->` clause —
  Application.app_dir/2 always returns a binary, so the first clause
  covers the type. Removed it (the ArgumentError rescue already handles
  the app-not-loaded case). This surfaced now because the mix.lock bump
  rebuilt the PLT fresh.
- Unit Tests (1.19): intermittent :startup_timeout — a cold Chrome on a
  contended CI runner can take >15s to emit its DevTools URL. Bump
  @startup_timeout 15s→30s (matches the chromium-bidi server). This fixes
  the flake without serializing tests.

Revert the earlier async: false on sandbox_integration_test — tests
should run async; the startup-timeout bump is the right fix, not
serialization.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Chrome shared-connection Agent ran the full connect — a GenServer.call
to the Chrome server (cold start can be seconds) plus a WS handshake —
inside its update closure. With async tests at max_cases: 8, the first
acquirer held the Agent lock through the slow connect while the other 7
queued on Agent.get_and_update and timed out at the 5s default
(surfacing as "SharedConnection ... time out" / :tcp_closed). This was a
persistent CI flake on the unit job.

Restructure get/1: a cheap live-pid read under the lock (fast path), and
on a miss, connect OUTSIDE the lock then commit under it with a
double-check — concurrent first-acquirers each connect, one wins, the
redundant WS connections are closed. The lock is now only ever held for
microsecond state ops, never the connect. Also self-heals a dead pid so a
crashed connection reconnects on next acquire (a real-use robustness win
for post-crash concurrency).

Tests stay async. Stress-verified: sandbox suite at --max-cases 8, clean
across repeated runs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The previous race-and-close attempt let concurrent cold acquirers each
open a WS and closed the losers — but closing a not-yet-handshaked socket
raised BadMapError (seen in the Chrome CDP integration job), and the churn
was wasteful.

Make SharedConnection a GenServer instead of an Agent: the first get/1
connects the single shared WS while concurrent callers park in the mailbox
and receive that same pid when it's ready. Exactly one connection is ever
created — no race, no redundant sockets, nothing half-open to close. The
one-time startup wait (Chrome booting) is already owned by Chrome.Server,
which parks concurrent ws_url callers and fails them together on its 30s
watchdog; the get/1 call timeout is set to 40s to cover that cold start.
Once connected, get/1 is a cheap alive?-reply.

Fixes both the original CI flake (Agent lock held during the slow connect,
5s waiters timing out) and the BadMapError regression. Verified: unit
suite, Chrome CDP integration incl. lazy_element_test at --max-cases 2,
and the sandbox suite at --max-cases 8, all clean across repeated runs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…dget

Serializing the shared-connection connect (prev commit) means the first
get/1 caller blocks through Chrome's full cold start (up to ~30s). In a
test run that caller is some test's setup, so its measured time includes
the shared startup — which tripped the SlowTestGuard (FillInTest clocked
28.9s vs an 8s budget despite 0 test failures, failing the Chrome CDP CI
job).

The startup is shared, one-time work, not that test's own runtime, so
discount it: SharedConnection records the connect duration (persistent_term),
and SlowTestGuard subtracts it from at most one over-budget test — the one
that absorbed it. Genuinely-slow tests still flag (subtracting a one-time
constant won't rescue a 60s test), and the discount is spent once so it
can't mask multiple slow tests.

Adds SlowTestGuardTest covering: within-budget, over-budget-no-startup,
discounted-once, discount-applies-to-one-test, and genuinely-slow-still-flags.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Revert the SlowTestGuard "discount the startup cost" approach — it tried
to reconstruct, post-hoc, which test absorbed Chrome's one-time cold
start and subtract it. That's the wrong layer: the guard is an ExUnit
formatter that only sees per-test wall time, and under --max-cases N up to
N tests park on the single shared connect simultaneously, so N tests get
boot-inflated time while the discount only rescued one (a second slow test
slipped through).

Fix it at the source instead: warm the shared browser connection once in
the integration test_helper, after the app starts and before any test
runs. Chrome's ~25s cold start is then paid in suite setup, outside any
test's measured time, so the SlowTestGuard measures real test time with no
special-casing — robust regardless of how many tests would otherwise race
the boot.

Keeps the GenServer-serialized SharedConnection (one connection, no race)
and the 30s Chrome @startup_timeout (Chrome genuinely needs ~25s on CI).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant