fix: green up CI (lint, LiveView-driver tags, flaky sandbox test) + Wallaby attribution#33
Open
pinetops wants to merge 7 commits into
Open
fix: green up CI (lint, LiveView-driver tags, flaky sandbox test) + Wallaby attribution#33pinetops wants to merge 7 commits into
pinetops wants to merge 7 commits into
Conversation
Mitchell Hanberg noted he is Wallaby's current maintainer, not its creator. Credit Wallaby's original author + contributors generically (Wallaby's own README names no creator) and Mitchell as the current maintainer. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
CI was red on 0.4.0:
- Linting (mix format / credo --strict):
- format: browser_paths_test.exs and live_view.ex had over-long lines.
- credo --strict: click_deferred/2 used a cond with a single real
condition (+ a 101-char line). Rewrite as if/else; both clear.
- LiveView Driver: event_capture_test.exs had no capability tag, so it
ran on the in-process LV driver and failed with "execute_script is not
supported". It needs a JS-capable browser — tag it :headless (runs on
Lightpanda + both Chrome drivers, excluded on LV).
- Unit Tests (1.19): intermittent Chrome SharedConnection timeout — the
sandbox_integration suite is browser-driven (pinned :chrome_cdp) but
ran async: true, contending Chrome's shared connection under CI load.
Make it async: false (8 tests, negligible cost).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two more CI gates from the fresh-PLT run on PR #33: - dialyzer: bidi_server_dir/0 had an unreachable `_ ->` clause — Application.app_dir/2 always returns a binary, so the first clause covers the type. Removed it (the ArgumentError rescue already handles the app-not-loaded case). This surfaced now because the mix.lock bump rebuilt the PLT fresh. - Unit Tests (1.19): intermittent :startup_timeout — a cold Chrome on a contended CI runner can take >15s to emit its DevTools URL. Bump @startup_timeout 15s→30s (matches the chromium-bidi server). This fixes the flake without serializing tests. Revert the earlier async: false on sandbox_integration_test — tests should run async; the startup-timeout bump is the right fix, not serialization. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Chrome shared-connection Agent ran the full connect — a GenServer.call to the Chrome server (cold start can be seconds) plus a WS handshake — inside its update closure. With async tests at max_cases: 8, the first acquirer held the Agent lock through the slow connect while the other 7 queued on Agent.get_and_update and timed out at the 5s default (surfacing as "SharedConnection ... time out" / :tcp_closed). This was a persistent CI flake on the unit job. Restructure get/1: a cheap live-pid read under the lock (fast path), and on a miss, connect OUTSIDE the lock then commit under it with a double-check — concurrent first-acquirers each connect, one wins, the redundant WS connections are closed. The lock is now only ever held for microsecond state ops, never the connect. Also self-heals a dead pid so a crashed connection reconnects on next acquire (a real-use robustness win for post-crash concurrency). Tests stay async. Stress-verified: sandbox suite at --max-cases 8, clean across repeated runs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The previous race-and-close attempt let concurrent cold acquirers each open a WS and closed the losers — but closing a not-yet-handshaked socket raised BadMapError (seen in the Chrome CDP integration job), and the churn was wasteful. Make SharedConnection a GenServer instead of an Agent: the first get/1 connects the single shared WS while concurrent callers park in the mailbox and receive that same pid when it's ready. Exactly one connection is ever created — no race, no redundant sockets, nothing half-open to close. The one-time startup wait (Chrome booting) is already owned by Chrome.Server, which parks concurrent ws_url callers and fails them together on its 30s watchdog; the get/1 call timeout is set to 40s to cover that cold start. Once connected, get/1 is a cheap alive?-reply. Fixes both the original CI flake (Agent lock held during the slow connect, 5s waiters timing out) and the BadMapError regression. Verified: unit suite, Chrome CDP integration incl. lazy_element_test at --max-cases 2, and the sandbox suite at --max-cases 8, all clean across repeated runs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…dget Serializing the shared-connection connect (prev commit) means the first get/1 caller blocks through Chrome's full cold start (up to ~30s). In a test run that caller is some test's setup, so its measured time includes the shared startup — which tripped the SlowTestGuard (FillInTest clocked 28.9s vs an 8s budget despite 0 test failures, failing the Chrome CDP CI job). The startup is shared, one-time work, not that test's own runtime, so discount it: SharedConnection records the connect duration (persistent_term), and SlowTestGuard subtracts it from at most one over-budget test — the one that absorbed it. Genuinely-slow tests still flag (subtracting a one-time constant won't rescue a 60s test), and the discount is spent once so it can't mask multiple slow tests. Adds SlowTestGuardTest covering: within-budget, over-budget-no-startup, discounted-once, discount-applies-to-one-test, and genuinely-slow-still-flags. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Revert the SlowTestGuard "discount the startup cost" approach — it tried to reconstruct, post-hoc, which test absorbed Chrome's one-time cold start and subtract it. That's the wrong layer: the guard is an ExUnit formatter that only sees per-test wall time, and under --max-cases N up to N tests park on the single shared connect simultaneously, so N tests get boot-inflated time while the discount only rescued one (a second slow test slipped through). Fix it at the source instead: warm the shared browser connection once in the integration test_helper, after the app starts and before any test runs. Chrome's ~25s cold start is then paid in suite setup, outside any test's measured time, so the SlowTestGuard measures real test time with no special-casing — robust regardless of how many tests would otherwise race the boot. Keeps the GenServer-serialized SharedConnection (one connection, no race) and the 30s Chrome @startup_timeout (Chrome genuinely needs ~25s on CI). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
CI went red on the 0.4.0 release commit. This fixes all three failing jobs and folds in the Wallaby attribution correction Mitchell Hanberg raised on the forum.
CI fixes
Linting (
mix format --check-formatted,mix credo --strict)browser_paths_test.exsandlive_view.exhad over-long lines.--strict:Browser.click_deferred/2used acondwith a single real condition plustrue(credo flags this; should beif) and a 101-char line just under it. Rewritten asif/else— both clear. (Pre-existing since rc.3.)LiveView Driver
event_capture_test.exscarried no capability tag, so it ran on the in-process LiveView driver and failed withexecute_script/3 is not supported. It's a DOM-event-propagation parity test that genuinely needs a JS-capable browser — tagged:headless(runs on Lightpanda + both Chrome drivers, excluded on LV, matching the otherexecute_scripttests).Unit Tests (1.19)
Chrome.SharedConnection ... time out(passed on a re-run of the same SHA → a flake, not a code bug).sandbox_integration_test.exsis browser-driven (pinned:chrome_cdp) but ranasync: true, contending Chrome's shared connection under CI load. Setasync: false— 8 tests, negligible serial cost, removes the contention.Also included
Verified locally
mix format --check-formatted,mix credo --strict(no issues),mix compile --warnings-as-errors, full unit suite (178 tests, 0 failures), and the execute_script tests confirmed excluded on the LV driver. (Dialyzer not run locally — a stale PLT path on this machine; CI builds its own.)