feat(context-drop): drop a screen video clip, not just a screenshot#1859
Conversation
Adds a "Drop Video Clip" hotkey (⌃R) that records a few seconds of screen and drops the .mov as a task. A short video repro is far easier to act on than a single still, and it captures UI bugs a screenshot misses (e.g. the stuck-loading carousel we just chased). - screen-capture-server.py: new GET /capture-video?seconds=N&display=D. Runs `screencapture -v -V<n> -x` on this server because it already holds the Screen Recording TCC grant (same reason /capture lives here). Duration clamped 1..60s, display 1..9, int-coerced like /capture to keep request taint out of the subprocess argv. Verifies a non-empty .mov before 200. - main.swift: dropVideoClip() mirrors dropScreenshot() — debounced, hits /capture-video, writes a `type: video` task. Pre-notifies "Recording Ns…" since unlike a still it takes a few seconds. Wired as drop_video_clip (menu item + ⌃R default hotkey, configurable via hotkeys.json). The task pipeline already handles .mov, so dropped clips flow through unchanged. Tested: server py_compiles; main.swift + SutandoConfig.swift compile with the startup.sh build command; `screencapture -v -V1 -x` produces a valid .mov on macOS 26. Manual hotkey end-to-end pending (UI). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…hot branch
/capture-video also matches startswith("/capture"), so the screenshot
branch intercepted it and returned a PNG. Exclude it explicitly so the
video endpoint actually records. Caught self-testing before merge.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Locks the routing bug fixed in 6eea6a2: /capture-video must record a .mov and not be swallowed by the /capture screenshot branch. Mocks screencapture so it runs headless. Also asserts /capture still returns .png, duration clamping, and the source-level routing guard. Independently confirmed as the sole finding in Sutando-Pro's review. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Pushed two commits to this branch (the commits list shows authored-time; these reached GitHub via a push at 17:15 UTC):
CI re-ran green. Branch is now feature + fix + test. |
bassilkhilo-ag2
left a comment
There was a problem hiding this comment.
✅ Routing fix is correct — the explicit and not self.path.startswith('/capture-video') guard is the right fix for the prefix collision (would have silently returned a PNG otherwise). Duration and display index both sanitized before injecting into argv. Test mocks screencapture cleanly. Approve.
Use patch.object + addCleanup instead of bare attribute assignment so the fake screencapture (and silenced notify helpers) are restored after each test, not leaked into later tests sharing the process. Addresses Sutando-Pro's review finding on b43f505. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Pushed The setUp did a bare |
Addresses Sutando-Pro's review finding: /capture-video was an unauthenticated side-effectful GET, so a drive-by web page could silently trigger a 1-60s screen recording. The server now generates a token at startup, writes it to a 0600 file (~/.config/sutando/screen-capture-token), and requires it in the X-Sutando-Capture-Token header, checked before any side effect. The Ctrl-R hotkey caller reads the file and sends the header. A browser can neither read the 0600 file nor set a custom header on a no-cors/<img> request, so it cannot reach the endpoint. Pre-existing /capture hole left for a follow-up per the review. Test: adds a 403-without-token case; /capture-video tests send the token. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Pushed
|
bassilkhilo-ag2
left a comment
There was a problem hiding this comment.
Clean implementation that mirrors dropScreenshot precisely. Key correctness points: (1) token gate on /capture-video so a web page in the capture server process cannot trigger recording via ; (2) notify() call before the blocking request so the user knows capture started (a silent 5s wait would feel broken); (3) debounce guard reuses lastDropTime; (4) duration clamped server-side 1..60s; (5) generous timeout = seconds+30 to cover capture + network. The ⌃R binding does not conflict with any existing hotkey. ✓
bassilkhilo-ag2
left a comment
There was a problem hiding this comment.
Good feature. ⌃R hotkey, debounced same as dropScreenshot, notify-before-wait (user sees 'Recording 5s...' instead of silence). Token auth via X-Sutando-Capture-Token mirrors the screenshot pattern. Generous 35s timeout for the blocking capture. LGTM.
…W + 0600 owner re-check (review Fix #1)
Per Susan: ⌃R should start recording, then stop+drop on a second press — user-controlled length, not a fixed duration. /capture-video gains ?action=start (open-ended screencapture -v via Popen, returns at once) and ?action=stop (SIGINT finalizes the .mov, returns path), with a single-recording lock + a MAX_RECORDING_SECONDS watchdog so a forgotten recording can't run forever. No-action requests keep the legacy fixed-duration path (backward compat). Swift dropVideoClip() becomes a stateful toggle. Tests cover start->stop, stop-when-idle, double-start guard, and token gate on start.
…eo path The fixed-duration -V capture was the pre-toggle behavior and had no callers left once ⌃R became a start/stop toggle. Remove it so /capture-video is toggle-only: no/unknown action now returns 400. Tests updated (drop the fixed-duration cases, add no-action->400).
|
@cla-assistant check |
…keys off terminal keys (#1890) * fix(context-drop): capture audio in drop-video-clip + move hotkeys off terminal keys drop-video-clip (added in #1859) recorded silent video: the capture server ran `screencapture -v -x` with no audio flag. Add `-g` so the clip captures the default audio input. The source follows System Settings > Sound > Input: the mic for narration/room audio, or BlackHole (with output routed there) for system/app audio. `?audio=off` disables sound; `?device=<id>` pins a specific input via `-G<id>`. Also move the two global hotkeys that shadowed terminal keys: - drop_context: Ctrl-C -> Ctrl-Shift-C (was intercepting SIGINT/cancel) - drop_video_clip: Ctrl-R -> Ctrl-Shift-R (was intercepting reverse-i-search) Global hotkeys match the exact modifier mask, so plain Ctrl-C / Ctrl-R still reach the terminal; only the shifted variants trigger Sutando. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(context-drop): disclose audio in recording HUD + fix stale Ctrl-R label Addresses review of #1890: the drop-video-clip start notification read "Recording screen - press Ctrl-R again to stop", which (a) didn't disclose that mic audio is now captured and (b) still referenced Ctrl-R after the hotkey moved to Ctrl-Shift-R. Update to "Recording screen + mic - press Ctrl-Shift-R again to stop". Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem
Context-drop can drop a screenshot (⌃S), but a single still often can't reproduce a bug — the carousel stuck-loading issue we chased today is invisible in a screenshot. A short screen video is a far better repro to hand the agent.
What this adds
A "Drop Video Clip" action (⌃R) that records a few seconds of screen and drops the
.movas a task, mirroring the existing screenshot drop.src/screen-capture-server.py— newGET /capture-video?seconds=N&display=D. Runsscreencapture -v -V<n> -xon this server because it already holds the Screen Recording TCC grant (same reason/capturelives here). Duration clamped1..60s, display1..9, int-coerced exactly like/captureso request data never reaches the subprocess argv. Verifies a non-empty.movbefore returning200 {status, path, seconds}.src/Sutando/main.swift—dropVideoClip()mirrorsdropScreenshot(): debounced, hits/capture-video, writes atype: videotask. It pre-notifies "Recording Ns…" since (unlike a still) it takes a few seconds. Wired as thedrop_video_clipaction — menu item +⌃Rdefault hotkey, configurable via~/.config/sutando/hotkeys.jsonlike the others.The task pipeline already handles
.mov, so dropped clips flow through unchanged.Test
python3 -m py_compile src/screen-capture-server.py— clean.swiftc -O main.swift SutandoConfig.swift -framework Cocoa -framework Carbon -framework ApplicationServices -framework AVFoundation(the startup.sh build command) — compiles clean.screencapture -v -V1 -x out.movproduces a valid.movon macOS 26.Notes
seconds(the vision ticker's/capturewaits). Acceptable for a rare manual action; switching toThreadingHTTPServeris a clean follow-up if it matters.