feat: migrate run-agent.py from subprocess to opencode serve HTTP+SSE API #17
Merged
… API
Replace opencode run --format json subprocess with direct HTTP+SSE
integration against a locally-managed opencode serve instance.
Architecture:
- tools/opencode/serve.py: ServerRunner (start/stop/health lifecycle)
- tools/events/sse_client.py: SSE consumer with reconnect + heartbeat
- tools/events/state_tracker.py: Accumulate deltas, track part versions
- tools/events/emitters.py: Bridge mapped events to render_event()
- tools/events/__init__.py: EventLoop coordinator
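To illustrate what "accumulate deltas, track part versions" could look like, here is a minimal sketch. It is not the code in tools/events/state_tracker.py: the class names, the `part_id` key, and the idea that a version is a simple per-part counter are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PartState:
    """Accumulated text and a version counter for one message part."""
    text: str = ""
    version: int = 0


@dataclass
class DeltaAccumulator:
    """Hypothetical stand-in for the StateTracker's delta handling."""
    parts: dict[str, PartState] = field(default_factory=dict)

    def apply_delta(self, part_id: str, delta: str) -> str:
        """Append a text fragment and return the text accumulated so far."""
        state = self.parts.setdefault(part_id, PartState())
        state.text += delta
        state.version += 1  # assumption: one version bump per applied delta
        return state.text
```

Keeping the state per part ID means interleaved deltas for different parts do not corrupt each other.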
Key behaviors:
- One opencode serve instance per phase, ephemeral port
- Session created via POST /session, configured via PATCH
- Prompt sent via POST /session/{id}/prompt_async
- Global SSE stream consumed via GET /event, filtered by session ID
- Auto-reject permissions with visible warning
- Server terminated on exit (session left in DB)
- Minimum opencode version: 1.14.50
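The "global SSE stream, filtered by session ID" behavior can be sketched roughly as follows. This is an illustration only, not the code in tools/events/sse_client.py. The function names are hypothetical, and the assumption that each event carries its session under `properties.sessionID` should be checked against the real payloads.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each SSE event from decoded text lines."""
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # SSE comment line, often used as a keep-alive
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one optional leading space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_lines:
            # A blank line terminates the event.
            yield json.loads("\n".join(data_lines))
            data_lines = []


def for_session(events: Iterable[dict], session_id: str) -> Iterator[dict]:
    """Keep only events naming the given session (field name is an assumption)."""
    for event in events:
        if event.get("properties", {}).get("sessionID") == session_id:
            yield event
```

Because GET /event is a global stream, filtering on the client side is what keeps events from other sessions out of the renderer.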
All existing rendering code preserved (~4000 lines).
Termination controls unchanged: graceful completion, auto-resume,
frontmatter validation, retry budget.
Tests:
- 29 new unit/E2E tests in tests/test_new_serve_stack.py
- 4 integration tests in tests/test_run_agent.py skipped (TODO: rewrite)
- make tests: 216 passed, 4 skipped, 0 failed
Adds a mock LLM provider and parity harness to validate structural equivalence between opencode run --format json and opencode serve.
New files:
- tools/mock_llm_server.py — stdlib-only OpenAI-compatible mock LLM with multi-turn script support.
- tools/mock_llm_parity.py — parity runner using ephemeral mock server
- tools/mock_llm_scripts/{basic,with_tool,with_permission}.json
- tests/test_mock_llm_parity.py — unit + E2E parity tests
Bug fix in tools/events/__init__.py:
- Removed premature early-return on session.status (idle) in EventLoop.run().
This caused the serve path to miss the final step_finish because
_sync_session_messages() hadn't polled the snapshot yet.
Config:
- opencode.json: adds provider.test block for mock model
- Makefile: adds test-parity target
Removed:
- tools/opencode-parity.py (old non-deterministic approach)
- tests/test_opencode_parity.py
5e35d56 to
7efc951
… sorting
- mock_llm_server.py: extract _build_chunks() for direct unit-testing; add multi-tool index support (0,1,2...) per assistant message.
- mock_llm_parity.py: add per-step event sorting to normalize tool-use ordering differences between opencode run and serve; whitelist file.edited / file.watcher.updated / todo.updated as serve-only.
- comprehensive.json: cover read, glob, grep, write, bash, todowrite in a single multi-turn script (edit and skill removed due to path inconsistencies).
- with_permission_multi.json: permission-reject + allowed flow.
- Remove with_apply_patch.json (apply_patch unavailable).
- All parity tests pass; full suite 232 passed, 4 skipped.
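For readers unfamiliar with multi-tool indexing in OpenAI-compatible streaming, the sketch below shows one way to emit one chunk per tool call with indices 0, 1, 2, and so on. It is not the `_build_chunks()` from this PR. The function name, the `call_{index}` IDs, and the input shape are hypothetical; only the chunk layout follows the OpenAI chat completion streaming format.

```python
import json


def build_tool_call_chunks(tool_calls: list[dict], model: str = "mock") -> list[dict]:
    """Build streaming chunks for several tool calls in one assistant message."""
    chunks = []
    for index, call in enumerate(tool_calls):
        chunks.append({
            "object": "chat.completion.chunk",
            "model": model,
            "choices": [{
                "index": 0,
                "delta": {"tool_calls": [{
                    "index": index,  # position of this call within the message
                    "id": f"call_{index}",  # hypothetical ID scheme
                    "type": "function",
                    "function": {
                        "name": call["name"],
                        "arguments": json.dumps(call["arguments"]),
                    },
                }]},
                "finish_reason": None,
            }],
        })
    # The final chunk carries no delta and signals that tool calls were requested.
    chunks.append({
        "object": "chat.completion.chunk",
        "model": model,
        "choices": [{"index": 0, "delta": {}, "finish_reason": "tool_calls"}],
    })
    return chunks
```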
pruiz (Owner, Author) commented on May 19, 2026:
This branch is not ready; here are some pending comments for review.
…re env var, rewrite skipped tests
- Rename mock_llm_server.py → mock-llm-server.py and mock_llm_parity.py → mock-llm-parity.py; update all imports (use importlib for hyphenated filenames), Makefile, and .project docs.
- Makefile test-parity: now runs pytest over all parameterized scripts.
- Fix path reference in migrate-to-opencode-serve.md (itemdb/notes → .project).
- Fix EventLoop ordering in mock-llm-parity.py: start SSE consumer thread BEFORE sending prompt_async to avoid losing early events.
- Restore _CODECOME_INSIDE_HARNESS=1 in run-agent.py so status-forwarder.ts and future local plugins activate inside the serve process.
- Rewrite 4 skipped tests in test_run_agent.py for serve architecture: test_auto_correction_resume_loops_back_via_popen, test_frontmatter_failure_without_session_id_exits_nonzero, test_iteration_limit_triggers_auto_resume, test_stream_session_id_and_step_finish_count.
Full suite: 236 passed, 1 warning, 0 failed.
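A file named mock-llm-server.py cannot be imported with a plain `import` statement because hyphens are not valid in Python identifiers. The sketch below shows the standard importlib approach the commit alludes to. The helper name and the module name passed in are hypothetical, not taken from this PR.

```python
import importlib.util
import sys
from pathlib import Path
from types import ModuleType


def load_hyphenated_module(path: Path, module_name: str) -> ModuleType:
    """Load a Python file whose filename is not a valid module identifier."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so code that looks the module up by name works.
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module
```

A test could then call, for example, `load_hyphenated_module(Path("tools/mock-llm-server.py"), "mock_llm_server")` and use the returned module object as usual.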
- _consume_events() now takes thinking_on and drops reasoning events from display when false (still logged to transcript).
- _run_single_attempt() passes thinking_on through.
- Remove dead build_child_command() (was subprocess-era only).
- Makefile: add OPENCODE_THINKING_FLAG; pass --thinking in raw mode (CODECOME_USE_WRAPPER=0) when CODECOME_THINKING=1.
- Update README.md and docs/workflow.md to describe new behavior.
- Remove 3 now-dead build_child_command tests.
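The "drop from display, still log to transcript" behavior can be pictured with the sketch below. It is a simplified stand-in for `_consume_events()`, not its actual body. The `log` and `display` callbacks and the `"reasoning"` event type string are assumptions.

```python
from typing import Callable, Iterable


def consume_events(
    events: Iterable[dict],
    thinking_on: bool,
    log: Callable[[dict], None],
    display: Callable[[dict], None],
) -> None:
    """Log every event, but hide reasoning events unless thinking is enabled."""
    for event in events:
        log(event)  # the transcript always receives the event
        if event.get("type") == "reasoning" and not thinking_on:
            continue  # suppressed from the rendered output only
        display(event)
```

Logging before filtering keeps the transcript complete regardless of the display setting.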
pruiz commented on May 19, 2026
Replace --port 0 + stdout parsing with _find_free_port() helper that binds a socket to port 0, reads the assigned port, then passes it explicitly to the mock server. This removes the race condition where the server might die before we read its bound port.
- tools/mock-llm-parity.py: add _find_free_port(), simplify start_mock_server()
- tests/test_mock_llm_parity.py: same approach in server_proc fixture
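A helper of the kind described above is commonly written as shown below. This is a sketch of the general technique, not necessarily the exact `_find_free_port()` in this PR. The function name without the underscore and the `host` default are assumptions.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused TCP port by binding to port 0, then release it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the kernel pick a free port
        return sock.getsockname()[1]
```

The socket is closed before the port is handed to the server, so another process could in principle claim the port in between. For a short-lived local test server that window is usually acceptable.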
Add 'slow: heavy e2e tests (invoke real opencode CLI)' to pytest.ini markers so @pytest.mark.slow on test_parity_script stops emitting PytestUnknownMarkWarning.
… mode
- render_unknown() now shows the actual unknown part type for message.part.updated events (e.g. 'unknown part type: image') instead of the generic wrapper type.
- Add CODECOME_DEBUG_UNKNOWN_EVENTS env var; when set, prints the full JSON payload of unknown events to help diagnose new event shapes during development.
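The two behaviors in this commit can be sketched together as follows. This is not the real `render_unknown()`: the function name, the `properties.part.type` lookup path, and the fallback label are assumptions. Only the message format and the environment variable name come from the commit message.

```python
import json
import os


def describe_unknown(event: dict) -> str:
    """Build a label for an unrecognized event, with optional full payload."""
    event_type = event.get("type", "unknown")
    if event_type == "message.part.updated":
        # Assumed payload layout: the part type lives under properties.part.type.
        part_type = event.get("properties", {}).get("part", {}).get("type", "?")
        label = f"unknown part type: {part_type}"
    else:
        label = f"unknown event: {event_type}"
    if os.environ.get("CODECOME_DEBUG_UNKNOWN_EVENTS"):
        # Debug mode: append the whole payload to help diagnose new shapes.
        label += "\n" + json.dumps(event, indent=2, sort_keys=True)
    return label
```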
The mock-llm-parity tests call 'opencode run' as a subprocess via run_reference(), which requires the opencode binary to be in PATH.
- Add on_reconnect callback to SseClient (fires after successful reconnection)
- EventLoop now triggers sync only on: (1) SSE reconnect, (2) session idle
- Remove periodic sync triggers (heartbeat/diff/status every 0.5s)
- Add _emitted_signatures set for EventLoop-level deduplication
- Defer idle event emission until after sync to ensure correct ordering
- Fix StateTracker bug where mark_seen wasn't called when skipping seen parts
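The deduplication and "idle after sync" ordering described above can be sketched as follows. This is an illustration under assumptions, not the EventLoop code. The class and function names are hypothetical, and using the canonical JSON of an event as its signature is one possible choice among several.

```python
import json
from typing import Callable, Iterable


class DedupEmitter:
    """Emit each distinct event at most once, keyed by a content signature."""

    def __init__(self, emit: Callable[[dict], None]) -> None:
        self._emit = emit
        self._emitted_signatures: set[str] = set()

    def emit_once(self, event: dict) -> bool:
        signature = json.dumps(event, sort_keys=True)  # assumed signature scheme
        if signature in self._emitted_signatures:
            return False
        self._emitted_signatures.add(signature)
        self._emit(event)
        return True


def finish_on_idle(
    emitter: DedupEmitter,
    sync: Callable[[], Iterable[dict]],
    idle_event: dict,
) -> None:
    """Flush events recovered by a final sync, then emit the idle event last."""
    for event in sync():
        emitter.emit_once(event)
    emitter.emit_once(idle_event)
```

Emitting the idle event only after the sync means a late step_finish recovered from the snapshot still appears before the session is reported as idle, and the signature set prevents the sync from re-emitting events already seen on the stream.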
… injection
- mock-llm-server.py: add --429-after and --500-after flags for error injection
- mock-llm-parity.py: set _CODECOME_INSIDE_HARNESS=1 in run_reference to activate status-forwarder plugin; normalize session.status and session.error events for comparison; deduplicate session.status events by (status_type, status_message)
- mock-llm-parity.py: add --429-after and --500-after CLI args
- Add test_parity_script_with_error for rate_limit_retry and internal_error scripts
- New mock scripts: rate_limit_retry.json, internal_error.json
- session.status removed from _SERVE_ONLY_TYPES (now properly compared)
- _step_sort_key updated to include session.status/error in step ordering
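The commit does not spell out the exact semantics of --429-after and --500-after. The sketch below assumes one plausible reading: after N successful requests, the next request fails once with the configured status, and later requests succeed again (which would suit a retry scenario). The class and its behavior are hypothetical and should be checked against mock-llm-server.py.

```python
from typing import Optional


class ErrorInjector:
    """Return one HTTP error status after a number of successful requests."""

    def __init__(self, status: int, after: Optional[int] = None) -> None:
        self.status = status  # e.g. 429 or 500
        self.after = after    # None disables injection
        self.served = 0       # count of successful responses so far
        self.fired = False    # assumption: the error is injected only once

    def next_status(self) -> int:
        """Status code the mock server should use for the next request."""
        if self.after is not None and self.served == self.after and not self.fired:
            self.fired = True
            return self.status
        self.served += 1
        return 200
```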
Updated the installation step for opencode to fetch the latest version using the GitHub API.
Signed-off-by: Pablo Ruiz García <pruiz@users.noreply.github.com>
Signed-off-by: Pablo Ruiz García <pruiz@users.noreply.github.com>
Summary
Replace the subprocess-based opencode run --format json architecture with a direct HTTP+SSE integration against opencode serve. This improves reliability (official API surface), adds real-time event streaming, and reduces dependency on CLI stdout format stability.
Changes
New packages
- tools/opencode/serve.py: ServerRunner class (start/stop server on ephemeral port, health checks, convenience CLI)
- tools/events/sse_client.py: SSE consumer with auto-reconnect, heartbeat monitoring, exponential backoff
- tools/events/state_tracker.py: Accumulate message.part.delta fragments, detect finalized parts
- tools/events/emitters.py: Thin bridge to existing render_event()
- tools/events/__init__.py: EventLoop coordinator
Refactored
- tools/run-agent.py:
  - Replaced the opencode run subprocess loop (~600 lines) with ServerRunner orchestration: start server → create session → send prompt → consume SSE
  - Minimum opencode version: 1.14.39 → 1.14.50
Tests
- tests/test_new_serve_stack.py: covers StateTracker, SseClient, EventLoop, ServerRunner, and end-to-end flows
- tests/test_run_agent.py: integration tests skipped with TODO: rewrite comments
Test results
Migration plan
See .project/migrate-to-opencode-serve.md for full architecture decisions and risk analysis.
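The summary mentions that the SSE consumer reconnects with exponential backoff. As a rough illustration of that technique, the sketch below computes a capped delay schedule. The function name and all parameter values are hypothetical, not the ones used in tools/events/sse_client.py.

```python
def backoff_delays(
    base: float = 0.5,
    factor: float = 2.0,
    cap: float = 30.0,
    attempts: int = 6,
) -> list[float]:
    """Delay in seconds before each reconnect attempt, doubling up to a cap."""
    return [min(cap, base * factor**n) for n in range(attempts)]
```

A reconnect loop would sleep for each delay in turn and reset to the first delay after a successful connection. Production clients often add random jitter so that several consumers do not retry in lockstep.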