refactor(opencode): split bash tool into prompt, id, and artifact orchestrator#1046
Conversation
Split the 891-line bash tool into a thinner orchestrator and three
focused modules, keeping the public tool id, schema name, and
permission key as "bash" for backward compatibility.
- tool/bash/prompt.ts owns Parameters, Limits, parameterSchema(),
render(), and per-shell chaining text for bash / pwsh / powershell
/ cmd. The "persistent shell session" wording in bash.txt is
corrected to reflect one-shot execution.
- tool/bash/id.ts holds the internal Kind taxonomy and ToolID = "bash"
alias. The file header makes clear this is not a public rename.
- tool/bash-artifact-orchestrator.ts owns expected_outputs resolution,
exact OfficeCLI target snapshots, auto-discovery before/after diff,
and recordWrite / recordUncaptured dispatch. Dependencies are
injected via a typed ArtifactDeps interface so tests can mock
individual hooks (e.g. assert discoverOfficeOutputs is not called
when expected_outputs is present).
- bash.ts execute() becomes a thin pipeline: resolve workdir,
permission scan + ask, then orchestrateArtifacts(run).
- expected_outputs description is tightened with explicit ONLY/NEVER
rules and locked behind a constant so future edits can't silently
re-broaden it.
- Adds bash-prompt.test.ts (chaining text + render snapshots for all
four shell kinds, including the powershell `cmd1; if ($?) { cmd2 }`
hint) and bash-artifact-orchestrator.test.ts (declared changed,
declared unchanged, exact OfficeCLI target, auto-discovery overflow,
read-only no noise, expected_outputs skips discovery).
bash.ts: 891 -> 696 LOC. Existing 226 bash-related tests pass
unchanged; 145 permission / registry / agent tests pass.
Refs #1038.
…uirements Two corrections from the first Codex challenge pass. shellEnv() must run AFTER the orchestrator's before-snapshots. A `shell.env` plugin may mutate files as a side effect (config drops, bundled-tool installs); the previous draft called shellEnv() before orchestrateArtifacts(), which folded any such mutation into "before" and silently dropped the resulting turn-change record. shellEnv now runs inside the runner closure so the before-state reads happen first. ArtifactDeps had typed every Effect dep as `Effect.Effect<X, never, any>`. With `any`, real service requirements (InstanceState.context for assertExternalDirectoryEffect, ChildProcessSpawner for resolveExecutionPath via cygpath, TurnChange.Service for record*) never propagated to the orchestrator's return type, so wiring it into a runtime missing one of those services would crash at run time instead of failing to typecheck. Replaced with one generic `DepR` parameter; `ArtifactDeps<DepR>` and `orchestrateArtifacts< RunR, DepR>` now bubble the union to the caller. Adds a regression test that asserts the orchestrator yields its before-snapshot reads before invoking the runner, so future re-orderings cannot reintroduce the shellEnv timing bug silently. Refs #1038.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (6)
💤 Files with no reviewable changes (1)
🚧 Files skipped from review as they are similar to previous changes (4)
📝 WalkthroughWalkthroughThis PR extracts artifact capture and output-discovery from the Bash tool into a new Effect-based orchestrator, moves prompt/parameter rendering to bash/prompt, adds shell-kind helpers, updates bash.ts to call the orchestrator, clarifies the bash tool description, and adds tests for the orchestrator and prompt modules. ChangesBash artifact orchestration refactoring
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request refactors the bash tool by extracting artifact orchestration into bash-artifact-orchestrator.ts and prompt-related logic into bash/prompt.ts, accompanied by new unit tests. The feedback suggests removing redundant circular self-exports in id.ts and prompt.ts, simplifying the errorCode retrieval in the orchestrator, and replacing AbortSignal.any([]) with new AbortController().signal in tests to ensure compatibility with older Node.js environments.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
packages/opencode/src/tool/bash/prompt.ts (1)
21-26: ⚡ Quick winUse the shared shell taxonomy to avoid literal drift.
chainingForcurrently hardcodes shell literals even thoughpackages/opencode/src/tool/bash/id.tsalready owns normalization. Routing throughtoKindkeeps prompt branching aligned with a single source of truth.Proposed refactor
+import { BashID } from "./id" + export function chainingFor(name: string) { - if (name === "powershell") return POWERSHELL_CHAIN - if (name === "pwsh") return PWSH_CHAIN - if (name === "cmd") return CMD_CHAIN - return BASH_CHAIN + switch (BashID.toKind(name)) { + case "powershell": + return POWERSHELL_CHAIN + case "pwsh": + return PWSH_CHAIN + case "cmd": + return CMD_CHAIN + case "bash": + default: + return BASH_CHAIN + } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/opencode/src/tool/bash/prompt.ts` around lines 21 - 26, chainingFor currently compares raw name strings; instead normalize via the shared toKind function from id.ts before branching so the prompt logic follows the single source of truth. Update chainingFor to call toKind(name) and switch on its result, returning POWERSHELL_CHAIN, PWSH_CHAIN, CMD_CHAIN or BASH_CHAIN as before; reference the toKind symbol and the existing chain constants (POWERSHELL_CHAIN, PWSH_CHAIN, CMD_CHAIN, BASH_CHAIN) so you avoid literal drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@packages/opencode/src/tool/bash/prompt.ts`:
- Around line 21-26: chainingFor currently compares raw name strings; instead
normalize via the shared toKind function from id.ts before branching so the
prompt logic follows the single source of truth. Update chainingFor to call
toKind(name) and switch on its result, returning POWERSHELL_CHAIN, PWSH_CHAIN,
CMD_CHAIN or BASH_CHAIN as before; reference the toKind symbol and the existing
chain constants (POWERSHELL_CHAIN, PWSH_CHAIN, CMD_CHAIN, BASH_CHAIN) so you
avoid literal drift.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 315baa62-1be4-4bba-ac19-399f30824ca4
📒 Files selected for processing (7)
packages/opencode/src/tool/bash-artifact-orchestrator.tspackages/opencode/src/tool/bash.tspackages/opencode/src/tool/bash.txtpackages/opencode/src/tool/bash/id.tspackages/opencode/src/tool/bash/prompt.tspackages/opencode/test/tool/bash-artifact-orchestrator.test.tspackages/opencode/test/tool/bash-prompt.test.ts
- drop redundant `export * as BashID` / `BashPrompt` self-namespace re-exports; nothing consumes them and bundlers warn on the cycle. - route `chainingFor` through `toKind` so the shell taxonomy in `bash/id` is the single source of truth (no literal drift). - collapse the `"errorCode" in tracked.before` check to direct optional access — `TrackedOutputState.errorCode` is plain `string | undefined`. - swap `AbortSignal.any([])` for `new AbortController().signal` in the orchestrator test for broader Node runtime compat.
The reviewer flagged a conflict: "binary outputs" sat on the affirmative side while a bare "builds" sat on the DO NOT side, so a build that emits a specific named binary or report could be misread as out of scope and slip past turn-change recording (auto-discovery only catches Office outputs). Tighten the wording: keep routine compile/install loops in DO NOT, but explicitly call out that a build emitting a specific named deliverable counts as a deliverable artifact and must be listed. Pin both halves with new prompt tests.
unit-windows-opencode-server-tools was failing on the four orchestrator
test cases because the mock state dictionary was keyed by POSIX paths
("/tmp/work/foo.docx") while the orchestrator internally rewrites
inputs through AppFileSystem.normalizePath — on Windows that produces
"D:\\tmp\\work\\foo.docx", so every state lookup missed and the
artifact came back exists:false.
Route the test path constants through the same normalizePath helper so
the keys match on every runner; the call is a no-op on POSIX so the
local pass-rate is unchanged.
The two diffFull timeouts and the loop-finish timeout in the same run
are pre-existing flakes on unrelated paths; leaving them for a
separate follow-up.
bash/id.ts already exports `ToolID = "bash"` as the compatibility-id single source of truth, but the two literal "bash" sites in bash.ts were left untouched at extraction time. Replace them so the saved- permission key and the Tool.define id both trace back to the constant — future shell-kind work can rename internal identifiers without risking a literal-vs-constant drift on the public surface.
Summary
Refactor PawWork's 891-line
bash.tsinto a thinner pipeline by extracting prompt rendering, internal shell-kind taxonomy, and PawWork-specific artifact orchestration into focused modules. Keeps the public tool id, schema name, and permission key asbashfor backward compatibility.Plan and review history live on issue #1038 (initial plan, revised plan after first review).
Changes
packages/opencode/src/tool/bash/prompt.tsownsParameters,Limits,parameterSchema(),render(), and per-shell chaining text forbash/pwsh/powershell/cmd. Powershell 5.1 keeps thecmd1; if ($?) { cmd2 }advice; pwsh 7+ gets&&plus a 5.1 fallback hint; cmd gets&&framed for cmd.exe.packages/opencode/src/tool/bash/id.tscarries the internalKindtaxonomy andToolID = "bash"compatibility constant. File header explicitly states this is not a public rename.packages/opencode/src/tool/bash-artifact-orchestrator.tsownsexpected_outputsresolution, exact OfficeCLI target snapshots, auto-discovery before/after diff, andrecordWrite/recordUncaptureddispatch. Dependencies arrive via a typedArtifactDeps<DepR>interface — tests can mock individual hooks (e.g. assertdiscoverOfficeOutputsis not called whenexpected_outputsis present).bash.tsexecute()becomes a thin pipeline: validate → resolveworkdir→ permission scan + ask →orchestrateArtifacts(input, run, deps). 891 → 696 LOC.bash.txtfirst line corrected: the previous "persistent shell session" wording was inaccurate — each call spawns a fresh process. New wording is explicit about that and points the model atworkdir.expected_outputsdescription tightened with explicitONLY/DO NOT set it forrules and locked behind an exported constant (EXPECTED_OUTPUTS_DESCRIPTION) so a future edit can't silently re-broaden the rule.Out of scope (deferred to follow-ups)
bash→shell. Tool name, schema, and"bash"permission key unchanged.cmd.exe-specific spawn branch andCMD_FILESpermission scan. Documented for a follow-up.&call-operator gap incommandHead()(bash-office-artifacts.ts). Surfaced by Codex review on this branch but not introduced here; will track separately.Upstream alignment
Mirrors the
anomalyco/opencodesplit (tool/shell.ts+tool/shell/prompt.ts+tool/shell/id.ts) at the structural level, but keeps the PawWork-specific Office artifact path intact and the public tool name unchanged.Verification
Local in worktree:
bun run typecheck— pass.bun test test/tool/ test/permission/— 648 pass, 1 skip, 0 fail across 33 files.bash.test.ts,bash-office-artifacts.test.ts,bash-write-heuristic.test.ts,bash-env-source.test.ts, plus newbash-prompt.test.tsandbash-artifact-orchestrator.test.ts) — 226 pass, 0 fail.Review trail
Codex challenge pass against the initial commit flagged two issues, both fixed in
fa34fdf:shellEnv()beforeorchestrateArtifacts(), which would have folded anyshell.envplugin file side effect into the "before" snapshot. Moved into the runner closure so before-state reads happen first. New test (before-snapshot precedes runner — protects against shell.env-style side-effect ordering regressions) locks the invariant.Effect.Effect<X, never, any>, hiding real service requirements. Replaced with a singleDepRgeneric soorchestrateArtifacts<RunR, DepR>returnsEffect<X, never, RunR | DepR>— service requirements now propagate at the type level.Codex re-review of the fixup confirmed the corrections are effective and no new issues introduced.
Refs #1038.
Test plan
bun run typecheckpasses locally.test/tool/andtest/permission/pass locally.bashtool id, schema, and"bash"permission key unchanged.dev-targeted checks.Summary by CodeRabbit
New Features
Refactor
Bug Fixes
Documentation
Tests