Skip to content

refactor(opencode): split bash tool into prompt, id, and artifact orchestrator#1046

Merged
Astro-Han merged 6 commits into
devfrom
claude/i1038-bash-prompt-orchestrator
Jun 1, 2026
Merged

refactor(opencode): split bash tool into prompt, id, and artifact orchestrator#1046
Astro-Han merged 6 commits into
devfrom
claude/i1038-bash-prompt-orchestrator

Conversation

@Astro-Han
Copy link
Copy Markdown
Owner

@Astro-Han Astro-Han commented Jun 1, 2026

Summary

Refactor PawWork's 891-line bash.ts into a thinner pipeline by extracting prompt rendering, internal shell-kind taxonomy, and PawWork-specific artifact orchestration into focused modules. Keeps the public tool id, schema name, and permission key as bash for backward compatibility.

Plan and review history live on issue #1038 (initial plan, revised plan after first review).

Changes

  • New packages/opencode/src/tool/bash/prompt.ts owns Parameters, Limits, parameterSchema(), render(), and per-shell chaining text for bash / pwsh / powershell / cmd. Powershell 5.1 keeps the cmd1; if ($?) { cmd2 } advice; pwsh 7+ gets && plus a 5.1 fallback hint; cmd gets && framed for cmd.exe.
  • New packages/opencode/src/tool/bash/id.ts carries the internal Kind taxonomy and ToolID = "bash" compatibility constant. File header explicitly states this is not a public rename.
  • New packages/opencode/src/tool/bash-artifact-orchestrator.ts owns expected_outputs resolution, exact OfficeCLI target snapshots, auto-discovery before/after diff, and recordWrite / recordUncaptured dispatch. Dependencies arrive via a typed ArtifactDeps<DepR> interface — tests can mock individual hooks (e.g. assert discoverOfficeOutputs is not called when expected_outputs is present).
  • bash.ts execute() becomes a thin pipeline: validate → resolve workdir → permission scan + ask → orchestrateArtifacts(input, run, deps). 891 → 696 LOC.
  • bash.txt first line corrected: the previous "persistent shell session" wording was inaccurate — each call spawns a fresh process. New wording is explicit about that and points the model at workdir.
  • expected_outputs description tightened with explicit ONLY / DO NOT set it for rules and locked behind an exported constant (EXPECTED_OUTPUTS_DESCRIPTION) so a future edit can't silently re-broaden the rule.

Out of scope (deferred to follow-ups)

  • Public rename bashshell. Tool name, schema, and "bash" permission key unchanged.
  • cmd.exe-specific spawn branch and CMD_FILES permission scan. Documented for a follow-up.
  • Metadata streaming throttle. Separate PR with its own performance motivation and throttle-specific regression tests (PR 2 of the Refactor Bash tool toward upstream shell-aware structure #1038 plan).
  • Tree-sitter per-shell lazy WASM load. Optional follow-up (PR 3 of the Refactor Bash tool toward upstream shell-aware structure #1038 plan).
  • Pre-existing PowerShell & call-operator gap in commandHead() (bash-office-artifacts.ts). Surfaced by Codex review on this branch but not introduced here; will track separately.

Upstream alignment

Mirrors the anomalyco/opencode split (tool/shell.ts + tool/shell/prompt.ts + tool/shell/id.ts) at the structural level, but keeps the PawWork-specific Office artifact path intact and the public tool name unchanged.

Verification

Local in worktree:

  • bun run typecheck — pass.
  • bun test test/tool/ test/permission/ — 648 pass, 1 skip, 0 fail across 33 files.
  • Bash-specific suite (bash.test.ts, bash-office-artifacts.test.ts, bash-write-heuristic.test.ts, bash-env-source.test.ts, plus new bash-prompt.test.ts and bash-artifact-orchestrator.test.ts) — 226 pass, 0 fail.

Review trail

Codex challenge pass against the initial commit flagged two issues, both fixed in fa34fdf:

  1. shellEnv ordering regression. First draft called shellEnv() before orchestrateArtifacts(), which would have folded any shell.env plugin file side effect into the "before" snapshot. Moved into the runner closure so before-state reads happen first. New test (before-snapshot precedes runner — protects against shell.env-style side-effect ordering regressions) locks the invariant.
  2. ArtifactDeps type leakage. Initial draft typed Effect deps as Effect.Effect<X, never, any>, hiding real service requirements. Replaced with a single DepR generic so orchestrateArtifacts<RunR, DepR> returns Effect<X, never, RunR | DepR> — service requirements now propagate at the type level.

Codex re-review of the fixup confirmed the corrections are effective and no new issues introduced.

Refs #1038.

Test plan

  • bun run typecheck passes locally.
  • All tests under test/tool/ and test/permission/ pass locally.
  • No model-visible rename: bash tool id, schema, and "bash" permission key unchanged.
  • CI green on dev-targeted checks.

Summary by CodeRabbit

  • New Features

    • Added automated artifact capture for bash runs, detecting and reporting produced outputs.
  • Refactor

    • Bash execution now delegates artifact orchestration to a centralized mechanism, simplifying run flow.
  • Bug Fixes

    • Clarified Bash tool docs to state each call runs in a fresh process (no persistent shell state).
  • Documentation

    • Improved command parameter schema and added platform-specific chaining guidance.
  • Tests

    • New test suites covering artifact orchestration and prompt rendering/behavior.

Astro-Han added 2 commits June 1, 2026 20:53
Split the 891-line bash tool into a thinner orchestrator and three
focused modules, keeping the public tool id, schema name, and
permission key as "bash" for backward compatibility.

- tool/bash/prompt.ts owns Parameters, Limits, parameterSchema(),
  render(), and per-shell chaining text for bash / pwsh / powershell
  / cmd. The "persistent shell session" wording in bash.txt is
  corrected to reflect one-shot execution.
- tool/bash/id.ts holds the internal Kind taxonomy and ToolID = "bash"
  alias. The file header makes clear this is not a public rename.
- tool/bash-artifact-orchestrator.ts owns expected_outputs resolution,
  exact OfficeCLI target snapshots, auto-discovery before/after diff,
  and recordWrite / recordUncaptured dispatch. Dependencies are
  injected via a typed ArtifactDeps interface so tests can mock
  individual hooks (e.g. assert discoverOfficeOutputs is not called
  when expected_outputs is present).
- bash.ts execute() becomes a thin pipeline: resolve workdir,
  permission scan + ask, then orchestrateArtifacts(run).
- expected_outputs description is tightened with explicit ONLY/NEVER
  rules and locked behind a constant so future edits can't silently
  re-broaden it.
- Adds bash-prompt.test.ts (chaining text + render snapshots for all
  four shell kinds, including the powershell `cmd1; if ($?) { cmd2 }`
  hint) and bash-artifact-orchestrator.test.ts (declared changed,
  declared unchanged, exact OfficeCLI target, auto-discovery overflow,
  read-only no noise, expected_outputs skips discovery).

bash.ts: 891 -> 696 LOC. Existing 226 bash-related tests pass
unchanged; 145 permission / registry / agent tests pass.

Refs #1038.
…uirements

Two corrections from the first Codex challenge pass.

shellEnv() must run AFTER the orchestrator's before-snapshots. A
`shell.env` plugin may mutate files as a side effect (config drops,
bundled-tool installs); the previous draft called shellEnv() before
orchestrateArtifacts(), which folded any such mutation into "before"
and silently dropped the resulting turn-change record. shellEnv now
runs inside the runner closure so the before-state reads happen
first.

ArtifactDeps had typed every Effect dep as `Effect.Effect<X, never,
any>`. With `any`, real service requirements (InstanceState.context
for assertExternalDirectoryEffect, ChildProcessSpawner for
resolveExecutionPath via cygpath, TurnChange.Service for record*)
never propagated to the orchestrator's return type, so wiring it
into a runtime missing one of those services would crash at run
time instead of failing to typecheck. Replaced with one generic
`DepR` parameter; `ArtifactDeps<DepR>` and `orchestrateArtifacts<
RunR, DepR>` now bubble the union to the caller.

Adds a regression test that asserts the orchestrator yields its
before-snapshot reads before invoking the runner, so future
re-orderings cannot reintroduce the shellEnv timing bug silently.

Refs #1038.
@Astro-Han Astro-Han added task Narrow execution, audit, spike, migration, tracking, or upstream follow-up work tech-debt Supplemental cleanup, maintainability, architecture, test, or quality debt context harness Model harness, prompts, tool descriptions, and session mechanics P1 High priority labels Jun 1, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 1, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: b13e41da-2aa0-4f2d-a3d3-b6b411c8b510

📥 Commits

Reviewing files that changed from the base of the PR and between fa34fdf and 1d717b3.

📒 Files selected for processing (6)
  • packages/opencode/src/tool/bash-artifact-orchestrator.ts
  • packages/opencode/src/tool/bash.ts
  • packages/opencode/src/tool/bash/id.ts
  • packages/opencode/src/tool/bash/prompt.ts
  • packages/opencode/test/tool/bash-artifact-orchestrator.test.ts
  • packages/opencode/test/tool/bash-prompt.test.ts
💤 Files with no reviewable changes (1)
  • packages/opencode/src/tool/bash/id.ts
🚧 Files skipped from review as they are similar to previous changes (4)
  • packages/opencode/src/tool/bash-artifact-orchestrator.ts
  • packages/opencode/src/tool/bash.ts
  • packages/opencode/test/tool/bash-prompt.test.ts
  • packages/opencode/test/tool/bash-artifact-orchestrator.test.ts

📝 Walkthrough

Walkthrough

This PR extracts artifact capture and output-discovery from the Bash tool into a new Effect-based orchestrator, moves prompt/parameter rendering to bash/prompt, adds shell-kind helpers, updates bash.ts to call the orchestrator, clarifies the bash tool description, and adds tests for the orchestrator and prompt modules.

Changes

Bash artifact orchestration refactoring

Layer / File(s) Summary
Artifact orchestration core
packages/opencode/src/tool/bash-artifact-orchestrator.ts
orchestrateArtifacts types and implementation: pre-run tracked-state snapshots, normalization/deduplication of paths, exact Office target and optional auto-discovery, output selection and overflow handling, post-run re-read and change detection, recordWrite/recordUncaptured calls, and return of runner result augmented with metadata.artifacts.
Bash tool refactor and wiring
packages/opencode/src/tool/bash.ts
Import Parameters, renderDescription, Limits, and ToolID; remove in-file Parameters schema; build limits and description, assemble ArtifactDeps, delegate orchestration to orchestrateArtifacts, run shellEnv() after before-snapshots, and invoke the runner with the computed env/timeout/description.
Parameter schema and prompt rendering
packages/opencode/src/tool/bash/prompt.ts, packages/opencode/src/tool/bash.txt
Add parameterSchema()/exported Parameters, EXPECTED_OUTPUTS_DESCRIPTION, Limits, chainingFor(name), and render(...) that interpolates bash.txt placeholders and selects shell-specific chaining guidance.
Shell kind taxonomy
packages/opencode/src/tool/bash/id.ts
Define internal shell kinds, isKind/toKind helpers, and export ToolID constant/type fixed to "bash".
Bash tool documentation clarification
packages/opencode/src/tool/bash.txt
Clarifies that each bash tool call executes a single command in a fresh process and that cwd/env do not persist across calls.
Orchestrator test coverage
packages/opencode/test/tool/bash-artifact-orchestrator.test.ts
Bun tests with a mock ArtifactDeps harness covering declared outputs, unchanged state, exact Office targets, discovery overflow, read-only skipping, pre-run ordering, and discovery suppression when expected outputs exist.
Prompt and shell-kind test coverage
packages/opencode/test/tool/bash-prompt.test.ts
Bun tests for chainingFor differences across shells, render interpolation and placeholder absence, and strict phrasing in EXPECTED_OUTPUTS_DESCRIPTION.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Possibly related PRs

  • Astro-Han/pawwork#561: Related prior work on expected_outputs pre/post artifact tracking and recordWrite behavior.
  • Astro-Han/pawwork#1037: Overlaps on OfficeCLI auto-discovery and artifact capture/refactoring.
  • Astro-Han/pawwork#331: Related OfficeCLI shellEnv()/OFFICECLI_SKIP_UPDATE handling surfaced in rewritten wiring.

Suggested labels

enhancement, upstream, P2

Poem

🐰 I hopped through code to split each part,
Orchestrated artifacts with tidy art,
Prompts and kinds now leap in line,
Tests keep watch while shells refine.
A tidy refactor, carrot-sweet and smart.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main refactoring effort: splitting the bash tool module into three focused submodules (prompt, id, and artifact orchestrator).
Description check ✅ Passed The description is comprehensive and complete, covering summary, changes, rationale, scope boundaries, verification steps, and review history as required by the template.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/i1038-bash-prompt-orchestrator

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested priority: P2 (includes non-doc, non-test paths outside the low-risk bucket).

P1/P0 are reserved for maintainer confirmation. Please relabel manually if this is a release blocker, security issue, data-loss risk, or updater/runtime failure.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the bash tool by extracting artifact orchestration into bash-artifact-orchestrator.ts and prompt-related logic into bash/prompt.ts, accompanied by new unit tests. The feedback suggests removing redundant circular self-exports in id.ts and prompt.ts, simplifying the errorCode retrieval in the orchestrator, and replacing AbortSignal.any([]) with new AbortController().signal in tests to ensure compatibility with older Node.js environments.

Comment thread packages/opencode/src/tool/bash/id.ts Outdated
Comment thread packages/opencode/src/tool/bash/prompt.ts Outdated
Comment thread packages/opencode/src/tool/bash-artifact-orchestrator.ts Outdated
Comment thread packages/opencode/test/tool/bash-artifact-orchestrator.test.ts Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
packages/opencode/src/tool/bash/prompt.ts (1)

21-26: ⚡ Quick win

Use the shared shell taxonomy to avoid literal drift.

chainingFor currently hardcodes shell literals even though packages/opencode/src/tool/bash/id.ts already owns normalization. Routing through toKind keeps prompt branching aligned with a single source of truth.

Proposed refactor
+import { BashID } from "./id"
+
 export function chainingFor(name: string) {
-  if (name === "powershell") return POWERSHELL_CHAIN
-  if (name === "pwsh") return PWSH_CHAIN
-  if (name === "cmd") return CMD_CHAIN
-  return BASH_CHAIN
+  switch (BashID.toKind(name)) {
+    case "powershell":
+      return POWERSHELL_CHAIN
+    case "pwsh":
+      return PWSH_CHAIN
+    case "cmd":
+      return CMD_CHAIN
+    case "bash":
+    default:
+      return BASH_CHAIN
+  }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/opencode/src/tool/bash/prompt.ts` around lines 21 - 26, chainingFor
currently compares raw name strings; instead normalize via the shared toKind
function from id.ts before branching so the prompt logic follows the single
source of truth. Update chainingFor to call toKind(name) and switch on its
result, returning POWERSHELL_CHAIN, PWSH_CHAIN, CMD_CHAIN or BASH_CHAIN as
before; reference the toKind symbol and the existing chain constants
(POWERSHELL_CHAIN, PWSH_CHAIN, CMD_CHAIN, BASH_CHAIN) so you avoid literal
drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@packages/opencode/src/tool/bash/prompt.ts`:
- Around line 21-26: chainingFor currently compares raw name strings; instead
normalize via the shared toKind function from id.ts before branching so the
prompt logic follows the single source of truth. Update chainingFor to call
toKind(name) and switch on its result, returning POWERSHELL_CHAIN, PWSH_CHAIN,
CMD_CHAIN or BASH_CHAIN as before; reference the toKind symbol and the existing
chain constants (POWERSHELL_CHAIN, PWSH_CHAIN, CMD_CHAIN, BASH_CHAIN) so you
avoid literal drift.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 315baa62-1be4-4bba-ac19-399f30824ca4

📥 Commits

Reviewing files that changed from the base of the PR and between 973943b and fa34fdf.

📒 Files selected for processing (7)
  • packages/opencode/src/tool/bash-artifact-orchestrator.ts
  • packages/opencode/src/tool/bash.ts
  • packages/opencode/src/tool/bash.txt
  • packages/opencode/src/tool/bash/id.ts
  • packages/opencode/src/tool/bash/prompt.ts
  • packages/opencode/test/tool/bash-artifact-orchestrator.test.ts
  • packages/opencode/test/tool/bash-prompt.test.ts

Astro-Han added 4 commits June 1, 2026 21:41
- drop redundant `export * as BashID` / `BashPrompt` self-namespace
  re-exports; nothing consumes them and bundlers warn on the cycle.
- route `chainingFor` through `toKind` so the shell taxonomy in
  `bash/id` is the single source of truth (no literal drift).
- collapse the `"errorCode" in tracked.before` check to direct optional
  access — `TrackedOutputState.errorCode` is plain `string | undefined`.
- swap `AbortSignal.any([])` for `new AbortController().signal` in the
  orchestrator test for broader Node runtime compat.
The reviewer flagged a conflict: "binary outputs" sat on the affirmative
side while a bare "builds" sat on the DO NOT side, so a build that emits
a specific named binary or report could be misread as out of scope and
slip past turn-change recording (auto-discovery only catches Office
outputs).

Tighten the wording: keep routine compile/install loops in DO NOT, but
explicitly call out that a build emitting a specific named deliverable
counts as a deliverable artifact and must be listed. Pin both halves
with new prompt tests.
unit-windows-opencode-server-tools was failing on the four orchestrator
test cases because the mock state dictionary was keyed by POSIX paths
("/tmp/work/foo.docx") while the orchestrator internally rewrites
inputs through AppFileSystem.normalizePath — on Windows that produces
"D:\\tmp\\work\\foo.docx", so every state lookup missed and the
artifact came back exists:false.

Route the test path constants through the same normalizePath helper so
the keys match on every runner; the call is a no-op on POSIX so the
local pass-rate is unchanged.

The two diffFull timeouts and the loop-finish timeout in the same run
are pre-existing flakes on unrelated paths; leaving them for a
separate follow-up.
bash/id.ts already exports `ToolID = "bash"` as the compatibility-id
single source of truth, but the two literal "bash" sites in bash.ts
were left untouched at extraction time. Replace them so the saved-
permission key and the Tool.define id both trace back to the constant —
future shell-kind work can rename internal identifiers without
risking a literal-vs-constant drift on the public surface.
@Astro-Han Astro-Han merged commit 93f245d into dev Jun 1, 2026
35 checks passed
@Astro-Han Astro-Han deleted the claude/i1038-bash-prompt-orchestrator branch June 1, 2026 14:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

harness Model harness, prompts, tool descriptions, and session mechanics P1 High priority task Narrow execution, audit, spike, migration, tracking, or upstream follow-up work tech-debt Supplemental cleanup, maintainability, architecture, test, or quality debt context

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant