
fix: capture undeclared Office bash outputs #1037

Merged
Astro-Han merged 12 commits into dev from codex/i1014-office-artifact-capture
Jun 1, 2026

Conversation

@Astro-Han
Owner

@Astro-Han Astro-Han commented May 31, 2026

Summary

Capture undeclared Office outputs created by Bash/OfficeCLI through the existing turn-change path.

  • Add bounded Bash output auto-discovery for .docx, .xlsx, and .pptx when expected_outputs is omitted.
  • Keep declared expected_outputs as the highest-confidence path.
  • Preserve opaque binary/large file states in turn-change so modified Office files remain visible.
  • Narrow officecli write detection to mutating subcommands and keep read-only commands quiet.
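
The ordering in the bullets above can be sketched as a small decision function. This is a hedged illustration rather than the code in this PR: `chooseCaptureMode`, `BashCall`, and `CaptureMode` are hypothetical names, the write heuristic is injected instead of guessed, and treating an empty `expectedOutputs` array the same as an omitted one is an assumption.

```typescript
// Hypothetical names; only `expected_outputs` itself comes from the PR text.
type CaptureMode = "declared" | "auto-discover" | "none";

interface BashCall {
  command: string;
  expectedOutputs?: string[]; // stands in for the tool's `expected_outputs`
}

// `isLikelyWrite` is passed in so the sketch does not guess at the real heuristic.
function chooseCaptureMode(
  call: BashCall,
  isLikelyWrite: (command: string) => boolean,
): CaptureMode {
  // Declared outputs win: they are the highest-confidence path.
  // Assumption: an empty list is treated like an omitted one.
  if (call.expectedOutputs !== undefined && call.expectedOutputs.length > 0) {
    return "declared";
  }
  // Otherwise only write-like commands trigger bounded discovery.
  if (isLikelyWrite(call.command)) return "auto-discover";
  // Read-only commands stay quiet: no artifacts, no uncaptured marker.
  return "none";
}
```

Keeping the heuristic as a parameter makes the priority order testable on its own, independent of how write-like commands are actually recognized.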

Why

OfficeCLI can create or modify Office files through Bash without declaring expected_outputs, which made those deliverables invisible to last-turn changes and session artifacts. The fix keeps Bash as the model-facing tool and registers discovered Office outputs through turnChange.recordWrite, while preserving the existing uncaptured marker for any unbounded side effects.

Related Issue

Closes #1014.

Human Review Status

Pending

Review Focus

Please focus on the Bash auto-discovery boundaries: likely-write gating, traversal budgets, ignored paths, workdir scoping, mixed uncaptured behavior, and binary/non-expandable Office display.

Risk Notes

Auto-discovery is intentionally conservative. Large directories, ignored paths, external paths reached only by command-internal cd, or over-budget scans degrade to the existing uncaptured marker instead of guaranteeing capture.
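
A minimal sketch of what a bounded scan with an overflow signal could look like. The budget fields, the limits, and every name here (`scanOfficeFiles`, `Budget`, `Entry`) are assumptions for illustration; the PR text does not give the real values. Directory listing and ignore matching are injected so the sketch stays independent of the file system, and treating an over-depth directory as overflow (rather than silently skipping it) is a deliberately conservative assumption.

```typescript
// Hypothetical shapes; the real budget fields and limits are not stated in this PR text.
interface Entry { name: string; isDirectory: boolean }
interface Budget { maxDepth: number; maxDirs: number; maxFiles: number }
interface ScanResult { paths: string[]; overflowed: boolean }

const OFFICE_EXTENSIONS = [".docx", ".xlsx", ".pptx"];

function scanOfficeFiles(
  root: string,
  listDir: (dir: string) => Entry[],
  isIgnored: (path: string) => boolean,
  budget: Budget,
): ScanResult {
  const found = new Set<string>();
  let dirs = 0;
  let files = 0;
  let overflowed = false;
  const queue: Array<{ dir: string; depth: number }> = [{ dir: root, depth: 0 }];

  while (queue.length > 0 && !overflowed) {
    const { dir, depth } = queue.shift()!;
    if (++dirs > budget.maxDirs) { overflowed = true; break; }
    for (const entry of listDir(dir)) {
      const full = `${dir}/${entry.name}`;
      if (isIgnored(full)) continue; // ignored paths are never visited or counted
      if (entry.isDirectory) {
        // Assumption: a directory beyond the depth limit counts as overflow.
        if (depth + 1 > budget.maxDepth) { overflowed = true; break; }
        queue.push({ dir: full, depth: depth + 1 });
      } else {
        if (++files > budget.maxFiles) { overflowed = true; break; }
        const lower = entry.name.toLowerCase();
        if (OFFICE_EXTENSIONS.some((ext) => lower.endsWith(ext))) found.add(full);
      }
    }
  }
  // Deduplicated (via the Set) and sorted, plus the overflow flag for the caller.
  return { paths: Array.from(found).sort(), overflowed };
}
```

A caller that sees `overflowed: true` would fall back to the uncaptured marker instead of trusting the partial list.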

No visible UI or copy changed, so no screenshot or recording was needed.

How To Verify

Bash/tool regression tests: 193 passed
Command: bun test test/tool/bash.test.ts test/tool/bash-write-heuristic.test.ts test/session/turn-change-aggregate.test.ts

Typecheck: passed
Command: bun run typecheck

Diff check: no whitespace errors
Command: git diff --check origin/dev...HEAD

Review loop:
- xhigh first pass found one Office modified-file visibility P1 and three P2 issues; fixed.
- Claude first pass flagged the opaque binary turn-change state issue; fixed.
- DeepSeek V4 Pro first pass flagged the same opaque-state issue plus bounded-read/budget nits; fixed.
- xhigh second pass found no P0/P1/P2/P3 findings.

Screenshots or Recordings

Not applicable; no visible UI changes.

Checklist

How to use this checklist:

  • Tick a box by replacing [ ] with [x]. Do not edit, add, or remove items.
  • The bot-applied label items can only be honestly ticked AFTER the PR is opened and the labeler / priority-triage bots have run — return to the PR description and tick them then.
  • Most items are required. The few that are conditional are explicitly marked (conditional); for those, leave unticked if they truly do not apply and explain why in Risk Notes. All other items must be ticked before requesting human review.
  • Type label — this PR carries exactly one of bug, enhancement, task, documentation. Type labels are author-added; the labeler bot does NOT assign them. Add the label in the GitHub UI, then tick this.
  • Routing labels — this PR carries at least one of app, ui, platform, harness, ci. The labeler bot assigns these on PR open based on changed paths. Confirm the bot's choice (or override if wrong), then tick this.
  • Priority label — this PR carries exactly one of P0, P1, P2, P3. The priority-triage bot suggests one on PR open. Confirm or override, then tick this.
  • Human Review Status above is set to Pending, Approved by @<reviewer>, or Not required: <reason> (default is Pending; "not required" is restricted to bot-authored low-risk PRs).
  • I linked the related issue, or stated in Summary why there is no issue.
  • I described the review focus and any meaningful risks.
  • I replaced the example block in How To Verify with the real verification steps and the key result for each.
  • I did not introduce unrelated refactors, dependencies, generated files, or file changes beyond the stated scope.
  • (conditional) I manually checked visible UI or copy changes when needed, with screenshots or recordings. Leave unticked only if no visible UI or copy changed.
  • (conditional) I considered macOS and Windows impact for platform, packaging, updater, signing, paths, shell, or permissions changes. Leave unticked only if no platform/packaging surface was touched.
  • (conditional) I called out docs, release notes, dependencies, permissions, credentials, deletion behavior, generated content, or local file changes when relevant. Leave unticked only if none of those surfaces was touched.
  • I reviewed the final diff for unrelated changes and suspicious dependency changes.
  • I am targeting dev, and my PR title and commit messages use Conventional Commits in English.

Summary by CodeRabbit

  • New Features

    • Automatic detection and tracking of Office document outputs from shell commands without requiring explicit declarations
    • Extended recognition of Office CLI write operations
  • Improvements

    • Enhanced state consistency for non-restorable files
    • Performance safeguards for automatic output discovery

@Astro-Han added the bug (Something isn't working), P2 (Medium priority), and harness (Model harness, prompts, tool descriptions, and session mechanics) labels May 31, 2026
@coderabbitai
Contributor

coderabbitai Bot commented May 31, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: c7788600-e1ba-42f8-bd96-9b8796caf651

📥 Commits

Reviewing files that changed from the base of the PR and between a5c7b7d and 97e621c.

📒 Files selected for processing (7)
  • packages/opencode/src/tool/bash-office-artifacts.ts
  • packages/opencode/src/tool/bash-output-capture.ts
  • packages/opencode/src/tool/bash-write-heuristic.ts
  • packages/opencode/src/tool/bash.ts
  • packages/opencode/test/tool/bash-office-artifacts.test.ts
  • packages/opencode/test/tool/bash-write-heuristic.test.ts
  • packages/opencode/test/tool/bash.test.ts

Walkthrough

This PR implements automatic discovery and visibility of Office file outputs (.docx, .xlsx, .pptx) created by OfficeCLI commands in the bash tool. When expected_outputs is not provided but the command is identified as write-like, the system recursively scans for new/modified Office files, tracks them as binary artifacts, and includes them in turn-change records.
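
The "new/modified" classification described here amounts to comparing a snapshot taken before the command with one taken after it. The following is a sketch under stated assumptions, not the PR's implementation: `diffSnapshots` and `Change` are hypothetical names, each snapshot is assumed to map a path to a content hash, and deletions are left out of scope.

```typescript
type ChangeKind = "added" | "modified";
interface Change { path: string; kind: ChangeKind }

// `before` and `after` map path -> content hash, captured around the command run.
function diffSnapshots(
  before: Map<string, string>,
  after: Map<string, string>,
): Change[] {
  const changes: Change[] = [];
  after.forEach((hash, path) => {
    const previous = before.get(path);
    if (previous === undefined) {
      changes.push({ path, kind: "added" }); // absent before, present after
    } else if (previous !== hash) {
      changes.push({ path, kind: "modified" }); // present in both, content differs
    }
    // Unchanged files are dropped so only real changes become visible artifacts.
  });
  // Plain code-unit ordering keeps the result deterministic.
  return changes.sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0));
}
```

An empty result for a write-like command is the case where, per the walkthrough, an uncaptured turn-change would be recorded instead.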

Changes

Office Output Auto-Discovery

| Layer / File(s) | Summary |
| --- | --- |
| OfficeCLI write command detection: packages/opencode/src/tool/bash-write-heuristic.ts, packages/opencode/test/tool/bash-write-heuristic.test.ts | isLikelyWriteCommand is extended with an officeCliWriteCommands allowlist and isOfficeCli helper. When the parsed command head is officecli* and the next segment matches an allowed subcommand (create, add, set, import, close), the function returns true. Test cases validate that document-manipulation subcommands are recognized as writes, while informational subcommands (version, help, view, get, query, validate, batch) are excluded. |
| Office discovery infrastructure and binary tracking: packages/opencode/src/tool/bash.ts (constants, helpers, discoverOfficeOutputs, readTrackedState) | Office extension allowlist and AUTO_DISCOVERY_BUDGET (by depth/time/dirs/files/captures) are defined. Helper functions normalize discovery paths and check for Office file extensions. discoverOfficeOutputs recursively scans from workdir respecting ignore rules and budget, returning deduped sorted paths plus an overflowed flag. readTrackedState is updated to hash Office outputs as binary content (or apply a "large" marker when oversized) instead of generic tracked-output hashing. |
| Auto-discovery in command execution pipeline: packages/opencode/src/tool/bash.ts (execution flow) | trackedOutputs resolution runs with concurrency: 4. When no expected_outputs are supplied, a shouldAutoDiscoverOutputs condition (requires ctx.messageID and write-like heuristic) triggers pre-computed discovery; overflow marks uncaptured and returns early. After command execution, outputs are re-discovered post-command, and artifacts are recomputed for the chosen set. Visible artifacts are filtered to only changed items when auto-discovery is active; if no changes exist, an uncaptured turn-change is recorded before returning. |
| State restorable bookkeeping: packages/opencode/src/session/turn-change.ts | prepareState adds a guard marking states as non-restorable when content is undefined but non-restorable indicators (hash, restorable === false, binary, large) are already set, preventing incorrect restoration during bookkeeping. |
| Test coverage for auto-discovery behavior: packages/opencode/test/tool/bash.test.ts | Extensive test cases validate that Office files are auto-discovered and recorded as added/modified when written without expected_outputs, persist alongside uncaptured markers in mixed-output scenarios, use the resolved workdir as the traversal root, are excluded when under ignored paths (e.g., node_modules), produce uncaptured unions when budget is exhausted, and that read-only OfficeCLI commands do not record artifacts or uncaptured markers. |
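
The binary tracking and the restorable guard summarized above fit together roughly as follows. This is a hedged sketch: `FileState`, `officeState`, and `markNonRestorable` are hypothetical names, the hash function and the size limit are injected because the real ones are not given here, and the field names simply mirror the indicators listed in the summary (hash, restorable, binary, large).

```typescript
// Hypothetical state shape; field names mirror the indicators named in the summary.
interface FileState {
  content?: string;
  hash?: string;
  binary?: boolean;
  large?: boolean;
  restorable?: boolean;
}

// Office files are opaque: keep a hash (or a "large" marker), never the content.
function officeState(
  bytes: Uint8Array,
  hashBytes: (b: Uint8Array) => string, // injected; the real hash is not specified here
  maxHashedBytes: number,               // assumed limit
): FileState {
  if (bytes.byteLength > maxHashedBytes) return { large: true };
  return { hash: hashBytes(bytes), binary: true };
}

// Guard: a state with no content but with any opaque indicator must not be
// treated as restorable, otherwise a later restore would have nothing to write.
function markNonRestorable(state: FileState): FileState {
  const opaque =
    state.hash !== undefined ||
    state.restorable === false ||
    state.binary === true ||
    state.large === true;
  if (state.content === undefined && opaque) {
    return { ...state, restorable: false };
  }
  return state; // states that carry content are left untouched
}
```

The point of the guard is that a modified Office file still shows up as a change, while undo logic can see that its previous bytes were never stored.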

Sequence Diagram

sequenceDiagram
  participant Ctx as PawWork Context
  participant BashTool as Bash Tool
  participant WriteHeur as Write Heuristic
  participant Discover as Office Discovery
  participant FileIO as File I/O
  participant TurnChange as Turn-Change Recorder
  
  Ctx->>BashTool: execute officecli command
  BashTool->>WriteHeur: isLikelyWriteCommand(cmd)
  WriteHeur-->>BashTool: true
  BashTool->>Discover: auto-discover if messageID + write-like
  Discover->>FileIO: scan workdir for .docx/.xlsx/.pptx
  FileIO-->>Discover: file list (deduped, sorted)
  Discover-->>BashTool: paths + overflowed flag
  BashTool->>FileIO: run command
  FileIO-->>BashTool: execution complete
  BashTool->>FileIO: re-discover Office outputs post-command
  BashTool->>FileIO: read tracked state (binary hash/large marker)
  FileIO-->>BashTool: pre/post file states
  BashTool->>TurnChange: record write changes (only changed artifacts)
  TurnChange-->>Ctx: visible artifacts list

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

  • Astro-Han/pawwork#561: Modifies bash.ts output tracking pipeline with pre/post state hashing and turn-change recording; main PR extends this by adding Office auto-discovery when expected_outputs is omitted.
  • Astro-Han/pawwork#408: Introduces TurnChange tracking and undo-redo system; main PR updates prepareState restorable bookkeeping and integrates tracked write changes into bash.ts execution flow.
  • Astro-Han/pawwork#331: Sets OFFICECLI_SKIP_UPDATE=1 during OfficeCLI bundling; main PR coordinates this behavior in bash.ts when executing OfficeCLI commands.

Poem

🐰 A rabbit hops through Office doors,
Discovering files on forest floors.
No longer lost, these .docx treasures gleam—
Auto-tracked, they join the session dream!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: capturing Office Bash outputs that are not explicitly declared, which aligns with the changeset's primary focus on auto-discovery of Office files. |
| Description check | ✅ Passed | The PR description is comprehensive and follows the template structure with clear Summary, Why, Related Issue, Human Review Status, Review Focus, Risk Notes, and How To Verify sections, with all checklist items completed appropriately. |
| Linked Issues check | ✅ Passed | The code changes directly address issue #1014 by implementing bounded auto-discovery of Office outputs, preserving opaque binary states, narrowing officecli detection, and integrating with existing turnChange.recordWrite machinery. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to the linked issue and PR objectives; no unrelated refactors, dependency changes, or out-of-scope modifications are present in the changeset. |


@github-actions github-actions Bot left a comment


Suggested priority: P2 (includes non-doc, non-test paths outside the low-risk bucket).

P1/P0 are reserved for maintainer confirmation. Please relabel manually if this is a release blocker, security issue, data-loss risk, or updater/runtime failure.


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces auto-discovery of undeclared Microsoft Office outputs (.docx, .xlsx, .pptx) for write-like commands in the bash tool, including support for officecli commands. It also updates state preparation for non-restorable files and adds concurrency limits when reading tracked outputs. The review feedback correctly identifies that the file ignore matching in discoverOfficeOutputs uses the command's working directory instead of the project root, which can lead to incorrect ignore behavior in subdirectories, and provides suggestions to pass the project root to fix this.
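
The ignore-matching concern can be illustrated with a small sketch. It is not the code under review: `relativeTo`, `makeDirIgnore`, and `isIgnoredFrom` are hypothetical helpers, paths are assumed to be absolute POSIX-style strings, and the matcher only checks directory segments. The only point is that the same file can pass or fail an ignore rule depending on which base directory the path is made relative to.

```typescript
// Hypothetical helper: strip `base` from an absolute POSIX-style path.
function relativeTo(base: string, absolute: string): string {
  const prefix = base.endsWith("/") ? base : base + "/";
  return absolute.startsWith(prefix) ? absolute.slice(prefix.length) : absolute;
}

// Hypothetical matcher: a path is ignored if any segment is a listed directory.
function makeDirIgnore(ignoredDirs: string[]): (relativePath: string) => boolean {
  return (relativePath) =>
    relativePath.split("/").some((segment) => ignoredDirs.indexOf(segment) !== -1);
}

// The base directory decides which segments the matcher gets to see.
function isIgnoredFrom(
  base: string,
  absolute: string,
  ignore: (relativePath: string) => boolean,
): boolean {
  return ignore(relativeTo(base, absolute));
}

const ignore = makeDirIgnore(["node_modules"]);
const projectRoot = "/repo";
const workdir = "/repo/node_modules/pkg";
const file = "/repo/node_modules/pkg/out.docx";

// Relative to the workdir the path is just "out.docx", so the rule is missed.
const fromWorkdir = isIgnoredFrom(workdir, file, ignore);
// Relative to the project root the "node_modules" segment is visible.
const fromRoot = isIgnoredFrom(projectRoot, file, ignore);
```

Here `fromWorkdir` is `false` and `fromRoot` is `true`, which is the mismatch the review describes for commands run from a subdirectory.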

3 comment threads on packages/opencode/src/tool/bash.ts (Outdated)
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/opencode/test/tool/bash-write-heuristic.test.ts (1)

60-64: ⚡ Quick win

Cover the officecli.exe branch explicitly.

isOfficeCli() now has a separate officecli.exe path, but these tables only exercise officecli. A Windows regression there would pass this suite unnoticed.

➕ Minimal coverage addition
   "officecli create report.docx",
+  "officecli.exe create report.docx",
   "officecli add report.pptx / --type slide",
@@
   "officecli --version",
+  "officecli.exe --version",
   "officecli help docx",

Also applies to: 122-128
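
To make the two branches concrete, here is a rough sketch of a heuristic with separate `officecli` and `officecli.exe` matches. It is an assumption-laden illustration, not the repository's `isOfficeCli`: the function names and constant are hypothetical, tokenization is naive whitespace splitting with no quoting support, and stripping directory prefixes and lowercasing the executable name are my additions. The subcommand list is the one quoted in the walkthrough.

```typescript
// Write subcommands as listed in the walkthrough; everything else is treated as read-only.
const OFFICE_CLI_WRITE_SUBCOMMANDS = new Set(["create", "add", "set", "import", "close"]);

// Hypothetical helper: match the executable name with or without the Windows suffix.
function isOfficeCliHead(head: string): boolean {
  // Assumption: drop any directory prefix and compare case-insensitively.
  const base = head.split(/[\\/]/).pop()!.toLowerCase();
  return base === "officecli" || base === "officecli.exe";
}

// Naive sketch: whitespace tokenization only, no shell quoting or pipelines.
function isOfficeCliWrite(command: string): boolean {
  const tokens = command.trim().split(/\s+/);
  if (tokens.length < 2 || !isOfficeCliHead(tokens[0])) return false;
  return OFFICE_CLI_WRITE_SUBCOMMANDS.has(tokens[1]);
}
```

With both spellings in the test tables, a regression in either branch would fail a case instead of slipping through.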

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/opencode/test/tool/bash-write-heuristic.test.ts` around lines 60 -
64, The test only exercises the "officecli" branch; update the
bash-write-heuristic.test.ts tests that build the command lists to also include
explicit "officecli.exe" variants (e.g., duplicate the commands "officecli
create report.docx", "officecli add ...", etc. with "officecli.exe" instead) so
the isOfficeCli() windows path is covered; make the same addition for the other
command table later in the file that mirrors lines 122-128 so both branches run.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/opencode/src/tool/bash.ts`:
- Around line 1033-1045: The code currently marks every autoDiscovered run as
uncaptured; compute a boolean like hasUncapturedOutputs (e.g. const
hasUncapturedOutputs = autoDiscovered && artifacts.some(item => !item.changed)
or compare visibleArtifacts.length !== artifacts.length) and use that when
calling turnChange.recordUncaptured instead of plain autoDiscovered; keep the
early return for visibleArtifacts.length === 0 as-is but replace the later if
(autoDiscovered) { yield* turnChange.recordUncaptured(...) } with if
(hasUncapturedOutputs) { yield* turnChange.recordUncaptured({ sessionID:
ctx.sessionID, messageID: ctx.messageID }) } so only runs with genuinely
partial/uncaptured outputs are marked.

---

Nitpick comments:
In `@packages/opencode/test/tool/bash-write-heuristic.test.ts`:
- Around line 60-64: The test only exercises the "officecli" branch; update the
bash-write-heuristic.test.ts tests that build the command lists to also include
explicit "officecli.exe" variants (e.g., duplicate the commands "officecli
create report.docx", "officecli add ...", etc. with "officecli.exe" instead) so
the isOfficeCli() windows path is covered; make the same addition for the other
command table later in the file that mirrors lines 122-128 so both branches run.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: b51e99a0-df49-42f8-95b4-cb3863f9063d

📥 Commits

Reviewing files that changed from the base of the PR and between 05d8ba1 and a5c7b7d.

📒 Files selected for processing (5)
  • packages/opencode/src/session/turn-change.ts
  • packages/opencode/src/tool/bash-write-heuristic.ts
  • packages/opencode/src/tool/bash.ts
  • packages/opencode/test/tool/bash-write-heuristic.test.ts
  • packages/opencode/test/tool/bash.test.ts

Comment thread packages/opencode/src/tool/bash.ts Outdated
@Astro-Han Astro-Han merged commit e74cc18 into dev Jun 1, 2026
38 of 39 checks passed
@Astro-Han Astro-Han deleted the codex/i1014-office-artifact-capture branch June 1, 2026 11:03

Development

Successfully merging this pull request may close these issues.

OfficeCLI outputs are not registered as visible session artifacts
