Skip to content

fix(skills): use command -v instead of which for codex detection#1197

Open
mvanhorn wants to merge 2 commits intogarrytan:mainfrom
mvanhorn:mvanhorn/fix-codex-command-v
Open

fix(skills): use command -v instead of which for codex detection#1197
mvanhorn wants to merge 2 commits intogarrytan:mainfrom
mvanhorn:mvanhorn/fix-codex-command-v

Conversation

@mvanhorn
Copy link
Copy Markdown
Contributor

Summary

Replace which codex with command -v codex in the Codex availability gate used by /review, /ship, /office-hours, /plan-ceo-review, /plan-eng-review, /plan-design-review, /plan-devex-review, /design-consultation, /design-review, and /codex itself.

Why this matters

which is not POSIX. Its behavior varies across platforms:

  • Debian and Ubuntu ship /usr/bin/which from debianutils, which scans PATH
  • Some systems alias which to a bash function that reads the hash table
  • macOS ships a small csh-compatible script
  • Busybox does not ship which at all

In non-interactive shells spawned by tool runners, PATH hydration and hash state can diverge in ways that cause which codex to miss a binary that command -v codex and type codex resolve correctly. The reporter on #1193 hit this on Ubuntu/Pop!_OS with @openai/codex@0.125.0 installed under nvm:

$ which codex 2>&1
/bin/bash: line 1: codex: command not found

$ command -v codex
/home/btallman/.nvm/versions/node/v24.12.0/bin/codex

When the availability probe misses, /review silently falls back to "Claude adversarial only" without telling the user. The reporter approved a PR that way.

command -v is POSIX-specified and uses the same resolution the shell itself uses to run a command, so a positive result means the binary is runnable.

Changes

Source files edited directly:

  • codex/SKILL.md.tmpl line 45 (one callsite in the template used by /codex)
  • scripts/resolvers/review.ts lines 269, 428, 557 (three callsites consumed by /review and /ship)
  • scripts/resolvers/design.ts lines 13, 515, 691 (three callsites consumed by design/review skills)
  • test/skill-validation.test.ts line 1320 (assertion updated to match new literal)
  • test/skill-e2e-plan.test.ts line 779 (assertion updated to match new literal)

Regenerated via bun run gen:skill-docs --host all:

  • codex/SKILL.md, review/SKILL.md, ship/SKILL.md, office-hours/SKILL.md, plan-ceo-review/SKILL.md, plan-eng-review/SKILL.md, plan-design-review/SKILL.md, plan-devex-review/SKILL.md, design-consultation/SKILL.md, design-review/SKILL.md
  • test/fixtures/golden/claude-ship-SKILL.md and test/fixtures/golden/factory-ship-SKILL.md refreshed from regenerated ship output (the codex-ship golden was already byte-identical to the regenerated .agents/ output)

gstack already uses command -v codex in autoplan/SKILL.md.tmpl:250 and test/helpers/providers/gpt.ts:20. This aligns the rest of the codebase with that existing convention.

Not in scope

  • pair-agent/SKILL.md.tmpl:131 and pair-agent/SKILL.md:1177 use the same which ngrok 2>/dev/null shape. The reporter's issue is about codex detection, so ngrok is left alone. Happy to open a follow-up PR for it.
  • test/fixtures/golden-ship-claude.md (top-level, not under golden/) still contains which codex but is not referenced by any test and was last touched in 2026-04 via a community PR. Treated as an orphan artifact.

Testing

bun run gen:skill-docs --dry-run --host all     # exit 0, all files fresh
bun run skill:check                              # 43/43 skills OK, all hosts fresh
bun test test/host-config.test.ts -t "golden"    # 3/3 golden regression tests pass
bun test test/skill-validation.test.ts -t "codex/SKILL.md"  # 12/12 pass

Three unrelated tests fail on both upstream/main and this branch (no compiled binaries > 2MB, read_secret_to_env > rejects invalid var names, Opus 4.7 overlay > Fan out). Verified by running them on upstream/main HEAD before branching.

Fixes #1193

which is not POSIX; its behavior varies across platforms (Debian debianutils,
macOS csh script, bash function alias, Busybox missing entirely). In
non-interactive shells spawned by tool runners, PATH and hash state can
diverge so which codex returns "command not found" for a binary that is on
PATH and runnable. The Codex gate then silently skips with CODEX_NOT_AVAILABLE.

Replace which codex with command -v codex across:
- codex/SKILL.md.tmpl (single source of truth for the codex binary check)
- scripts/resolvers/review.ts (3 callsites)
- scripts/resolvers/design.ts (3 callsites)

command -v is POSIX-specified and uses the shell's own resolution, so if
command -v X returns a path, X is runnable.

Update two test assertions that checked for the literal string "which codex"
in test/skill-validation.test.ts and test/skill-e2e-plan.test.ts.

autoplan/ already uses the command -v codex pattern; this aligns the rest of
the codebase with the existing convention.

Fixes garrytan#1193
bun run gen:skill-docs --host all output following the which -> command -v
swap in codex/SKILL.md.tmpl and scripts/resolvers/*.ts. Refreshed
test/fixtures/golden/claude-ship-SKILL.md and factory-ship-SKILL.md from
the regenerated ship SKILL.md for each host. codex-ship-SKILL.md was
already byte-identical to the regenerated .agents/ output and did not
need a refresh.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

which codex detection in review/ship/office-hours/plan-ceo-review/design skills unreliable — use command -v instead

1 participant