name: codex
description: Use OpenAI Codex CLI as a secondary AI agent for parallel task execution. Delegate tasks to Codex proactively whenever you identify independent work that can run in parallel: code review, test writing, refactoring, documentation, boilerplate, linting, or any isolated task. Use for second opinions, web search, or when multiple subtasks exist. Trigger on "use codex", "delegate", "second opinion", "use GPT", "parallel agent", or autonomously when you see parallelizable work. Also use `codex exec review` automatically after completing PRs or major code chunks. When stuck on a bug, delegate to Codex for a fresh perspective.

Codex — Secondary Agent

Prerequisites

This skill requires:

  1. Codex CLI installed: npm install -g @openai/codex
  2. Logged in with your own OpenAI account: codex login (uses YOUR subscription — no tokens are included or shared)
  3. Optional config: ~/.codex/config.toml for custom defaults

This skill contains NO credentials. Each user must authenticate with their own OpenAI account.


You have access to the OpenAI Codex CLI as a secondary agent. Codex runs in parallel with you in the same project directory. Its global config lives in ~/.codex/config.toml.

When to delegate

Decide autonomously.

Good candidates: tests, review (codex exec review), docs, boilerplate, isolated refactoring, web search (--search), collaborative debugging, second opinion.

Don't delegate: same files you're working on, tasks depending on your in-progress work, tasks where the user wants direct interaction with you.

Model priority:

  • Critical (critical path) → gpt-5.4 (default, omit -m)
  • Secondary (docs, lint, cleanup) → -m gpt-5
  • Throwaway (quick queries) → -m gpt-5 --ephemeral

Note: Available models depend on the user's OpenAI subscription. The models above were tested with a ChatGPT Plus/Pro subscription. API users may have access to different models (o3, o4-mini, gpt-4.1, etc.). Run codex exec --full-auto -m <model> "test" to verify which models work with your account.
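
The tier-to-flag mapping above can be sketched as a small lookup. The tier names and model IDs are taken from the list; the names `MODEL_FLAGS` and `model_flags` are hypothetical and only illustrate one way to keep the choice in a single place.

```python
# Tier names and model IDs come from the list above.
# MODEL_FLAGS and model_flags are hypothetical helper names.
MODEL_FLAGS = {
    "critical": [],                              # default model, omit -m
    "secondary": ["-m", "gpt-5"],
    "throwaway": ["-m", "gpt-5", "--ephemeral"],
}


def model_flags(tier: str) -> list[str]:
    """Return the extra `codex exec` flags for a task tier."""
    try:
        return list(MODEL_FLAGS[tier])
    except KeyError:
        raise ValueError(f"unknown tier: {tier!r}") from None
```

Returning a copy keeps callers from mutating the shared table when they append further flags.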

Shared memory — Vault (optional)

If ctxvault is installed, use it as a bidirectional knowledge bridge.

Before launching Codex — search for relevant context and inject it into the prompt:

VAULT_CONTEXT=$(ctxvault query <vault-name> "task topic" 2>/dev/null | head -30)
CONVENTIONS=$(head -50 CLAUDE.md 2>/dev/null || echo "")

After Codex finishes — save useful discoveries to the vault:

  • debug-* — Non-obvious bugs resolved
  • lesson-* — Approaches that worked (or didn't)
  • reference-* — Docs/APIs discovered with --search
  • decision-* — Architectural decisions

If ctxvault is not installed, skip this section. Inject CLAUDE.md conventions directly into the prompt instead.
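
A minimal sketch of the injection step, assuming the `ctxvault query <vault-name> "task topic"` invocation shown above. The function names and the `CONTEXT:` label are assumptions for illustration; the templates in this document only define `CONVENTIONS:` and `TASK:`.

```python
import shutil
import subprocess
from pathlib import Path


def vault_context(vault_name: str, topic: str, max_lines: int = 30) -> str:
    """Return up to max_lines of `ctxvault query` output, or "" if unavailable."""
    if shutil.which("ctxvault") is None:
        return ""  # ctxvault not installed: skip the vault entirely
    result = subprocess.run(
        ["ctxvault", "query", vault_name, topic],
        capture_output=True, text=True, check=False,
    )
    return "\n".join(result.stdout.splitlines()[:max_lines])


def conventions(path: str = "CLAUDE.md", max_lines: int = 50) -> str:
    """Return the first max_lines of the conventions file, or "" if missing."""
    p = Path(path)
    if not p.is_file():
        return ""
    return "\n".join(p.read_text(encoding="utf-8").splitlines()[:max_lines])


def build_prompt(task: str, conv: str = "", context: str = "") -> str:
    """Assemble the prompt, skipping sections that are empty."""
    parts = []
    if conv:
        parts.append(f"CONVENTIONS: {conv}")
    if context:
        parts.append(f"CONTEXT: {context}")  # label is an assumption
    parts.append(f"TASK: {task}")
    return "\n".join(parts)
```

When ctxvault is absent, `build_prompt(task, conventions())` still yields a usable prompt with only the conventions injected.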

File lock

Register files assigned to Codex in /tmp/codex-locks.json before launching:

python3 -c "
import json, os
f='/tmp/codex-locks.json'
d=json.load(open(f)) if os.path.exists(f) else {'locks':[]}
d['locks'].append({'task_id':'codex-TIMESTAMP','files':['file1.ts']})
json.dump(d,open(f,'w'),indent=2)"

Check before modifying a file:

cat /tmp/codex-locks.json 2>/dev/null

Release the lock when Codex finishes.
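
The register, check, and release steps could be wrapped as three helpers over the same JSON layout. This is a sketch only: the function names are hypothetical, and the lock is advisory and not atomic, so two processes writing at the same moment can still race.

```python
import json
import os
from typing import Optional


def _load(path: str) -> dict:
    """Read the lock file, or return an empty structure if it does not exist."""
    if not os.path.exists(path):
        return {"locks": []}
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def _save(path: str, data: dict) -> None:
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(data, fh, indent=2)


def acquire(path: str, task_id: str, files: list[str]) -> None:
    """Append a lock entry for task_id, matching the layout used above."""
    data = _load(path)
    data["locks"].append({"task_id": task_id, "files": files})
    _save(path, data)


def locked_by(path: str, filename: str) -> Optional[str]:
    """Return the task_id holding filename, or None if it is free."""
    for lock in _load(path)["locks"]:
        if filename in lock["files"]:
            return lock["task_id"]
    return None


def release(path: str, task_id: str) -> None:
    """Drop every lock entry owned by task_id."""
    data = _load(path)
    data["locks"] = [entry for entry in data["locks"] if entry["task_id"] != task_id]
    _save(path, data)
```

Typical use is `acquire("/tmp/codex-locks.json", task_id, files)` before launching and `release(...)` with the same task ID once the output has been read.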

Pre-launch safety checks

Git safety

Before launching Codex, verify no uncommitted changes exist on target files:

git status --porcelain -- file1.ts file2.ts

If uncommitted changes exist on those files, commit or stash first. Codex could overwrite them.
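
As a sketch of the same check in script form, the helper below parses `git status --porcelain` output, which also reports staged and untracked files. The function names are hypothetical, and the parser assumes simple paths given relative to the repository root (git quotes paths that contain special characters, which this sketch does not handle).

```python
import subprocess


def dirty_targets(porcelain: str, targets: list[str]) -> list[str]:
    """Return the targets that appear in `git status --porcelain` output."""
    dirty = set()
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue
        path = line[3:]              # skip the two status letters and the space
        if " -> " in path:           # renames are reported as "old -> new"
            path = path.split(" -> ", 1)[1]
        dirty.add(path)
    return [t for t in targets if t in dirty]


def check_targets(targets: list[str]) -> list[str]:
    """Run git in the current directory and report uncommitted targets."""
    out = subprocess.run(
        ["git", "status", "--porcelain", "--", *targets],
        capture_output=True, text=True, check=True,
    ).stdout
    return dirty_targets(out, targets)
```

An empty list from `check_targets([...])` means the target files are clean and Codex can be launched on them.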

Instance limit

Max 3 parallel Codex instances. Check before launching:

# Count running instances; yields 0 when none match
ACTIVE=$(pgrep -f "codex exec" 2>/dev/null | wc -l | tr -d ' ')

If ACTIVE is 3 or more, wait for one instance to finish. This protects the subscription and system resources.
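
The limit check can also be expressed as a small guard. This sketch assumes `pgrep` is on the PATH; `count_active` and `can_launch` are hypothetical names.

```python
import subprocess

MAX_INSTANCES = 3  # limit stated above


def count_active(pattern: str = "codex exec") -> int:
    """Count processes whose command line matches pattern, via `pgrep -f`."""
    result = subprocess.run(
        ["pgrep", "-f", pattern], capture_output=True, text=True, check=False
    )
    # pgrep prints one PID per line and nothing at all when there is no match.
    return len(result.stdout.split())


def can_launch(active: int, limit: int = MAX_INSTANCES) -> bool:
    """True when another instance may start without exceeding the limit."""
    return active < limit
```

Calling `can_launch(count_active())` right before each launch keeps the decision in one place.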

Launching Codex

Base command

codex exec --full-auto -C "$PWD" \
  --json -o "/tmp/codex-$(date +%s%N).md" \
  "<prompt with vault context + conventions>"

Always launch with run_in_background: true and timeout: 300000 (5 min).

Timeout

The default is timeout: 300000 (5 min). For complex tasks (large refactoring, big codebase review), raise it to timeout: 600000 (10 min). If Codex does not finish within the timeout, the process is killed; treat that as an error and follow the error handling steps.
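
For scripted use outside the agent's own tool call, the base command and the two timeouts might be assembled as below. This is a sketch: the helper names are hypothetical, the timeouts are expressed in seconds for `subprocess`, and the call runs in the foreground, so it does not reproduce `run_in_background`.

```python
import subprocess
import time
from typing import Optional, Sequence

DEFAULT_TIMEOUT_S = 300   # 5 min, the default above
COMPLEX_TIMEOUT_S = 600   # 10 min, for large refactors or reviews


def output_path(label: str = "") -> str:
    """Mirror the /tmp/codex-<nanoseconds>.md naming used above."""
    prefix = f"codex-{label}-" if label else "codex-"
    return f"/tmp/{prefix}{time.time_ns()}.md"


def build_command(prompt: str, cwd: str, out: str,
                  extra_flags: Sequence[str] = ()) -> list[str]:
    """Assemble the argv for the base command, with optional extra flags."""
    return ["codex", "exec", "--full-auto", "-C", cwd, *extra_flags,
            "--json", "-o", out, prompt]


def run_codex(cmd: Sequence[str], complex_task: bool = False) -> Optional[int]:
    """Run cmd; return its exit code, or None if the timeout killed it."""
    limit = COMPLEX_TIMEOUT_S if complex_task else DEFAULT_TIMEOUT_S
    try:
        return subprocess.run(list(cmd), timeout=limit, check=False).returncode
    except subprocess.TimeoutExpired:
        return None  # treat as an error and follow the error handling steps
```

Passing the command as a list avoids shell quoting problems when the prompt contains quotes or newlines.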

Flag reference

Flag                     When
-m gpt-5                 Simple tasks (default is gpt-5.4)
--search                 Prompt requires web info
--ephemeral              Throwaway task, no session saved
-i <file>                Screenshots, diagrams, mockups
--output-schema <file>   Force structured JSON output
--add-dir <dir>          Extra writable directories (monorepo)
--skip-git-repo-check    Directory without git repo

Prompt templates

Use these for common tasks. Every prompt should include project conventions.

Tests:

CONVENTIONS: $CONVENTIONS
TASK: Write unit tests for {file}. Create {test_file} using {framework}.
Cover: happy path, edge cases, errors. Do NOT modify other files.

Review:

codex exec --full-auto -C "$PWD" --json -o "/tmp/codex-review-$(date +%s%N).md" review

Docs:

CONVENTIONS: $CONVENTIONS
TASK: Generate documentation for {files}. Use {format: JSDoc/docstring/README}.
Do NOT modify logic, only comments/docs.

Refactoring:

CONVENTIONS: $CONVENTIONS
TASK: Refactor {file} — {goal: extract function/rename/simplify}.
CONSTRAINTS: keep public API unchanged, don't break imports.
FILES TO MODIFY: {exact list}
FILES TO NOT TOUCH: {exact list}

Collaborative debugging:

BUG: {description}
ERROR: {full error message}
FILES INVOLVED: {list}
WHAT I ALREADY TRIED:
- {attempt 1}
- {attempt 2}
Analyze from a different perspective and suggest a fix.
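
If the debugging prompt is built programmatically, the template above could be rendered from structured fields. The function name and the "(nothing yet)" placeholder are assumptions for illustration.

```python
def debug_prompt(bug: str, error: str, files: list[str], attempts: list[str]) -> str:
    """Render the collaborative-debugging template from structured fields."""
    lines = [
        f"BUG: {bug}",
        f"ERROR: {error}",
        f"FILES INVOLVED: {', '.join(files)}",
        "WHAT I ALREADY TRIED:",
    ]
    # One bullet per attempt; the placeholder for an empty list is an assumption.
    lines += [f"- {a}" for a in attempts] or ["- (nothing yet)"]
    lines.append("Analyze from a different perspective and suggest a fix.")
    return "\n".join(lines)
```

Listing the failed attempts explicitly is what keeps Codex from repeating them.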

Web search:

codex exec --full-auto --search -C "$PWD" \
  --json -o "/tmp/codex-research-$(date +%s%N).md" \
  "Search {topic}: official docs, best practices, examples."

Image input

codex exec --full-auto -C "$PWD" \
  -i /tmp/screenshot.png \
  --json -o "/tmp/codex-visual-$(date +%s%N).md" \
  "Analyze this screenshot and suggest..."

Output schema

cat > /tmp/schema.json << 'EOF'
{"type":"object","properties":{"issues":{"type":"array","items":{"type":"object",
"properties":{"file":{"type":"string"},"severity":{"enum":["critical","warning","info"]},
"description":{"type":"string"}}}},"summary":{"type":"string"}}}
EOF
codex exec --full-auto -C "$PWD" --output-schema /tmp/schema.json \
  --json -o "/tmp/codex-structured-$(date +%s%N).md" "Review src/api/..."
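
Once a structured result is available, it can be bucketed by severity. The sketch below assumes the file written by -o contains the JSON document itself, which this document does not confirm; `group_issues` is a hypothetical name, and treating a missing severity as an error is a design choice, since the schema above does not mark any field as required.

```python
import json
from collections import defaultdict

SEVERITIES = ("critical", "warning", "info")  # enum from the schema above


def group_issues(raw: str) -> dict[str, list[dict]]:
    """Parse schema-shaped JSON and bucket the issues by severity."""
    data = json.loads(raw)
    grouped = defaultdict(list)
    for issue in data.get("issues", []):
        severity = issue.get("severity")
        if severity not in SEVERITIES:
            raise ValueError(f"unexpected severity: {severity!r}")
        grouped[severity].append(issue)
    # Always return all three keys so callers can index without checking.
    return {s: grouped[s] for s in SEVERITIES}
```

Reading `group_issues(text)["critical"]` first gives a quick signal on whether the review needs immediate attention.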

Add-dir (monorepo)

codex exec --full-auto -C "$PWD" \
  --add-dir ../shared-lib --add-dir ../common-types \
  --json -o "/tmp/codex-mono-$(date +%s%N).md" \
  "Update types in ../common-types/index.ts..."

Multiple instances

You can launch multiple Codex instances in parallel, up to the limit of 3. Each instance needs its own -o file, its own set of files, and its own lock.
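
Before launching several instances, the "own files" rule can be checked mechanically. A minimal sketch, with a hypothetical `find_overlaps` helper that takes a mapping from task ID to assigned files:

```python
from itertools import combinations


def find_overlaps(assignments: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Map each pair of task IDs to the files that both were assigned."""
    overlaps = {}
    # Sort by task ID so the pair keys come out in a stable order.
    for (a, files_a), (b, files_b) in combinations(sorted(assignments.items()), 2):
        shared = set(files_a) & set(files_b)
        if shared:
            overlaps[(a, b)] = shared
    return overlaps
```

An empty result means no two instances share a file; anything else should be resolved before launch, for example by running the conflicting tasks one after the other.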

Session resume

If a Codex task was interrupted or needs follow-up:

# Resume most recent session
codex exec resume --last --full-auto \
  --json -o "/tmp/codex-resume-$(date +%s%N).md" \
  "Continue previous work. Also: {new context or corrections}"

# Resume specific session by ID
codex exec resume <session-id> --full-auto \
  --json -o "/tmp/codex-resume-$(date +%s%N).md" \
  "The test you wrote fails with: {error}. Fix it."

Use resume when:

  • Codex did good work but needs adjustment
  • Task was too long and got interrupted
  • You want to give feedback and have it correct

Chained tasks — pipeline

When a task depends on another's output, create a pipeline:

Task A (Codex): "Write src/validators/email.ts"
    ↓ (A finishes, read output)
Task B (Codex): "Write tests for src/validators/email.ts.
                 Here's the code: {file content from A}"
    ↓ (B finishes, read output)
Task C (Codex): review changes

Launch each step with run_in_background, wait for its notification, then launch the next step. Never launch dependent tasks in parallel.
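
The sequencing rule can be sketched as a loop that feeds each step's output into the next prompt and stops at the first failure. `run_pipeline` and its two callback parameters are hypothetical; in practice `run_step` would wrap a Codex launch and return the content of its -o file.

```python
from typing import Callable, Optional


def run_pipeline(
    steps: list[Callable[[str], str]],
    run_step: Callable[[str], Optional[str]],
) -> list[str]:
    """Run steps strictly in order.

    Each entry in steps builds a prompt from the previous step's output.
    run_step executes a prompt and returns its output, or None on failure.
    """
    outputs = []
    previous = ""
    for make_prompt in steps:
        result = run_step(make_prompt(previous))
        if result is None:   # failed or timed out: do not start dependents
            break
        outputs.append(result)
        previous = result
    return outputs
```

If the returned list is shorter than `steps`, its length is the index of the step that failed.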

Cross-review

After both you and Codex finish work, launch a cross-review:

codex exec --full-auto -C "$PWD" \
  --json -o "/tmp/codex-crossreview-$(date +%s%N).md" \
  "Review the following files I (Claude Code) wrote.
   Look for bugs, security issues, missing edge cases, and suggest improvements.
   FILES: {list of files you wrote}"

Meanwhile, review Codex's code with git diff. This cross-check improves overall quality.

Token tracking

After each Codex task, the token count appears in the output as tokens used: N. To extract it:

grep -o 'tokens used[^0-9]*[0-9,]*' /tmp/codex-*.md 2>/dev/null

Report consumption to the user if significant or if asked.
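
To turn those matches into a number, the same pattern can be applied in Python. This sketch assumes the `tokens used: N` format described above, with optional thousands separators; the function names are hypothetical.

```python
import glob
import re

# Same idea as the grep pattern above, with the digits captured.
TOKENS_RE = re.compile(r"tokens used[^0-9]*([0-9][0-9,]*)")


def tokens_in(text: str) -> list[int]:
    """Return every token count found in text, with commas stripped."""
    return [int(m.replace(",", "")) for m in TOKENS_RE.findall(text)]


def total_tokens(pattern: str = "/tmp/codex-*.md") -> int:
    """Sum the token counts across all matching output files."""
    total = 0
    for path in glob.glob(pattern):
        with open(path, encoding="utf-8", errors="replace") as fh:
            total += sum(tokens_in(fh.read()))
    return total
```

`total_tokens()` gives a single figure to report when the user asks about consumption.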

Checking results

When a Codex task finishes:

  1. Read output: cat /tmp/codex-*.md
  2. Auto-validate — run lint/test only on files Codex touched
  3. Release lock from /tmp/codex-locks.json
  4. Save to vault if Codex produced useful knowledge
  5. Report to user — what Codex did, quality assessment, fixes applied

Error handling

  1. Analyze — Read -o output and JSONL
  2. Resume — If partial, use codex exec resume --last with corrections
  3. Retry — Reformulate prompt with more context. Max 2 retries
  4. Switch model — gpt-5 ↔ gpt-5.4
  5. Fallback — After 2 retries, do it yourself
  6. Save lesson — Instructive errors → vault as lesson-*
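
One possible reading of the list above as a decision helper is sketched below. The ordering between resume, retry, and model switch is an interpretation, not something the steps fix precisely, and the function names are hypothetical.

```python
MAX_RETRIES = 2                  # "Max 2 retries"
MODELS = ("gpt-5.4", "gpt-5")    # the two models named above


def next_action(retries_done: int, partial_output: bool) -> str:
    """Pick the next recovery step after a failed Codex task."""
    if retries_done >= MAX_RETRIES:
        return "fallback"        # step 5: do the task yourself
    if partial_output:
        return "resume"          # step 2: codex exec resume --last
    return "retry"               # step 3: reformulate the prompt


def other_model(current: str) -> str:
    """Step 4: swap between the two models."""
    if current not in MODELS:
        raise ValueError(f"unknown model: {current!r}")
    return MODELS[1 - MODELS.index(current)]
```

Keeping the retry budget in one constant makes it harder to loop on a task that Codex cannot complete.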

Complete flow

User: "Add email validation and tests"

You (Claude Code):                    Codex (background):
├─ Query vault for context
├─ Git safety check
├─ Lock email.test.ts
├─ Inject vault + CLAUDE.md
├─ Launch Codex ────────────────→  Task: write tests
├─ Implement validation               ├─ Writes tests
│  in src/validators/email.ts         │  in email.test.ts
├─ Continue other work                ├─ Finishes
├─ Notification ←─────────────────────┘
├─ Read output + validate (lint/test)
├─ Release lock
├─ Cross-review: Codex reviews ──→  Reviews your code
│  + you review Codex's code        ├─ Produces feedback
├─ Feedback ←─────────────────────┘
├─ Adjust if needed
├─ Save discoveries to vault
├─ Track tokens used
└─ Report to user

Reference

  • Global config: ~/.codex/config.toml
  • Auth: OpenAI subscription via codex login (no API key needed)
  • --full-auto = autonomous with workspace-write sandbox
  • -o = summary, --json = full JSONL output
  • --ephemeral = no session saved to disk
  • -i = attach images, --output-schema = structured JSON
  • --add-dir = extra writable directories
  • codex exec resume = continue previous session
  • codex exec review = native code review