Goal
Make CodeWhale's harness strategy explicit per provider/model route instead of assuming every model wants the same amount of up-front system context.
This came out of v0.8.53 testing: DeepSeek V4 and Xiaomi MiMo v2.5 appear likely to benefit from a cache-heavy/prefix-stable starting prompt, while many other routes may do better with a lean root context, immediate exploration, subagents, and compact handoff packets.
This is related to #2672, but this issue is specifically about runtime harness posture: how much context to load up front, when to encourage subagents, when to compact, and when to start fresh from a handoff.
Current Shape
- Plan mode already has a read-only design-first prompt in crates/tui/src/prompts/modes/plan.md.
- Plan mode currently resolves via update_plan and opens an approval prompt from crates/tui/src/tui/ui.rs / crates/tui/src/tui/plan_prompt.rs.
- Tool exposure is already mode-sensitive in crates/tui/src/core/engine/turn_loop.rs.
- Provider/model routing work is happening around the config/provider registry, but there is not yet a first-class policy object that says how a given route should use context, subagents, and handoffs.
Proposal
Add a central HarnessPosture policy chosen from provider + model + user override.
Possible initial values:
- PrefixCached: stable, rich system/context prefix; prefer cache byte stability; minimal churn. Good default candidates: DeepSeek V4, Xiaomi MiMo v2.5.
- LeanRootExplore: minimal starting prompt; strongly prefer quick repo orientation, subagent exploration, and on-demand docs/skills.
- PlanHandoffReset: plan in one context, then launch implementation from a compact approved handoff packet in a fresh context.
- SmallContextLocal: aggressive summarization, narrow tool surface, small prompts, cheap parallel probes.
The posture should affect:
- System prompt assembly and context injection volume.
- Whether Plan mode recommends handoff reset by default.
- Subagent encouragement and default delegation copy.
- Context compaction thresholds.
- Which docs/skills are eagerly injected vs available on demand.
- Telemetry labels so we can evaluate posture choice rather than argue from vibes.
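As a rough illustration of how one posture value could drive the knobs listed above, here is a minimal sketch. The four variant names come from this proposal; the PostureSettings struct, its field names, and every numeric value are hypothetical placeholders, not existing CodeWhale code or tuned defaults.

```rust
/// The four postures proposed in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum HarnessPosture {
    PrefixCached,
    LeanRootExplore,
    PlanHandoffReset,
    SmallContextLocal,
}

/// Hypothetical bundle of knobs derived from a posture.
/// Field names and values are assumptions for illustration only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PostureSettings {
    /// Inject docs/skills eagerly into the prefix instead of on demand.
    pub eager_docs: bool,
    /// Plan mode suggests a fresh context from a handoff packet by default.
    pub plan_recommends_handoff_reset: bool,
    /// Delegation copy nudges the model toward subagents.
    pub encourage_subagents: bool,
    /// Compact once this percentage of the context window is used.
    pub compact_at_percent: u8,
    /// Stable label attached to telemetry for later outcome analysis.
    pub telemetry_label: &'static str,
}

impl HarnessPosture {
    /// Single place where posture turns into concrete behavior.
    pub fn settings(self) -> PostureSettings {
        match self {
            HarnessPosture::PrefixCached => PostureSettings {
                eager_docs: true,
                plan_recommends_handoff_reset: false,
                encourage_subagents: false,
                compact_at_percent: 85, // placeholder value
                telemetry_label: "prefix_cached",
            },
            HarnessPosture::LeanRootExplore => PostureSettings {
                eager_docs: false,
                plan_recommends_handoff_reset: false,
                encourage_subagents: true,
                compact_at_percent: 70, // placeholder value
                telemetry_label: "lean_root_explore",
            },
            HarnessPosture::PlanHandoffReset => PostureSettings {
                eager_docs: false,
                plan_recommends_handoff_reset: true,
                encourage_subagents: true,
                compact_at_percent: 70, // placeholder value
                telemetry_label: "plan_handoff_reset",
            },
            HarnessPosture::SmallContextLocal => PostureSettings {
                eager_docs: false,
                plan_recommends_handoff_reset: false,
                encourage_subagents: true,
                compact_at_percent: 50, // placeholder value
                telemetry_label: "small_context_local",
            },
        }
    }
}
```

Keeping the mapping in one exhaustive match means the compiler flags every call site that needs attention when a new posture is added.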
Acceptance Criteria
- A single registry/policy layer maps provider+model route to default HarnessPosture; no scattered provider-name conditionals.
- User config can override posture without changing provider/model identity.
- Tests assert at least:
  - DeepSeek V4 and MiMo v2.5 select a cache-heavy posture by default.
  - Generic OpenAI-compatible/OpenRouter/local routes can select a lean or handoff posture.
  - Prompt/tool catalog byte stability remains protected for prefix-cached routes.
- Docs explain provider vs model vs harness posture as separate concepts.
- Telemetry/logging records posture for later outcome analysis.
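One possible shape for the registry and override criteria is sketched below. PostureRegistry, default_registry, the provider/model id strings ("deepseek", "deepseek-v4", "xiaomi", "mimo-v2.5"), and the choice of LeanRootExplore as the fallback are all assumptions made for illustration; the real route identifiers would come from the existing provider registry.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum HarnessPosture {
    PrefixCached,
    LeanRootExplore,
    PlanHandoffReset,
    SmallContextLocal,
}

/// Hypothetical central table of per-route default postures.
pub struct PostureRegistry {
    defaults: HashMap<(String, String), HarnessPosture>,
    fallback: HarnessPosture,
}

impl PostureRegistry {
    pub fn new(fallback: HarnessPosture) -> Self {
        Self { defaults: HashMap::new(), fallback }
    }

    /// Record the default posture for one provider+model route.
    pub fn register(&mut self, provider: &str, model: &str, posture: HarnessPosture) {
        self.defaults
            .insert((provider.to_string(), model.to_string()), posture);
    }

    /// Precedence: user override, then route default, then fallback.
    /// The override changes posture only; provider/model identity is untouched.
    pub fn resolve(
        &self,
        provider: &str,
        model: &str,
        user_override: Option<HarnessPosture>,
    ) -> HarnessPosture {
        if let Some(posture) = user_override {
            return posture;
        }
        self.defaults
            .get(&(provider.to_string(), model.to_string()))
            .copied()
            .unwrap_or(self.fallback)
    }
}

/// Example defaults; the id strings are placeholders, not real route ids.
pub fn default_registry() -> PostureRegistry {
    let mut registry = PostureRegistry::new(HarnessPosture::LeanRootExplore);
    registry.register("deepseek", "deepseek-v4", HarnessPosture::PrefixCached);
    registry.register("xiaomi", "mimo-v2.5", HarnessPosture::PrefixCached);
    registry
}
```

With this shape, the first two test bullets reduce to calling resolve with and without an override and comparing the result, and no call site needs to inspect a provider name directly.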
Non-goals
- Do not hardcode benchmark-specific behavior.
- Do not claim one posture is globally better until we have eval data.
- Do not remove the existing global identity/constitution preamble as part of this work.