maestrode

Cuts your Claude usage by ~80%. Smart model thinks, cheap model writes. Measured on real tasks, no quality loss on spec-driven work.

your brain	Claude tokens per task	Claude usage cut
Claude Opus 4.7	~30k → ~4k	~87%
Claude Sonnet 4.7	~30k → ~5k	~83%

If you're on a subscription tier, this means more work fits inside the same quota (or you drop a tier and stop paying for headroom you don't need). If you're on API billing, the same percentage shows up as a smaller invoice.

Claude (or any expensive frontier model) plans, reads, decides, reviews, iterates. DeepSeek V4 Flash (or any OpenAI-compatible cheap model) writes the actual file contents. A 300-line bash shim handles delegation, structured failure feedback, multi-turn sessions, multi-file output, and KV-cache reuse.

Why I built this

I was on Claude Max 20x but never hit the ceiling. The price felt steep for headroom I wasn't using.

Claude does two jobs per turn: thinking and writing. Thinking is what frontier models are uniquely good at. Writing is interchangeable. So I wired Claude as the brain and DeepSeek V4 Flash as the muscle, benchmarked it 10 ways, and confirmed quality holds on spec-driven work. Where flash falls down, the brief was sloppy, not the model. Back on the smaller tier now, the muscle calls are basically free.

This repo is the wrapper + skill that made it work, plus the measurements.

Quickstart

Install on macOS / Linux / WSL:

curl -fsSL https://raw.githubusercontent.com/doedja/maestrode/main/install.sh | bash

Install on Windows (PowerShell, no admin needed):

iwr -useb https://raw.githubusercontent.com/doedja/maestrode/main/install.ps1 | iex

Windows prereqs: Git for Windows (winget install Git.Git, provides bash) and Python 3 (winget install Python.Python.3.12). The installer drops a maestrode.cmd shim so the command runs from cmd / PowerShell / Windows Terminal without needing to open Git Bash.

Edit the env file with your API key:

# macOS / Linux / WSL / Git Bash
nvim ~/.config/maestrode/env
# MAESTRODE_API_KEY=sk-...
# MAESTRODE_ENDPOINT=https://api.deepseek.com/v1/chat/completions

# Windows PowerShell
notepad $env:USERPROFILE\.config\maestrode\env

Uninstall (removes binary + config + sessions):

# macOS / Linux / WSL
curl -fsSL https://raw.githubusercontent.com/doedja/maestrode/main/install.sh | bash -s -- --uninstall
# add --keep-config to keep ~/.config/maestrode

# Windows
& ([scriptblock]::Create((iwr -useb https://raw.githubusercontent.com/doedja/maestrode/main/install.ps1).Content)) -Uninstall
# add -KeepConfig to keep $env:USERPROFILE\.config\maestrode

If you use Claude Code

Drop the skill in:

mkdir -p ~/.claude/skills/maestrode
cp skill/maestrode.md ~/.claude/skills/maestrode/SKILL.md

Then in any Claude Code session, say maestrode on (or /maestrode). The skill activates and Claude starts delegating code-writing to the cheap muscle for the rest of the session.

If you use the CLI directly

maestrode "write a function that validates an email"
./examples/quickstart.sh    # runnable end-to-end demo, ~30 seconds

Where to get the cheap model

If you don't already have a key, OpenCode Go is the cheapest path I've found. $10/mo gets you ~$60 of DeepSeek V4 Flash + Pro + a bunch of other open-source models (GLM, Kimi, Qwen, MiniMax). That's what I used for all 10 benchmarks. (That's my referral link, fair warning. The non-ref URL is opencode.ai/go.)

Or any OpenAI-compatible endpoint works:

provider	endpoint
DeepSeek direct	`https://api.deepseek.com/v1/chat/completions`
OpenRouter	`https://openrouter.ai/api/v1/chat/completions`
OpenCode Zen Go	`https://opencode.ai/zen/go/v1/chat/completions`
ollama (local)	`http://localhost:11434/v1/chat/completions`

Set MAESTRODE_API_KEY and MAESTRODE_ENDPOINT in ~/.config/maestrode/env.

When to use what

task shape	use
clear spec, multi-file, well-defined	maestrode (default)
vague brief, ambiguous diagnosis	smart model directly, skip the cheap muscle
cross-cutting refactor with API constraints	maestrode but review the muscle output

Pairs well with

rtk (brew install rtk): rewrites your ls, git, gh, tree, read commands into token-compact output. Different layer than maestrode. Stacks cleanly. Not affiliated.

Big jobs

For multi-module builds, split the brief and dispatch in parallel:

maestrode --session p-api --files build/api "<api brief>" &
maestrode --session p-db  --files build/db  "<db brief>"  &
maestrode --session p-web --files build/web "<web brief>" &
wait

Each call has its own session (independent cache) and its own output dir. Three parallel ~30s calls finish in ~30s wall, not ~90s, and each stays under max_tokens. Defaults: MAESTRODE_MAX_TOKENS=65536, MAESTRODE_CURL_TIMEOUT=600. Raise both via env if your endpoint supports more.

Feedback the shim gives you

finish=length in the stderr stat line plus response cut by max_tokens warning when the model hit the cap. Raise --max-tokens (or split the brief).
N <<<FILE:>>> block(s) opened but not closed when --files parsing finds a truncated response. Closed blocks still get written; reissue the missing ones.
Exit 5 (NEEDS_SMART) when the cheap muscle decides the brief is too thin or the task exceeds its capability. Session + file writes are skipped. The brain takes over.

Teach the muscle the escalation contract via --system:

If the brief is ambiguous, missing key details, or beyond your capability,
emit `<<<NEEDS_SMART: <one-line reason>>>>` as the very first line and stop.

Caveats

Cheap model hallucinates on vague briefs. Use the smart model directly for ambiguous diagnosis, or paste the actual assertion text when iterating.
No streaming yet. Responses come whole.

License

MIT.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
examples		examples
skill		skill
src		src
tests		tests
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
install.ps1		install.ps1
install.sh		install.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

maestrode

Why I built this

Quickstart

If you use Claude Code

If you use the CLI directly

Where to get the cheap model

When to use what

Pairs well with

Big jobs

Feedback the shim gives you

Caveats

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

maestrode

Why I built this

Quickstart

If you use Claude Code

If you use the CLI directly

Where to get the cheap model

When to use what

Pairs well with

Big jobs

Feedback the shim gives you

Caveats

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages