## Summary
Design and execute a comprehensive A/B test to measure the impact of CacheBro (MCP-based file caching) on AI coding agent performance across three major CLI tools: OpenCode, Claude Code, and Codex CLI.
## Background
CacheBro is an MCP server that caches file reads for AI coding agents, returning diffs or "unchanged" confirmations instead of full file content. The project claims ~26% token savings.
**CacheBro Architecture:**
- Language: Rust (v0.2.3)
- Protocol: MCP (Model Context Protocol)
- Database: SQLite with SHA-256 hash-based change detection
- Integration: Drop-in MCP server for Claude Code, Cursor, OpenCode
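The diff-or-"unchanged" flow described above can be sketched as follows. This is a hypothetical illustration in Python of the hash-based change detection, not CacheBro's actual Rust implementation; the table schema and function name are assumptions:

```python
import difflib
import hashlib
import sqlite3
from pathlib import Path

def read_with_cache(db: sqlite3.Connection, path: str) -> str:
    """Return full content on first read, 'unchanged' on a hash match,
    or a unified diff when the file changed since the last read.

    Assumes a table: cache(path TEXT PRIMARY KEY, hash TEXT, content TEXT).
    """
    content = Path(path).read_text()
    digest = hashlib.sha256(content.encode()).hexdigest()
    row = db.execute(
        "SELECT hash, content FROM cache WHERE path = ?", (path,)
    ).fetchone()
    if row is None:
        # First read: cache it and return the full content.
        db.execute("INSERT INTO cache VALUES (?, ?, ?)", (path, digest, content))
        return content
    old_hash, old_content = row
    if old_hash == digest:
        # SHA-256 match: the agent gets a cheap confirmation, not the file.
        return "unchanged"
    # Changed: update the cache and return only the delta.
    db.execute("UPDATE cache SET hash = ?, content = ? WHERE path = ?",
               (digest, content, path))
    return "".join(difflib.unified_diff(
        old_content.splitlines(keepends=True),
        content.splitlines(keepends=True),
        fromfile="cached", tofile="current"))
```

The token savings come from the second and third branches: repeated reads of unchanged files cost a few tokens instead of the full file.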
## Objectives
- Validate the claimed 26% token savings across different CLI tools
- Measure cost reduction impact
- Assess cache hit rates for different coding task types
- Ensure quality preservation (no degradation in task completion)
- Identify tool-specific benefits from file caching
## Test Design
### Issue Selection
- 30 real issues from Terraphim repositories
- 10 bug fixes, 10 feature implementations, 10 refactoring tasks
- Sources: terraphim-ai, terraphim-skills, gitea, openclaw-workspace
### Experimental Conditions
- Control: Agents without CacheBro (baseline file reading)
- Treatment: Agents with CacheBro MCP server enabled
- Tools: OpenCode, Claude Code, Codex CLI
- Total Runs: 30 issues × 3 tools × 2 conditions = 180 runs
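The full run matrix can be enumerated mechanically; the issue IDs below are placeholders for the 30 selected issues:

```python
from itertools import product

TOOLS = ["OpenCode", "Claude Code", "Codex CLI"]
CONDITIONS = ["control", "treatment"]  # without / with CacheBro
ISSUES = range(1, 31)                  # 30 placeholder issue IDs

# Every (issue, tool, condition) combination is one run: 30 x 3 x 2 = 180.
runs = [
    {"issue": i, "tool": t, "condition": c}
    for i, t, c in product(ISSUES, TOOLS, CONDITIONS)
]
```

Generating the matrix up front also gives a natural place to randomize run order, so that condition effects are not confounded with time-of-day or API load.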
Key Metrics
| Metric |
Target |
Measurement |
| Token Savings |
26% |
Input + Output tokens per session |
| Cost Reduction |
26% |
API cost per issue |
| Cache Hit Rate |
60% |
Cache hits / Total file reads |
| Quality Preservation |
100% |
Task completion rate |
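Given per-session logs, the first and third metrics reduce to simple ratios. The helpers below are a sketch; the argument names are assumptions about the logging schema, not CacheBro's actual output format:

```python
def token_savings(control_tokens: int, treatment_tokens: int) -> float:
    """Percent token reduction for a matched issue/tool pair
    (control = without CacheBro, treatment = with CacheBro)."""
    return 100.0 * (control_tokens - treatment_tokens) / control_tokens

def cache_hit_rate(hits: int, total_reads: int) -> float:
    """Cache hits / total file reads, as a percentage."""
    return 100.0 * hits / total_reads if total_reads else 0.0
```

Cost reduction is computed the same way as token savings, substituting per-issue API cost for token counts.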
## Hypotheses
- H1: CacheBro reduces token usage by ≥20% (approaching the claimed 26%)
- H2: CacheBro reduces API costs by ≥20%
- H3: CacheBro maintains task completion rate within 5% of baseline
- H4: Tool benefit varies based on file access patterns
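H1 can be evaluated by pairing each treatment run with its control run (same issue, same tool) and averaging the percent savings. A stdlib-only sketch is below; the real analysis in the Jupyter notebook would add a paired statistical test (e.g. Wilcoxon signed-rank) rather than just a point estimate:

```python
from statistics import mean

def h1_holds(paired_tokens: list[tuple[int, int]], threshold: float = 20.0) -> bool:
    """paired_tokens: (control, treatment) token counts for matched runs.

    H1: mean percent savings across pairs >= threshold (20%).
    """
    savings = [100.0 * (c - t) / c for c, t in paired_tokens]
    return mean(savings) >= threshold
```

H2 is the same check applied to per-issue costs, and H3 compares completion rates between conditions with a 5-percentage-point tolerance.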
## Implementation
### Phase 1: Setup (2 days)
### Phase 2: Pilot (2 days)
### Phase 3: Full Run (5 days)
### Phase 4: Analysis (2 days)
### Phase 5: Reporting (2 days)
## Deliverables
- Raw data from 180 test runs (JSON)
- CacheBro SQLite databases from each session
- Statistical analysis notebook (Jupyter)
- Visualization dashboard
- Final report with recommendations
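One possible shape for each raw-data record, with one JSON line per run; the exact fields are an assumption to be finalized during Phase 1 setup:

```python
import json
from dataclasses import asdict, dataclass

@dataclass
class RunRecord:
    issue_id: int
    tool: str        # "OpenCode" | "Claude Code" | "Codex CLI"
    condition: str   # "control" | "treatment"
    input_tokens: int
    output_tokens: int
    cost_usd: float
    cache_hits: int
    file_reads: int
    completed: bool

# Example record serialized as one JSON line (values are illustrative).
record = RunRecord(1, "OpenCode", "treatment", 52000, 8100, 0.41, 12, 20, True)
line = json.dumps(asdict(record))
```

Keeping runs as flat JSON lines makes the 180-run dataset trivial to load into the analysis notebook.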
## Risks & Mitigations

| Risk | Likelihood | Mitigation |
|---|---|---|
| MCP integration issues | Medium | Test each tool's MCP support early |
| Low cache hit rate | Low | Select file-heavy issues |
| Tool crashes | Medium | Implement retry logic |
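The tool-crash mitigation can be a small retry wrapper around each run in the test harness. This is hypothetical harness code, not part of any of the three CLIs:

```python
import time

def run_with_retry(run_once, max_attempts: int = 3, delay_s: float = 5.0):
    """Call run_once(); on exception, wait delay_s and retry.

    Re-raises the last exception after max_attempts failures so the
    run is recorded as failed rather than silently skipped.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_once()
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(delay_s)
```

Retried runs should be flagged in the raw data so crash frequency per tool can itself be reported.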
## Timeline
Total: 13 days
- Setup: 2 days
- Pilot: 2 days
- Full Run: 5 days
- Analysis: 2 days
- Reporting: 2 days
/cc @AlexMikhalev
/label ~experiment ~performance ~cachebro
/milestone %Q1-2026