Commit 5e47945

Add hierarchical reasoning optimizations inspired by HRM paper
Inspired by the Hierarchical Reasoning Model (arXiv:2506.21734v3), this implements workflow-level adaptations of HRM principles: not the actual neural network algorithms, but the conceptual spirit.

Implements 4 key optimizations:
- INNOVATE convergence criteria (exploration assessment)
- PLAN quality gate (self-validation before save)
- REVIEW phase routing (hierarchical error correction)
- Memory learning algorithm (pattern-based recommendations)

Translates HRM principles into workflow design:
- Hierarchical convergence → phase-level convergence criteria
- Deep supervision → quality gates at every phase
- Adaptive computation → pattern-based learning from history
- Recurrent feedback → phase routing for iterative refinement

Adds 223 lines across 6 files, all backward compatible. No new files, all changes within existing agent/command structure.

License: CC BY 4.0
Paper reference: arXiv:2506.21734v3 [cs.AI] 04 Aug 2025
1 parent bc3ea2a commit 5e47945

6 files changed: +217 -5 lines changed

.claude/agents/plan-execute.md

Lines changed: 41 additions & 0 deletions
@@ -44,6 +44,21 @@ git log -n 10 --oneline --grep="WIP\|TODO\|FIXME"
 - Write to repository root `.claude/memory-bank/*/plans/` ONLY (use `git rev-parse --show-toplevel` to find root)
 - Identify risks and mitigations
 
+**Plan Quality Gate (Self-Validation)**:
+
+Before saving plan, verify:
+- **Completeness**: All research findings addressed (Y/N)
+- **Testability**: Success criteria measurable (Y/N)
+- **Risk Coverage**: Potential issues identified (Y/N)
+- **Step Clarity**: Each step actionable without ambiguity (Y/N)
+- **Plan Confidence**: 1-10 score on implementation readiness
+
+**Quality Rule**:
+- If any item = N OR confidence < 8: Refine plan, don't save yet
+- If all items = Y AND confidence >= 8: Save and mark ready for approval
+
+Document in plan header: `[PLAN QUALITY: completeness=Y, testability=Y, risks=Y, clarity=Y, confidence=X/10]`
+
 **FORBIDDEN Actions**:
 - Writing actual code to project files
 - Executing implementation commands
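
The Quality Rule in the hunk above is a simple conjunction, which can be sketched in Python for readers who want to check their reading of it. This is only an illustration: `PlanAssessment`, `plan_ready`, and `plan_header` are hypothetical names, and the commit itself adds prompt text, not code.

```python
from dataclasses import dataclass


@dataclass
class PlanAssessment:
    """Hypothetical container for the five gate items listed in the diff."""
    completeness: bool
    testability: bool
    risks: bool
    clarity: bool
    confidence: int  # 1-10 score on implementation readiness


def _yn(ok: bool) -> str:
    return "Y" if ok else "N"


def plan_ready(a: PlanAssessment, min_confidence: int = 8) -> bool:
    """True only if every Y/N item is Y and confidence meets the threshold."""
    checks = (a.completeness, a.testability, a.risks, a.clarity)
    return all(checks) and a.confidence >= min_confidence


def plan_header(a: PlanAssessment) -> str:
    """Render the header tag in the format quoted in the diff."""
    return (
        f"[PLAN QUALITY: completeness={_yn(a.completeness)}, "
        f"testability={_yn(a.testability)}, risks={_yn(a.risks)}, "
        f"clarity={_yn(a.clarity)}, confidence={a.confidence}/10]"
    )
```

The two branches of the rule are exact complements, so a single boolean is enough: a `False` result means "refine plan, don't save yet".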
@@ -77,6 +92,32 @@ git log -n 5 --oneline --since=[plan-creation-date]
 - Execute build and test commands
 - Follow plan steps sequentially
 
+**Substep Validation Loop (Deep Supervision)**:
+
+After EACH implementation substep:
+
+1. **Immediate Validation**:
+- [ ] Matches plan specification exactly
+- [ ] No unplanned modifications
+- [ ] Tests pass for this substep
+- [ ] No regressions introduced
+
+2. **Confidence Assessment**: Rate substep quality 1-10
+
+3. **Decision Logic**:
+- If validation fails OR confidence < 7: Document deviation, halt for guidance
+- If validation passes AND confidence >= 7: Mark complete, continue
+
+4. **Output Format**:
+```
+[SUBSTEP X.Y VALIDATION]
+Status: [PASS/FAIL]
+Confidence: X/10
+Issues: [None/Description]
+```
+
+This implements continuous validation rather than waiting for REVIEW phase.
+
 **FORBIDDEN Actions**:
 - Deviating from approved plan
 - Adding improvements not specified
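
As a rough illustration of the Decision Logic in the hunk above, the following Python sketch combines the four checklist items with the confidence score and renders the report block. The check names and `validate_substep` are hypothetical. Reporting `FAIL` when the checks pass but confidence is below 7 is an assumption, since the diff does not say which status applies in that case.

```python
# Hypothetical keys for the four checklist items in the diff.
CHECKS = ("matches_plan", "no_unplanned_changes", "tests_pass", "no_regressions")


def validate_substep(
    substep_id: str,
    checks: dict[str, bool],
    confidence: int,
    min_confidence: int = 7,
) -> tuple[bool, str]:
    """Return (proceed, report) for one implementation substep."""
    failed = [name for name in CHECKS if not checks.get(name, False)]
    proceed = not failed and confidence >= min_confidence
    if failed:
        issues = "failed checks: " + ", ".join(failed)
    elif not proceed:
        issues = f"confidence below {min_confidence}/10"
    else:
        issues = "None"
    report = "\n".join([
        f"[SUBSTEP {substep_id} VALIDATION]",
        f"Status: {'PASS' if proceed else 'FAIL'}",
        f"Confidence: {confidence}/10",
        f"Issues: {issues}",
    ])
    return proceed, report
```

A caller would halt for guidance whenever `proceed` is `False`, and otherwise mark the substep complete and continue.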

.claude/agents/research-innovate.md

Lines changed: 28 additions & 0 deletions
@@ -44,6 +44,20 @@ git log --oneline main..HEAD
 - Ask clarifying questions
 - Gather context and dependencies
 
+**Convergence Criteria (Self-Assessment)**:
+
+Before exiting RESEARCH sub-mode, evaluate:
+- **Understanding Confidence**: 1-10 score on codebase comprehension
+- **Dependencies Mapped**: All critical dependencies identified (Y/N)
+- **Edge Cases Considered**: Non-obvious scenarios documented (Y/N)
+- **Context Completeness**: Sufficient information for planning (Y/N)
+
+**Convergence Rule**:
+- If confidence < 8/10 OR any critical item = N: Continue research iteration
+- If confidence >= 8/10 AND all critical items = Y: Ready for next phase
+
+Document assessment in output with: `[CONVERGENCE: confidence=X/10, ready=Y/N]`
+
 **FORBIDDEN Actions**:
 - Suggesting solutions or implementations
 - Making design decisions
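
The RESEARCH Convergence Rule above can be read as a single predicate. The sketch below is one possible Python rendering. The function name is hypothetical, and treating all three Y/N items as "critical" is an assumption, because the diff does not say which items count as critical. The threshold is a parameter because `workflow.md` in this same commit varies it by complexity tier.

```python
def research_convergence(
    confidence: int,
    dependencies_mapped: bool,
    edge_cases_considered: bool,
    context_complete: bool,
    threshold: int = 8,
) -> tuple[bool, str]:
    """Return (ready, tag) using the tag format quoted in the diff."""
    # Assumption: every Y/N item is treated as critical.
    items_ok = all((dependencies_mapped, edge_cases_considered, context_complete))
    ready = confidence >= threshold and items_ok
    tag = f"[CONVERGENCE: confidence={confidence}/10, ready={'Y' if ready else 'N'}]"
    return ready, tag
```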
@@ -61,6 +75,20 @@ git log --oneline main..HEAD
 - Question assumptions
 - Present possibilities without commitment
 
+**Convergence Criteria (Exploration Assessment)**:
+
+Before exiting INNOVATE sub-mode, evaluate:
+- **Approach Diversity**: Explored 2-3 distinct approaches (Y/N)
+- **Trade-offs Clarity**: Pros/cons clearly understood for each (Y/N)
+- **Best Path Identified**: Clear recommendation emerging (Y/N)
+- **Exploration Confidence**: 1-10 score on solution space coverage
+
+**Convergence Rule**:
+- If approaches < 2 OR clarity = N: Continue innovation iteration
+- If approaches >= 2 AND clarity = Y AND confidence >= 7: Ready for PLAN
+
+Document assessment in output with: `[INNOVATION CONVERGENCE: approaches=X, confidence=X/10, ready=Y/N]`
+
 **FORBIDDEN Actions**:
 - Creating concrete plans
 - Writing code or pseudocode
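
Unlike the RESEARCH rule, the two branches of the INNOVATE Convergence Rule above are not exact complements: the case of two or more approaches, clear trade-offs, and confidence below 7 is not covered, and "Best Path Identified" does not appear in the rule at all. The Python sketch below assumes that any state that is not explicitly ready means "continue iterating". That choice, and the function name, are assumptions rather than something the commit states.

```python
def innovation_convergence(
    approaches: int,
    tradeoffs_clear: bool,
    confidence: int,
) -> tuple[bool, str]:
    """Return (ready, tag) using the tag format quoted in the diff."""
    # Assumption: anything short of the explicit "ready" condition
    # (including the unspecified low-confidence case) keeps iterating.
    ready = approaches >= 2 and tradeoffs_clear and confidence >= 7
    tag = (
        f"[INNOVATION CONVERGENCE: approaches={approaches}, "
        f"confidence={confidence}/10, ready={'Y' if ready else 'N'}]"
    )
    return ready, tag
```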

.claude/agents/review.md

Lines changed: 29 additions & 3 deletions
@@ -157,9 +157,35 @@ Formatting: [PASS/FAIL] - Z files need formatting
 1. [Suggested action]
 2. [Suggested action]
 
-### Next Steps
-- [ ] If PASS: Implementation ready for deployment
-- [ ] If FAIL: Return to PLAN or EXECUTE mode to address issues
+### Phase Routing Decision
+
+Based on issue severity, route to appropriate hierarchy level:
+
+**→ EXECUTE** (implementation-level issues):
+- Single-step implementation errors
+- Missing edge case handling
+- Code quality issues (lint/format)
+- Command: `/riper:execute [substep]`
+
+**→ PLAN** (design-level issues):
+- Wrong approach taken
+- Missing features not in plan
+- Architecture mismatch
+- Action: Create amended plan with lessons learned
+
+**→ RESEARCH** (understanding-level issues):
+- Misunderstood requirements
+- Missing critical context
+- Wrong problem being solved
+- Action: Re-research with focus on identified gaps
+
+**→ DEPLOY** (approved):
+- All checks passed
+- Minor warnings acceptable
+- Implementation matches plan exactly
+
+**Decision for this review**: [EXECUTE/PLAN/RESEARCH/DEPLOY]
+**Rationale**: [Why this level is appropriate]
 ```
 
 ## Review Artifacts
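
The Phase Routing Decision added to `review.md` maps issue types to the workflow level that should handle them. The Python sketch below is one way to read it. The issue keys are hypothetical labels for the bullets in the diff. Routing to the deepest level among all reported issues is an assumption, because the diff does not say how to resolve a review that finds issues at several levels.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Ordered so that a larger value means a deeper level to return to."""
    DEPLOY = 0
    EXECUTE = 1
    PLAN = 2
    RESEARCH = 3


# Hypothetical labels for the issue bullets listed in the diff.
ISSUE_LEVEL = {
    "implementation_error": Phase.EXECUTE,
    "missing_edge_case": Phase.EXECUTE,
    "code_quality": Phase.EXECUTE,
    "wrong_approach": Phase.PLAN,
    "missing_feature": Phase.PLAN,
    "architecture_mismatch": Phase.PLAN,
    "misunderstood_requirements": Phase.RESEARCH,
    "missing_context": Phase.RESEARCH,
    "wrong_problem": Phase.RESEARCH,
}


def route(issues: list[str]) -> Phase:
    """Pick the deepest level any issue calls for; no issues means DEPLOY.

    Raises KeyError for an issue label that is not in ISSUE_LEVEL.
    """
    return max((ISSUE_LEVEL[name] for name in issues), default=Phase.DEPLOY)
```

For example, a review that reports both a lint problem and a misunderstood requirement would route to RESEARCH under this reading, since fixing the code would not fix the misunderstanding.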

.claude/commands/memory/recall.md

Lines changed: 41 additions & 0 deletions
@@ -44,6 +44,47 @@ Looking for memories in: `[ROOT]/.claude/memory-bank/!`git branch --show-current
 ## Search Query
 $ARGUMENTS
 
+## Adaptive Context & Learning
+
+**Pattern Matching Algorithm:**
+1. **Identify similar tasks**: Match by keywords, file patterns, domain context
+2. **Analyze success patterns**: Find tasks that succeeded on first try
+3. **Analyze failure patterns**: Find tasks that required multiple iterations
+4. **Extract optimal strategy**: What iteration counts worked best?
+
+**Learning Rules:**
+- If similar task had LOW research confidence → **Increase research iterations**
+- If similar task had PLAN failures → **Add stricter quality gates**
+- If similar task had EXECUTE issues → **Increase validation frequency**
+- If similar task succeeded with X iterations → **Recommend X as baseline**
+
+**Pattern Examples to Look For:**
+- "Auth tasks typically need COMPLEX classification (6+ files)"
+- "API refactoring succeeds with 2 research iterations, 8/10 threshold"
+- "UI components work well with SIMPLE tier (1 iteration)"
+- "Database migrations require MODERATE, 2 PLAN iterations"
+
+**Output Format:**
+```
+📊 LEARNED PATTERNS for "$ARGUMENTS":
+
+Similar tasks found: [list 2-3 most relevant matches]
+- Task: [name] | Complexity: [tier] | Research iterations: [N] | Result: [SUCCESS/FAILED]
+- Task: [name] | Complexity: [tier] | Research iterations: [N] | Result: [SUCCESS/FAILED]
+
+Success pattern identified:
+- [What approach worked consistently]
+- [Key factors that led to success]
+
+Recommended strategy:
+- Complexity tier: [SIMPLE/MODERATE/COMPLEX]
+- Research iterations: [N] (threshold: [X/10])
+- Execute validation: [STANDARD/ENHANCED]
+- Confidence: [X/10] based on [Y] historical examples
+
+⚠️ Watch out for: [Common pitfalls from similar tasks]
+```
+
 ## Available Memories
 !`ls -la $(git rev-parse --show-toplevel)/.claude/memory-bank/$(git branch --show-current)/ 2>/dev/null || echo "No memories found for current branch"`
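
The four Learning Rules added to `recall.md` can be sketched as a function over past task records. Everything named here is hypothetical: `TaskRecord`, its fields, and `recommend` are illustrative only. Two details are assumptions because the diff leaves them open: "LOW" research confidence is taken to mean below 8/10, and when several similar tasks succeeded, the most common iteration count is used as the baseline.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class TaskRecord:
    """Hypothetical record of one past task, loosely following save.md metadata."""
    research_confidence: int  # 1-10
    plan_failed: bool
    execute_issues: bool
    research_iterations: int
    succeeded: bool


def recommend(similar: list[TaskRecord], low_confidence: int = 8) -> list[str]:
    """Apply the four learning rules to a list of similar past tasks."""
    recs: list[str] = []
    if any(t.research_confidence < low_confidence for t in similar):
        recs.append("Increase research iterations")
    if any(t.plan_failed for t in similar):
        recs.append("Add stricter quality gates")
    if any(t.execute_issues for t in similar):
        recs.append("Increase validation frequency")
    successes = [t.research_iterations for t in similar if t.succeeded]
    if successes:
        # Assumption: the most common successful count becomes the baseline.
        baseline = Counter(successes).most_common(1)[0][0]
        recs.append(f"Use {baseline} research iteration(s) as baseline")
    return recs
```

With no similar tasks the function returns an empty list, which matches the recall command having nothing to recommend.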

.claude/commands/memory/save.md

Lines changed: 17 additions & 0 deletions
@@ -28,6 +28,23 @@ I'll save the following information to the branch-aware memory bank:
 ## Memory Content
 $ARGUMENTS
 
+## Metadata (for adaptive workflow)
+- **Task Complexity**: [SIMPLE/MODERATE/COMPLEX]
+- SIMPLE: 1-2 files, well-defined scope
+- MODERATE: 3-5 files, some ambiguity
+- COMPLEX: 6+ files, architectural changes
+
+- **Phase Confidence Scores**:
+- Research confidence: X/10
+- Plan quality: X/10
+- Execute confidence: X/10
+
+- **Iteration Count**:
+- Research iterations: X
+- Execute iterations: X
+
+- **Convergence Notes**: [Why this task required X iterations]
+
 ## Storage Location
 The memory will be saved to:
 1. First run: `git rev-parse --show-toplevel` to get repository root
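
For readers who prefer a typed view of the metadata template added to `save.md`, here is a minimal Python sketch. `MemoryMetadata` and its field names are hypothetical, and the nested indentation in the rendered output is a presentation choice of this sketch, not something taken from the diff.

```python
from dataclasses import dataclass

TIERS = ("SIMPLE", "MODERATE", "COMPLEX")


@dataclass
class MemoryMetadata:
    """Hypothetical typed form of the metadata block in the diff."""
    complexity: str
    research_confidence: int
    plan_quality: int
    execute_confidence: int
    research_iterations: int
    execute_iterations: int
    convergence_notes: str

    def __post_init__(self) -> None:
        if self.complexity not in TIERS:
            raise ValueError(f"unknown complexity tier: {self.complexity}")
        scores = (self.research_confidence, self.plan_quality, self.execute_confidence)
        for score in scores:
            if not 1 <= score <= 10:
                raise ValueError(f"score out of range 1-10: {score}")

    def to_markdown(self) -> str:
        """Render the block with the same headings and labels as the template."""
        return "\n".join([
            "## Metadata (for adaptive workflow)",
            f"- **Task Complexity**: {self.complexity}",
            "",
            "- **Phase Confidence Scores**:",
            f"  - Research confidence: {self.research_confidence}/10",
            f"  - Plan quality: {self.plan_quality}/10",
            f"  - Execute confidence: {self.execute_confidence}/10",
            "",
            "- **Iteration Count**:",
            f"  - Research iterations: {self.research_iterations}",
            f"  - Execute iterations: {self.execute_iterations}",
            "",
            f"- **Convergence Notes**: {self.convergence_notes}",
        ])
```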

.claude/commands/riper/workflow.md

Lines changed: 61 additions & 2 deletions
@@ -25,8 +25,67 @@ Once approved, I'll use the plan-execute agent in EXECUTE sub-mode to implement
 ### Phase 5: REVIEW
 Finally, I'll use the review agent to validate the implementation against the plan.
 
-## Starting Workflow
+## Starting Workflow - Adaptive Mode
 
-Let me begin with the RESEARCH phase for: $ARGUMENTS
+### Step 0: Complexity Assessment
+
+First, let me assess task complexity and recall similar past tasks:
+
+**Complexity Factors**:
+- File count estimate: [1-2 / 3-5 / 6+]
+- Architectural impact: [LOW / MEDIUM / HIGH]
+- Ambiguity level: [CLEAR / SOME / SIGNIFICANT]
+
+**Classification**:
+- **SIMPLE**: 1-2 files, well-defined scope, low ambiguity
+- **MODERATE**: 3-5 files, some design decisions, moderate ambiguity
+- **COMPLEX**: 6+ files, architectural changes, high ambiguity
+
+Checking memory: `/memory:recall similar to: $ARGUMENTS`
+
+### Adaptive Phase Execution
+
+Based on complexity assessment, I'll follow the appropriate workflow:
+
+**For SIMPLE tasks**:
+1. RESEARCH (1 iteration, convergence threshold: 7/10)
+2. PLAN (streamlined)
+3. EXECUTE (with substep validation)
+4. REVIEW
+
+**For MODERATE tasks**:
+1. RESEARCH (up to 2 iterations, convergence threshold: 8/10)
+2. INNOVATE (explore 2-3 approaches)
+3. PLAN (detailed)
+4. EXECUTE (with deep supervision)
+5. REVIEW
+
+**For COMPLEX tasks**:
+1. RESEARCH (up to 3 iterations, convergence threshold: 9/10)
+2. INNOVATE (extensive exploration)
+3. PLAN (comprehensive with risk analysis)
+4. EXECUTE (substep-by-substep with validation)
+5. Mid-execution REVIEW (after 50% complete)
+6. Final REVIEW
+
+### Hierarchical Convergence Control
+
+**Research Phase**: Continue iterations until convergence criteria met:
+- Understanding confidence >= threshold
+- All dependencies mapped
+- Edge cases considered
+- Context complete
+
+**Execute Phase**: Validate after each substep (deep supervision):
+- Check plan compliance
+- Assess confidence (must be >= 7/10)
+- Document any deviations
+- Halt if validation fails
+
+**Memory Tracking**: Save all confidence scores and iteration counts for future workflow optimization.
+
+### Beginning Workflow
+
+Let me begin with complexity assessment and RESEARCH phase for: $ARGUMENTS
 
 [The appropriate agent will be invoked based on the current phase]
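
The complexity tiers and per-tier research limits in `workflow.md` can be sketched in Python as a classifier plus a bounded loop. The names `classify`, `run_research`, and the `assess` callback are hypothetical. Two behaviours are assumptions because the diff does not specify them: when the three factors disagree, the highest tier wins, and when the iteration cap is reached without meeting the threshold, the loop reports non-convergence and leaves the next step to the caller.

```python
from collections.abc import Callable

TIERS = ("SIMPLE", "MODERATE", "COMPLEX")

# (max research iterations, convergence threshold) per tier, as listed in the diff.
RESEARCH_LIMITS = {"SIMPLE": (1, 7), "MODERATE": (2, 8), "COMPLEX": (3, 9)}


def classify(file_count: int, impact: str, ambiguity: str) -> str:
    """Map the three complexity factors to a tier."""
    by_files = 0 if file_count <= 2 else 1 if file_count <= 5 else 2
    by_impact = ("LOW", "MEDIUM", "HIGH").index(impact)
    by_ambiguity = ("CLEAR", "SOME", "SIGNIFICANT").index(ambiguity)
    # Assumption: the most demanding factor decides the tier.
    return TIERS[max(by_files, by_impact, by_ambiguity)]


def run_research(tier: str, assess: Callable[[int], int]) -> tuple[int, int, bool]:
    """Iterate research until the tier's threshold is met or the cap is hit.

    `assess` is a hypothetical callback that performs iteration N and
    returns the resulting understanding confidence (1-10).
    Returns (iterations used, final confidence, converged).
    """
    max_iterations, threshold = RESEARCH_LIMITS[tier]
    confidence = 0
    for iteration in range(1, max_iterations + 1):
        confidence = assess(iteration)
        if confidence >= threshold:
            return iteration, confidence, True
    return max_iterations, confidence, False
```

The returned iteration count and confidence are the values the Memory Tracking line asks to be saved for later recall.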
