terraphim · AlexMikhalev · Apr 20, 2026 · Apr 16, 2026 · Apr 17, 2026 · Apr 17, 2026
diff --git a/.docs/design-pr818-clippy-lint-fixes.md b/.docs/design-pr818-clippy-lint-fixes.md
@@ -0,0 +1,119 @@
+# Implementation Plan: Fix CI `sort_by` Clippy Failures for PR #818
+
+**Status**: Approved
+**Research Doc**: `.docs/research-pr818-clippy-lint-fixes.md`
+**Author**: opencode (disciplined-design)
+**Date**: 2026-04-17
+**Estimated Effort**: 30 minutes
+
+## Overview
+
+### Summary
+Replace all `sort_by` calls flagged by Rust 1.95 clippy with `sort_by_key` equivalents. 13 mechanical conversions + 3 special cases.
+
+### Approach
+Mechanical search-and-replace. Each `sort_by` converting to `sort_by_key` either:
+- **Ascending**: `sort_by(|a, b| a.x.cmp(&b.x))` -> `sort_by_key(|a| a.x)`
+- **Descending**: `sort_by(|a, b| b.x.cmp(&a.x))` -> `sort_by_key(|b| std::cmp::Reverse(b.x))`
+
+### Scope
+**In Scope:**
+- Fix all 13 clippy-flagged `sort_by` calls across 5 files in 3 crates
+- Handle 3 special cases (multi-key, string-parse, IndexMap)
+
+**Out of Scope:**
+- `tui_desktop_parity_test.rs` type annotation errors
+- Upgrading local Rust version
+- Any changes to terraphim-cli or terraphim-automata (branch's own code)
+
+## File Changes
+
+### Modified Files
+| File | Changes |
+|------|---------|
+| `crates/terraphim-markdown-parser/src/lib.rs` | 1 sort_by -> sort_by_key |
+| `crates/terraphim_router/src/keyword.rs` | 1 sort_by -> sort_by_key |
+| `crates/terraphim-session-analyzer/src/analyzer.rs` | 4 sort_by -> sort_by_key + 2 IndexMap sort_by |
+| `crates/terraphim-session-analyzer/src/parser.rs` | 1 sort_by -> sort_by_key |
+| `crates/terraphim-session-analyzer/src/reporter.rs` | 7 sort_by -> sort_by_key |
+
+### No new or deleted files.
+
+## Implementation Steps
+
+### Step 1: terraphim-markdown-parser (1 change)
+**File:** `crates/terraphim-markdown-parser/src/lib.rs:144`
+
+```rust
+// Before:
+edits.sort_by(|a, b| b.range.start.cmp(&a.range.start));
+// After:
+edits.sort_by_key(|b| std::cmp::Reverse(b.range.start));
+```
+
+### Step 2: terraphim_router (1 change)
+**File:** `crates/terraphim_router/src/keyword.rs:61`
+
+```rust
+// Before:
+matched_keywords.sort_by(|a, b| b.1.cmp(&a.1));
+// After:
+matched_keywords.sort_by_key(|b| std::cmp::Reverse(b.1));
+```
+
+### Step 3: terraphim-session-analyzer/analyzer.rs (6 changes)
+**File:** `crates/terraphim-session-analyzer/src/analyzer.rs`
+
+| Line | Before | After |
+|------|--------|-------|
+| 210 | `agents.sort_by(\|a, b\| a.timestamp.cmp(&b.timestamp))` | `agents.sort_by_key(\|a\| a.timestamp)` |
+| 564 | `sorted.sort_by(\|a, b\| b.1.cmp(&a.1))` | `sorted.sort_by_key(\|b\| std::cmp::Reverse(b.1))` |
+| 637 | `correlations.sort_by(\|a, b\| b.usage_count.cmp(&a.usage_count))` | `correlations.sort_by_key(\|b\| std::cmp::Reverse(b.usage_count))` |
+| 718 | `stats.sort_by(\|_, v1, _, v2\| v2.total_invocations.cmp(&v1.total_invocations))` | `stats.sort_by(\|_, v1, _, v2\| v2.total_invocations.cmp(&v1.total_invocations))` **ALLOW** (IndexMap method, not Vec) |
+| 739 | `breakdown.sort_by(\|_, v1, _, v2\| v2.cmp(v1))` | `breakdown.sort_by(\|_, v1, _, v2\| v2.cmp(v1))` **ALLOW** (IndexMap method, not Vec) |
+| 880 | `chains.sort_by(\|a, b\| b.frequency.cmp(&a.frequency))` | `chains.sort_by_key(\|b\| std::cmp::Reverse(b.frequency))` |
+
+**Lines 718, 739**: These use `IndexMap::sort_by` (not `Vec::sort_by`). Clippy's `unnecessary_sort_by` lint only fires on `Vec::sort_by`, so these should be safe. Verify after fix by running clippy. If they do fire, add `#[allow(clippy::unnecessary_sort_by)]`.
+
+### Step 4: terraphim-session-analyzer/parser.rs (1 change)
+**File:** `crates/terraphim-session-analyzer/src/parser.rs:422`
+
+```rust
+// Before:
+events.sort_by(|a, b| a.timestamp.cmp(&b.timestamp));
+// After:
+events.sort_by_key(|a| a.timestamp);
+```
+
+### Step 5: terraphim-session-analyzer/reporter.rs (7 changes)
+**File:** `crates/terraphim-session-analyzer/src/reporter.rs`
+
+| Line | Before | After |
+|------|--------|-------|
+| 168 | `events.sort_by(\|a, b\| a.0.cmp(&b.0))` | `events.sort_by_key(\|a\| a.0)` |
+| 227 | `sorted_agents.sort_by(\|a, b\| b.1.cmp(&a.1))` | `sorted_agents.sort_by_key(\|b\| std::cmp::Reverse(b.1))` |
+| 447 | `tool_stats.sort_by(\|a, b\| b.1.total_invocations.cmp(&a.1.total_invocations))` | `tool_stats.sort_by_key(\|b\| std::cmp::Reverse(b.1.total_invocations))` |
+| 548 | Multi-key with string parse | Add `#[allow(clippy::unnecessary_sort_by)]` (see note) |
+| 570 | `category_rows.sort_by(\|a, b\| b.1.cmp(&a.1))` | `category_rows.sort_by_key(\|b\| std::cmp::Reverse(b.1))` |
+| 763 | `category_rows.sort_by(\|a, b\| b.1.cmp(&a.1))` | `category_rows.sort_by_key(\|b\| std::cmp::Reverse(b.1))` |
+| 784 | `tool_list.sort_by(\|a, b\| b.1.total_invocations.cmp(&a.1.total_invocations))` | `tool_list.sort_by_key(\|b\| std::cmp::Reverse(b.1.total_invocations))` |
+
+**Line 548**: Sorts by parsing `count` field from `String` to `u32`. Cannot be expressed as a pure key function because `parse()` is fallible. Add `#[allow(clippy::unnecessary_sort_by)]` attribute above the sort call.
+
+### Step 6: Verify
+```bash
+rustup update stable  # Update to 1.95
+cargo clippy --workspace --all-targets -- -D warnings
+cargo test --workspace
+cargo fmt --check
+```
+
+Then push and monitor CI.
+
+## Test Strategy
+
+No new tests needed -- these are semantically equivalent transformations. Existing tests cover the sorting behaviour.
+
+## Rollback Plan
+
+`git revert` the commit if any sorting behaviour regresses.
diff --git a/.docs/implementation-plan-issue-576-evaluate-cli.md b/.docs/implementation-plan-issue-576-evaluate-cli.md
@@ -0,0 +1,195 @@
+# Implementation Plan: Issue #576 - Ground-truth Evaluation Framework CLI Integration
+
+**Status**: Draft
+**Research Doc**: `.docs/research-issue-576-evaluation-cli.md`
+**Author**: Claude
+**Date**: 2026-04-16
+**Estimated Effort**: 2-3 hours
+
+## Overview
+
+### Summary
+Wire the existing `evaluate_automata()` function from `terraphim_automata` to a CLI subcommand in `terraphim_cli`. The core evaluation logic is already implemented; this is CLI integration only.
+
+### Approach
+Add an `Evaluate` subcommand to `terraphim_cli` that:
+1. Loads ground truth JSON from `--ground-truth` flag
+2. Loads thesaurus from `--thesaurus` flag
+3. Calls `evaluate()` function
+4. Outputs JSON `EvaluationResult`
+
+### Scope
+**In Scope:**
+- Add `Evaluate` command variant to `Commands` enum in `terraphim_cli`
+- Implement `evaluate` subcommand handler in `CliService`
+- Add `terraphim_automata` dependency to `terraphim_cli`
+- Integration tests for evaluate command
+
+**Out of Scope:**
+- New evaluation metrics (already in evaluation.rs)
+- Confusion matrix output (not in issue spec)
+- terraphim_agent integration (terraphim_cli is better fit for automation)
+
+**Avoid At All Cost:**
+- Adding human-readable output (automation-focused, JSON only)
+- Modifying evaluation.rs (already complete and tested)
+- Adding schema-based evaluation (future work)
+
+## Architecture
+
+### Component Diagram
+```
+terraphim_cli
+└── Evaluate command
+    ├── load_ground_truth(path)  --> GroundTruthDocument[]
+    ├── load_thesaurus(path)      --> Thesaurus
+    └── evaluate(gt, thesaurus)  --> EvaluationResult
+                                    ├── overall: ClassificationMetrics
+                                    ├── per_term: TermReport[]
+                                    └── systematic_errors: SystematicError[]
+```
+
+### Data Flow
+```
+CLI args (--ground-truth, --thesaurus)
+    -> CliService::evaluate()
+    -> terraphim_automata::load_ground_truth()
+    -> terraphim_automata::load_thesaurus()
+    -> terraphim_automata::evaluate()
+    -> EvaluationResult (JSON serialized)
+    -> stdout
+```
+
+## File Changes
+
+### Modified Files
+| File | Changes |
+|------|---------|
+| `crates/terraphim_cli/Cargo.toml` | Add `terraphim_automata` dependency |
+| `crates/terraphim_cli/src/main.rs` | Add `Evaluate` variant to `Commands` enum |
+| `crates/terraphim_cli/src/service.rs` | Add `evaluate()` method to `CliService` |
+| `crates/terraphim_cli/tests/integration_tests.rs` | Add integration test |
+
+## API Design
+
+### CLI Command Signature
+```rust
+/// Evaluate automata classification accuracy against ground truth
+Evaluate {
+    /// Path to ground truth JSON file
+    #[arg(long)]
+    ground_truth: String,
+
+    /// Path to thesaurus JSON file
+    #[arg(long)]
+    thesaurus: String,
+}
+```
+
+### Types (already in terraphim_automata::evaluation)
+```rust
+// GroundTruthDocument - already exists
+// EvaluationResult - already exists
+// ClassificationMetrics - already exists
+// SystematicError - already exists
+```
+
+## Implementation Steps
+
+### Step 1: Add Evaluate to Commands enum
+**File:** `crates/terraphim_cli/src/main.rs`
+**Lines:** ~100-110 (after `Coverage` variant)
+**Change:**
+```rust
+/// Evaluate automata classification against ground truth
+Evaluate {
+    /// Path to ground truth JSON file
+    #[arg(long)]
+    ground_truth: String,
+
+    /// Path to thesaurus JSON file
+    #[arg(long)]
+    thesaurus: String,
+},
+```
+
+### Step 2: Add terraphim_automata dependency
+**File:** `crates/terraphim_cli/Cargo.toml`
+**Change:**
+```toml
+terraphim-automata = { path = "../terraphim_automata", features = [] }
+```
+
+### Step 3: Add evaluate() method to CliService
+**File:** `crates/terraphim_cli/src/service.rs`
+**Change:** Add method that:
+1. Loads ground truth via `terraphim_automata::load_ground_truth()`
+2. Loads thesaurus via `terraphim_automata::load_thesaurus()`
+3. Calls `terraphim_automata::evaluate()`
+4. Serializes and prints result as JSON
+
+### Step 4: Wire Evaluate in main.rs match
+**File:** `crates/terraphim_cli/src/main.rs`
+**Location:** ~400 (in the Command::run() match)
+**Change:**
+```rust
+Commands::Evaluate { ground_truth, thesaurus } => {
+    cli_service.evaluate(&ground_truth, &thesaurus).await?;
+}
+```
+
+### Step 5: Add integration test
+**File:** `crates/terraphim_cli/tests/integration_tests.rs`
+**Change:** Add test that:
+1. Creates temporary ground truth JSON
+2. Creates temporary thesaurus JSON
+3. Runs evaluate command
+4. Verifies JSON output contains expected fields
+
+## Test Strategy
+
+### Unit Tests
+No new unit tests needed - evaluation logic is already tested in `terraphim_automata`
+
+### Integration Tests
+| Test | Location | Purpose |
+|------|----------|---------|
+| `test_evaluate_command_success` | `integration_tests.rs` | Full flow with valid files |
+| `test_evaluate_command_missing_ground_truth` | `integration_tests.rs` | Error handling |
+| `test_evaluate_command_missing_thesaurus` | `integration_tests.rs` | Error handling |
+
+### Test Data
+Use temporary files created in tests (no external fixtures needed)
+
+## Dependencies
+
+### New Dependencies
+| Crate | Version | Justification |
+|-------|---------|---------------|
+| terraphim-automata | path | Required for evaluation module |
+
+## Performance Considerations
+
+- Evaluation is O(n*m) where n=documents, m=thesaurus terms
+- Expected: <100ms for typical workloads
+- No benchmarks needed for CLI wrapper
+
+## Rollback Plan
+
+If issues discovered:
+1. Remove `Evaluate` variant from `Commands` enum
+2. Remove `evaluate()` method from `CliService`
+3. Remove `terraphim_automata` dependency
+
+## Simplicity Check
+
+**What if this could be easy?**
+This IS simple - just 4 files to modify, no new types, no complex logic. The evaluation module is already complete. This is a thin CLI wrapper.
+
+## Approval Gate
+
+- [ ] Research document reviewed and approved
+- [ ] Implementation plan reviewed
+- [ ] File changes defined
+- [ ] Test strategy defined
+- [ ] Human approval received