Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
119 changes: 119 additions & 0 deletions .docs/design-pr818-clippy-lint-fixes.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,119 @@
# Implementation Plan: Fix CI `sort_by` Clippy Failures for PR #818

**Status**: Approved
**Research Doc**: `.docs/research-pr818-clippy-lint-fixes.md`
**Author**: opencode (disciplined-design)
**Date**: 2026-04-17
**Estimated Effort**: 30 minutes

## Overview

### Summary
Replace all `sort_by` calls flagged by Rust 1.95 clippy with `sort_by_key` equivalents. 13 mechanical conversions + 3 special cases.

### Approach
Mechanical search-and-replace. Each `sort_by` converting to `sort_by_key` either:
- **Ascending**: `sort_by(|a, b| a.x.cmp(&b.x))` -> `sort_by_key(|a| a.x)`
- **Descending**: `sort_by(|a, b| b.x.cmp(&a.x))` -> `sort_by_key(|b| std::cmp::Reverse(b.x))`

### Scope
**In Scope:**
- Fix all 13 clippy-flagged `sort_by` calls across 5 files in 3 crates
- Handle 3 special cases (multi-key, string-parse, IndexMap)

**Out of Scope:**
- `tui_desktop_parity_test.rs` type annotation errors
- Upgrading local Rust version
- Any changes to terraphim-cli or terraphim-automata (branch's own code)

## File Changes

### Modified Files
| File | Changes |
|------|---------|
| `crates/terraphim-markdown-parser/src/lib.rs` | 1 sort_by -> sort_by_key |
| `crates/terraphim_router/src/keyword.rs` | 1 sort_by -> sort_by_key |
| `crates/terraphim-session-analyzer/src/analyzer.rs` | 4 sort_by -> sort_by_key + 2 IndexMap sort_by |
| `crates/terraphim-session-analyzer/src/parser.rs` | 1 sort_by -> sort_by_key |
| `crates/terraphim-session-analyzer/src/reporter.rs` | 7 sort_by -> sort_by_key |

### No new or deleted files.

## Implementation Steps

### Step 1: terraphim-markdown-parser (1 change)
**File:** `crates/terraphim-markdown-parser/src/lib.rs:144`

```rust
// Before:
edits.sort_by(|a, b| b.range.start.cmp(&a.range.start));
// After:
edits.sort_by_key(|b| std::cmp::Reverse(b.range.start));
```

### Step 2: terraphim_router (1 change)
**File:** `crates/terraphim_router/src/keyword.rs:61`

```rust
// Before:
matched_keywords.sort_by(|a, b| b.1.cmp(&a.1));
// After:
matched_keywords.sort_by_key(|b| std::cmp::Reverse(b.1));
```

### Step 3: terraphim-session-analyzer/analyzer.rs (6 changes)
**File:** `crates/terraphim-session-analyzer/src/analyzer.rs`

| Line | Before | After |
|------|--------|-------|
| 210 | `agents.sort_by(\|a, b\| a.timestamp.cmp(&b.timestamp))` | `agents.sort_by_key(\|a\| a.timestamp)` |
| 564 | `sorted.sort_by(\|a, b\| b.1.cmp(&a.1))` | `sorted.sort_by_key(\|b\| std::cmp::Reverse(b.1))` |
| 637 | `correlations.sort_by(\|a, b\| b.usage_count.cmp(&a.usage_count))` | `correlations.sort_by_key(\|b\| std::cmp::Reverse(b.usage_count))` |
| 718 | `stats.sort_by(\|_, v1, _, v2\| v2.total_invocations.cmp(&v1.total_invocations))` | `stats.sort_by(\|_, v1, _, v2\| v2.total_invocations.cmp(&v1.total_invocations))` **ALLOW** (IndexMap method, not Vec) |
| 739 | `breakdown.sort_by(\|_, v1, _, v2\| v2.cmp(v1))` | `breakdown.sort_by(\|_, v1, _, v2\| v2.cmp(v1))` **ALLOW** (IndexMap method, not Vec) |
| 880 | `chains.sort_by(\|a, b\| b.frequency.cmp(&a.frequency))` | `chains.sort_by_key(\|b\| std::cmp::Reverse(b.frequency))` |

**Lines 718, 739**: These use `IndexMap::sort_by` (not `Vec::sort_by`). Clippy's `unnecessary_sort_by` lint only fires on `Vec::sort_by`, so these should be safe. Verify after fix by running clippy. If they do fire, add `#[allow(clippy::unnecessary_sort_by)]`.

### Step 4: terraphim-session-analyzer/parser.rs (1 change)
**File:** `crates/terraphim-session-analyzer/src/parser.rs:422`

```rust
// Before:
events.sort_by(|a, b| a.timestamp.cmp(&b.timestamp));
// After:
events.sort_by_key(|a| a.timestamp);
```

### Step 5: terraphim-session-analyzer/reporter.rs (7 changes)
**File:** `crates/terraphim-session-analyzer/src/reporter.rs`

| Line | Before | After |
|------|--------|-------|
| 168 | `events.sort_by(\|a, b\| a.0.cmp(&b.0))` | `events.sort_by_key(\|a\| a.0)` |
| 227 | `sorted_agents.sort_by(\|a, b\| b.1.cmp(&a.1))` | `sorted_agents.sort_by_key(\|b\| std::cmp::Reverse(b.1))` |
| 447 | `tool_stats.sort_by(\|a, b\| b.1.total_invocations.cmp(&a.1.total_invocations))` | `tool_stats.sort_by_key(\|b\| std::cmp::Reverse(b.1.total_invocations))` |
| 548 | Multi-key with string parse | Add `#[allow(clippy::unnecessary_sort_by)]` (see note) |
| 570 | `category_rows.sort_by(\|a, b\| b.1.cmp(&a.1))` | `category_rows.sort_by_key(\|b\| std::cmp::Reverse(b.1))` |
| 763 | `category_rows.sort_by(\|a, b\| b.1.cmp(&a.1))` | `category_rows.sort_by_key(\|b\| std::cmp::Reverse(b.1))` |
| 784 | `tool_list.sort_by(\|a, b\| b.1.total_invocations.cmp(&a.1.total_invocations))` | `tool_list.sort_by_key(\|b\| std::cmp::Reverse(b.1.total_invocations))` |

**Line 548**: Sorts by parsing `count` field from `String` to `u32`. Cannot be expressed as a pure key function because `parse()` is fallible. Add `#[allow(clippy::unnecessary_sort_by)]` attribute above the sort call.

### Step 6: Verify
```bash
rustup update stable # Update to 1.95
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace
cargo fmt --check
```

Then push and monitor CI.

## Test Strategy

No new tests needed -- these are semantically equivalent transformations. Existing tests cover the sorting behaviour.

## Rollback Plan

`git revert` the commit if any sorting behaviour regresses.
195 changes: 195 additions & 0 deletions .docs/implementation-plan-issue-576-evaluate-cli.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,195 @@
# Implementation Plan: Issue #576 - Ground-truth Evaluation Framework CLI Integration

**Status**: Draft
**Research Doc**: `.docs/research-issue-576-evaluation-cli.md`
**Author**: Claude
**Date**: 2026-04-16
**Estimated Effort**: 2-3 hours

## Overview

### Summary
Wire the existing `evaluate_automata()` function from `terraphim_automata` to a CLI subcommand in `terraphim_cli`. The core evaluation logic is already implemented; this is CLI integration only.

### Approach
Add an `Evaluate` subcommand to `terraphim_cli` that:
1. Loads ground truth JSON from `--ground-truth` flag
2. Loads thesaurus from `--thesaurus` flag
3. Calls `evaluate()` function
4. Outputs JSON `EvaluationResult`

### Scope
**In Scope:**
- Add `Evaluate` command variant to `Commands` enum in `terraphim_cli`
- Implement `evaluate` subcommand handler in `CliService`
- Add `terraphim_automata` dependency to `terraphim_cli`
- Integration tests for evaluate command

**Out of Scope:**
- New evaluation metrics (already in evaluation.rs)
- Confusion matrix output (not in issue spec)
- terraphim_agent integration (terraphim_cli is better fit for automation)

**Avoid At All Cost:**
- Adding human-readable output (automation-focused, JSON only)
- Modifying evaluation.rs (already complete and tested)
- Adding schema-based evaluation (future work)

## Architecture

### Component Diagram
```
terraphim_cli
└── Evaluate command
├── load_ground_truth(path) --> GroundTruthDocument[]
├── load_thesaurus(path) --> Thesaurus
└── evaluate(gt, thesaurus) --> EvaluationResult
├── overall: ClassificationMetrics
├── per_term: TermReport[]
└── systematic_errors: SystematicError[]
```

### Data Flow
```
CLI args (--ground-truth, --thesaurus)
-> CliService::evaluate()
-> terraphim_automata::load_ground_truth()
-> terraphim_automata::load_thesaurus()
-> terraphim_automata::evaluate()
-> EvaluationResult (JSON serialized)
-> stdout
```

## File Changes

### Modified Files
| File | Changes |
|------|---------|
| `crates/terraphim_cli/Cargo.toml` | Add `terraphim_automata` dependency |
| `crates/terraphim_cli/src/main.rs` | Add `Evaluate` variant to `Commands` enum |
| `crates/terraphim_cli/src/service.rs` | Add `evaluate()` method to `CliService` |
| `crates/terraphim_cli/tests/integration_tests.rs` | Add integration test |

## API Design

### CLI Command Signature
```rust
/// Evaluate automata classification accuracy against ground truth
Evaluate {
/// Path to ground truth JSON file
#[arg(long)]
ground_truth: String,

/// Path to thesaurus JSON file
#[arg(long)]
thesaurus: String,
}
```

### Types (already in terraphim_automata::evaluation)
```rust
// GroundTruthDocument - already exists
// EvaluationResult - already exists
// ClassificationMetrics - already exists
// SystematicError - already exists
```

## Implementation Steps

### Step 1: Add Evaluate to Commands enum
**File:** `crates/terraphim_cli/src/main.rs`
**Lines:** ~100-110 (after `Coverage` variant)
**Change:**
```rust
/// Evaluate automata classification against ground truth
Evaluate {
/// Path to ground truth JSON file
#[arg(long)]
ground_truth: String,

/// Path to thesaurus JSON file
#[arg(long)]
thesaurus: String,
},
```

### Step 2: Add terraphim_automata dependency
**File:** `crates/terraphim_cli/Cargo.toml`
**Change:**
```toml
terraphim-automata = { path = "../terraphim_automata", features = [] }
```

### Step 3: Add evaluate() method to CliService
**File:** `crates/terraphim_cli/src/service.rs`
**Change:** Add method that:
1. Loads ground truth via `terraphim_automata::load_ground_truth()`
2. Loads thesaurus via `terraphim_automata::load_thesaurus()`
3. Calls `terraphim_automata::evaluate()`
4. Serializes and prints result as JSON

### Step 4: Wire Evaluate in main.rs match
**File:** `crates/terraphim_cli/src/main.rs`
**Location:** ~400 (in the Command::run() match)
**Change:**
```rust
Commands::Evaluate { ground_truth, thesaurus } => {
cli_service.evaluate(&ground_truth, &thesaurus).await?;
}
```

### Step 5: Add integration test
**File:** `crates/terraphim_cli/tests/integration_tests.rs`
**Change:** Add test that:
1. Creates temporary ground truth JSON
2. Creates temporary thesaurus JSON
3. Runs evaluate command
4. Verifies JSON output contains expected fields

## Test Strategy

### Unit Tests
No new unit tests needed - evaluation logic is already tested in `terraphim_automata`

### Integration Tests
| Test | Location | Purpose |
|------|----------|---------|
| `test_evaluate_command_success` | `integration_tests.rs` | Full flow with valid files |
| `test_evaluate_command_missing_ground_truth` | `integration_tests.rs` | Error handling |
| `test_evaluate_command_missing_thesaurus` | `integration_tests.rs` | Error handling |

### Test Data
Use temporary files created in tests (no external fixtures needed)

## Dependencies

### New Dependencies
| Crate | Version | Justification |
|-------|---------|---------------|
| terraphim-automata | path | Required for evaluation module |

## Performance Considerations

- Evaluation is O(n*m) where n=documents, m=thesaurus terms
- Expected: <100ms for typical workloads
- No benchmarks needed for CLI wrapper

## Rollback Plan

If issues discovered:
1. Remove `Evaluate` variant from `Commands` enum
2. Remove `evaluate()` method from `CliService`
3. Remove `terraphim_automata` dependency

## Simplicity Check

**What if this could be easy?**
This IS simple - just 4 files to modify, no new types, no complex logic. The evaluation module is already complete. This is a thin CLI wrapper.

## Approval Gate

- [ ] Research document reviewed and approved
- [ ] Implementation plan reviewed
- [ ] File changes defined
- [ ] Test strategy defined
- [ ] Human approval received
Loading
Loading