471 changes: 471 additions & 0 deletions .docs/design-inter-agent-orchestration.md

Large diffs are not rendered by default.

358 changes: 358 additions & 0 deletions .docs/research-inter-agent-orchestration.md

Large diffs are not rendered by default.

78 changes: 78 additions & 0 deletions .docs/validation-inter-agent-orchestration.md
@@ -0,0 +1,78 @@
# Validation Report: Inter-Agent Orchestration via Gitea Mentions

**Status**: Conditional
**Date**: 2026-04-22
**Timestamp**: 2026-04-22 19:46 CEST
**Research Doc**: `.docs/research-inter-agent-orchestration.md`
**Design Doc**: `.docs/design-inter-agent-orchestration.md`
**Verification Report**: `.docs/verification-inter-agent-orchestration.md`
**Validated Commit Base**: `a1e047df6`

## Executive Summary

The implemented mention-chain changes satisfy the technical requirements evidenced in the repository: bounded mention recursion, structured inter-agent context, mention metadata capture, and preservation of existing orchestrator behaviour under automated test.

Validation is marked **Conditional** rather than fully approved because no live stakeholder interview, production-like UAT session, or bigbox end-to-end exercise was available in-session. The system is technically ready, but formal product-owner sign-off remains outstanding.

## System Validation Results

### End-to-End Requirement Validation

| Requirement | Evidence | Result | Status |
|-------------|----------|--------|--------|
| Any agent can mention another via `@adf:agent-name` | Existing mention polling/dispatch pipeline remains intact; mention-driven paths extended rather than replaced | Supported | PASS |
| Mentioned agent receives structured context | `build_context()` appended to mention-driven task; tests cover chain id and remaining depth | Structured handoff present | PASS |
| Depth limit enforced | `max_mention_depth` in config plus guard in `mention_chain.rs` and dispatch wiring in `lib.rs` | Bounded recursion enforced | PASS |
| Existing reviewer chain continues unchanged | Full orchestrator test suite remains green after changes | No regression detected | PASS |
| Compound review unaffected | Research/design now explicitly keep compound review out of scope and unaffected | No direct behaviour change introduced | PASS |

### Non-Functional Requirements

| Category | Target | Evidence | Status |
|----------|--------|----------|--------|
| Latency impact | Negligible | Added checks are a fixed number of short string comparisons plus string formatting per mention | PASS |
| Security | No new direct external write paths | Orchestrator remains sole mediator of Gitea writes | PASS |
| Backward compatibility | Existing mention/reviewer workflows preserved | `cargo test -p terraphim_orchestrator` fully green | PASS |
| Operability | Mention metadata visible in run records | `AgentRunRecord` extended with chain metadata | PASS |

## Acceptance Assessment

### Acceptance Criteria Mapping

| Acceptance Criterion | Source | Evidence | Status |
|----------------------|--------|----------|--------|
| Mention-driven coordination remains human-readable | Research + Design | Markdown context builder and prompt instructions | Accepted technically |
| Chain recursion is bounded | Research + Design | depth tests and config default test | Accepted technically |
| Existing orchestrator behaviour is not broken | Business constraint | full crate test pass and workspace clippy pass | Accepted technically |

### Stakeholder Interview Summary

No structured stakeholder interview was performed in-session.

### Outstanding Validation Conditions

1. Product-owner or maintainer sign-off on the updated design and research artefacts
2. Optional production-like exercise on bigbox using a real mention chain across at least two agents
3. PR review acknowledgement that compound review remains explicitly out of scope for this change set

## Defect Register

| ID | Description | Origin Phase | Severity | Resolution | Status |
|----|-------------|--------------|----------|------------|--------|
| VAL-001 | Design/research docs overstated A->B->A cycle detection | Phase 2 design | Medium | Corrected to bounded loop-risk control language | Closed |
| VAL-002 | Design doc rollback semantics for `max_mention_depth = 0` contradicted guard logic | Phase 2 design | Medium | Corrected to “disable all mention dispatch” | Closed |
| VAL-003 | Research doc claimed dispatch metadata already existed | Phase 1 research | Medium | Corrected to require schema extension | Closed |

## Gate Checklist

- [x] Verification report completed
- [x] Technical acceptance evidence recorded
- [x] No open technical blockers remain in code
- [x] Documentation inconsistencies corrected
- [ ] Formal stakeholder interview completed
- [ ] Formal stakeholder sign-off recorded
- [ ] Optional production-like UAT exercise completed

## Decision

**Conditional pass**: technically ready for PR review and merge consideration, subject to normal maintainer review and final stakeholder approval.
100 changes: 100 additions & 0 deletions .docs/verification-inter-agent-orchestration.md
@@ -0,0 +1,100 @@
# Verification Report: Inter-Agent Orchestration via Gitea Mentions

**Status**: Verified
**Date**: 2026-04-22
**Timestamp**: 2026-04-22 19:46 CEST
**Phase 2 Doc**: `.docs/design-inter-agent-orchestration.md`
**Phase 1 Doc**: `.docs/research-inter-agent-orchestration.md`
**Verified Commit Base**: `a1e047df6`

## Summary

The implementation matches the designed scope for mention-chain coordination in `terraphim_orchestrator`: depth tracking, self-mention rejection, structured mention context, mention metadata in run records, and mention instructions appended to mention-driven tasks.

Verification evidence is based on repository-native checks only: targeted crate tests, targeted clippy, workspace clippy, and prior UBS review of the code-bearing commits. No critical defects remain open in the implemented Rust changes.

## Specialist Skill Results

### Static Analysis
- UBS status: no critical findings on the code-bearing commit after triage
- Note: one UBS `panic!` finding was confirmed as a false positive in `#[tokio::test]` code and captured as a learning
- Evidence: pre-commit UBS pass on `a1e047df6` lineage; no code changes since then beyond documentation

### Requirements Traceability

| Requirement | Design Ref | Implementation Evidence | Test Evidence | Status |
|-------------|------------|-------------------------|---------------|--------|
| Track mention depth per chain | Step 1, Step 3 | `dispatcher.rs` `MentionDriven { chain_id, depth, parent_agent }`; `lib.rs` `resolve_mention_chain()` | `mention_chain::tests::test_depth_zero_allowed`, `test_depth_one_allowed`, `test_depth_two_allowed`, `test_depth_three_blocked` | PASS |
| Reject self-mentions | Step 1, Step 3 | `mention_chain.rs` `check()` self-mention guard | `mention_chain::tests::test_self_mention_rejected` | PASS |
| Bound mention recursion by max depth | Step 1, Step 3 | `config.rs` `max_mention_depth`; `mention_chain.rs` depth guard | `mention_chain::tests::test_depth_limit_enforced`, `test_depth_zero_at_zero_max` | PASS |
| Build structured handoff context | Step 2, Step 5 | `mention_chain.rs` `build_context()`; `lib.rs` appends context for mention-driven spawn | `mention_chain::tests::test_build_context_includes_chain_id`, `test_build_context_includes_remaining_depth`, `test_build_context_includes_available_agents` | PASS |
| Record mention metadata in run records | Step 4 | `agent_run_record.rs` new fields; `lib.rs` extracts metadata from active agent state | crate tests green; serialisation and classification tests remain passing | PASS |
| Preserve existing reviewer/mention flows | Step 6 | No workflow rewrites; logic layered onto existing paths | full `cargo test -p terraphim_orchestrator` pass | PASS |
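
The self-mention and depth requirements in the table can be pictured with a minimal sketch. It is not the code in `mention_chain.rs`: the `ChainDecision` type and the `check_mention` signature are hypothetical, and the `depth >= max_depth` comparison is inferred from the test names (depths 0 to 2 allowed, depth 3 blocked at the default of 3) rather than read from the source.

```rust
/// Hypothetical outcome type; the real return type of `check()` is not shown in this diff.
#[derive(Debug, PartialEq, Eq)]
pub enum ChainDecision {
    Allowed,
    SelfMention,
    DepthExceeded { depth: u32, max_depth: u32 },
}

/// Assumed guard order: reject self-mentions first, then block any depth >= max_depth.
pub fn check_mention(
    parent_agent: &str,
    target_agent: &str,
    depth: u32,
    max_depth: u32,
) -> ChainDecision {
    // An empty parent stands for a human author, who cannot self-mention an agent.
    if !parent_agent.is_empty() && parent_agent == target_agent {
        return ChainDecision::SelfMention;
    }
    // With max_depth = 0 even a depth-0 (human) mention is blocked.
    if depth >= max_depth {
        return ChainDecision::DepthExceeded { depth, max_depth };
    }
    ChainDecision::Allowed
}
```

Under these assumptions, `check_mention("planner", "reviewer", 2, 3)` is allowed, while the same call at depth 3 is rejected, which bounds a chain without needing explicit cycle detection.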

### Code Quality
- `cargo clippy -p terraphim_orchestrator -- -D warnings`: PASS
- `cargo clippy --workspace --all-targets -- -D warnings`: PASS

## Unit Test Results

### Command

```bash
cargo test -p terraphim_orchestrator
```

### Result
- Primary crate result: `516 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out`
- Supporting integration/doc test binaries also passed

### Mention-Chain-Specific Coverage
- `test_self_mention_rejected`
- `test_depth_limit_enforced`
- `test_depth_zero_allowed`
- `test_depth_one_allowed`
- `test_depth_two_allowed`
- `test_depth_three_blocked`
- `test_cycle_detection_ab_a`
- `test_different_agents_allowed`
- `test_config_default_mention_depth`
- `test_build_context_includes_chain_id`
- `test_build_context_includes_remaining_depth`
- `test_build_context_human_mention`
- `test_build_context_truncates_long_body`
- `test_build_context_includes_available_agents`
- `test_build_context_empty_agents_no_section`

## Integration Verification

### Verified Boundaries

| Boundary | Evidence | Status |
|----------|----------|--------|
| Mention polling -> chain resolution | `lib.rs` `poll_mentions_for_project()` and `resolve_mention_chain()` compile and tests pass | PASS |
| Chain validation -> dispatch enqueue | `MentionChainTracker::check()` wired before spawn/dispatch paths | PASS |
| Spawned agent -> runtime state metadata | `ManagedAgent` carries `mention_chain_id`, `mention_depth`, `mention_parent_agent` | PASS |
| Runtime state -> agent run record | `AgentRunRecord` populated from active agent state | PASS |
| Mention-driven task -> agent prompt context | `build_context()` plus available agent list appended on mention-driven paths | PASS |
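
The last boundary, where structured context is appended to a mention-driven task, might look roughly like the sketch below. Everything here is an assumption for illustration: the `MentionContext` struct, the Markdown layout, and the 200-character truncation limit are hypothetical, and only the listed behaviours (chain id, remaining depth, human mentions, body truncation, optional agent list) come from the test names in this report.

```rust
/// Hypothetical inputs for the context builder; the real signature is not shown in this diff.
pub struct MentionContext<'a> {
    pub chain_id: &'a str,
    pub depth: u32,
    pub max_depth: u32,
    /// Empty string stands for a human author.
    pub parent_agent: &'a str,
    pub body: &'a str,
    pub available_agents: &'a [&'a str],
}

/// Assumed truncation limit, counted in characters.
const MAX_BODY_CHARS: usize = 200;

pub fn build_context(ctx: &MentionContext<'_>) -> String {
    let mut out = String::from("## Mention context\n\n");
    out.push_str(&format!("- Chain id: {}\n", ctx.chain_id));
    out.push_str(&format!("- Depth: {} of {}\n", ctx.depth, ctx.max_depth));

    // Nested mentions still allowed below this agent, assuming a `depth >= max_depth` guard.
    let remaining = ctx.max_depth.saturating_sub(ctx.depth).saturating_sub(1);
    out.push_str(&format!("- Remaining depth: {}\n", remaining));

    let sender = if ctx.parent_agent.is_empty() {
        "human".to_string()
    } else {
        format!("@adf:{}", ctx.parent_agent)
    };
    out.push_str(&format!("- Triggered by: {}\n", sender));

    // Truncate on character boundaries so multi-byte text is never split.
    let mut body: String = ctx.body.chars().take(MAX_BODY_CHARS).collect();
    if ctx.body.chars().count() > MAX_BODY_CHARS {
        body.push_str("...");
    }
    out.push_str(&format!("\n### Request\n\n{}\n", body));

    // The agent list section is omitted entirely when no agents are available.
    if !ctx.available_agents.is_empty() {
        out.push_str("\n### Available agents\n\n");
        for agent in ctx.available_agents {
            out.push_str(&format!("- @adf:{}\n", agent));
        }
    }
    out
}
```

Keeping the output as plain Markdown matches the acceptance criterion that mention-driven coordination stays human-readable in issue comments.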

## Defect Register

| ID | Description | Origin Phase | Severity | Resolution | Status |
|----|-------------|--------------|----------|------------|--------|
| V-001 | `AgentRunRecord` referenced non-existent `active` variable | Phase 3 implementation | High | Fixed by extracting mention metadata from `active_agents.get(name)` | Closed |
| V-002 | Missing required test `test_config_default_mention_depth` | Phase 2 design / Phase 3 implementation | Medium | Added in commit `a1e047df6` | Closed |
| V-003 | UBS flagged `panic!` in test code as critical | Tooling false positive | Medium | Triaged as false positive; learning captured | Closed |
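
The V-001 fix, reading mention metadata from `active_agents.get(name)` instead of an undefined `active` variable, can be sketched as follows. The `ManagedAgent` field types and the `MentionMetadata` helper are assumptions; the report only states that `ManagedAgent` carries `mention_chain_id`, `mention_depth`, and `mention_parent_agent`.

```rust
use std::collections::HashMap;

/// Reduced stand-in for the runtime agent state; field types are assumed.
pub struct ManagedAgent {
    pub mention_chain_id: Option<String>,
    pub mention_depth: Option<u32>,
    pub mention_parent_agent: Option<String>,
}

/// Hypothetical helper holding the three fields added to `AgentRunRecord`.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct MentionMetadata {
    pub mention_chain_id: Option<String>,
    pub mention_depth: Option<u32>,
    pub mention_parent_agent: Option<String>,
}

/// Look the agent up by name; an unknown agent yields all-`None` metadata.
pub fn mention_metadata(
    active_agents: &HashMap<String, ManagedAgent>,
    name: &str,
) -> MentionMetadata {
    match active_agents.get(name) {
        Some(agent) => MentionMetadata {
            mention_chain_id: agent.mention_chain_id.clone(),
            mention_depth: agent.mention_depth,
            mention_parent_agent: agent.mention_parent_agent.clone(),
        },
        None => MentionMetadata::default(),
    }
}
```

Falling back to `None` for every field keeps records for cron-triggered or already-removed agents valid, which is consistent with the `None` values used in the existing tests.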

## Gate Checklist

- [x] UBS scan triaged with 0 real critical findings
- [x] Mention-chain public behaviours have unit tests
- [x] Critical crate tests pass
- [x] Targeted clippy passes
- [x] Workspace clippy passes
- [x] Traceability matrix completed for in-scope requirements
- [x] Defects found during verification were resolved
- [x] Implementation is ready for validation

## Approval

Verification completed by OpenCode based on repository evidence available in-session.
12 changes: 12 additions & 0 deletions crates/terraphim_orchestrator/src/agent_run_record.rs
@@ -177,6 +177,12 @@ pub struct AgentRunRecord {
    pub matched_patterns: Vec<String>,
    /// Classification confidence (0.0 - 1.0)
    pub confidence: f64,
    /// ULID identifying the mention chain (set when spawned via @adf: mention).
    pub mention_chain_id: Option<String>,
    /// Depth in the mention chain (0 = direct human mention).
    pub mention_depth: Option<u32>,
    /// Name of the parent agent that triggered this mention (empty for human).
    pub mention_parent_agent: Option<String>,
}

impl AgentRunRecord {
@@ -834,6 +840,9 @@ mod tests {
            trigger: RunTrigger::Cron,
            matched_patterns: vec!["timed out".to_string()],
            confidence: 0.95,
            mention_chain_id: None,
            mention_depth: None,
            mention_parent_agent: None,
        };
        let json = serde_json::to_string(&record).unwrap();
        let deserialized: AgentRunRecord = serde_json::from_str(&json).unwrap();
@@ -904,6 +913,9 @@ mod tests {
            trigger: RunTrigger::Cron,
            matched_patterns: vec!["timed out".to_string()],
            confidence: 0.9,
            mention_chain_id: None,
            mention_depth: None,
            mention_parent_agent: None,
        };

        store.insert(&record).await.unwrap();
10 changes: 10 additions & 0 deletions crates/terraphim_orchestrator/src/config.rs
@@ -286,6 +286,11 @@ pub struct MentionConfig {
    /// Max concurrent mention-spawned agents (default 5).
    #[serde(default = "default_max_concurrent_mention_agents")]
    pub max_concurrent_mention_agents: u32,
    /// Max mention chain nesting depth (default 3).
    /// Depth 0 = direct human mention, depth N = mention of mention.
    /// Set to 0 to disable all mention dispatch (depth 0 is then blocked too).
    #[serde(default = "default_max_mention_depth")]
    pub max_mention_depth: u32,
}

fn default_poll_modulo() -> u64 {
@@ -300,12 +305,17 @@ fn default_max_concurrent_mention_agents() -> u32 {
    5
}

fn default_max_mention_depth() -> u32 {
    crate::mention_chain::DEFAULT_MAX_MENTION_DEPTH
}

impl Default for MentionConfig {
    fn default() -> Self {
        Self {
            poll_modulo: default_poll_modulo(),
            max_dispatches_per_tick: default_max_dispatches_per_tick(),
            max_concurrent_mention_agents: default_max_concurrent_mention_agents(),
            max_mention_depth: default_max_mention_depth(),
        }
    }
}
6 changes: 6 additions & 0 deletions crates/terraphim_orchestrator/src/dispatcher.rs
@@ -48,6 +48,12 @@ pub enum DispatchTask {
        context: String,
        /// Project id this mention was detected in.
        project: String,
        /// ULID identifying this mention chain (same across nested mentions).
        chain_id: String,
        /// Current depth in the mention chain (0 = initial human mention).
        depth: u32,
        /// Name of the agent that triggered this mention (empty for human).
        parent_agent: String,
    },
/// PR review dispatch — triggers the automated review pipeline.
ReviewPr {
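
The three new `MentionDriven` fields suggest how a nested mention relates to its parent: the chain id is reused, the depth grows by one, and the mentioning agent becomes the parent. The sketch below illustrates that relationship only. `ChainLink` and its constructors are hypothetical names, ULID generation is left to the caller, and the real `resolve_mention_chain()` in `lib.rs` may work differently.

```rust
/// Hypothetical carrier for the three chain fields added to `DispatchTask::MentionDriven`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ChainLink {
    pub chain_id: String,
    pub depth: u32,
    pub parent_agent: String,
}

impl ChainLink {
    /// A direct human mention starts a chain: depth 0 and an empty parent.
    /// The caller supplies the chain id (a ULID in the real code).
    pub fn from_human(chain_id: &str) -> Self {
        Self {
            chain_id: chain_id.to_string(),
            depth: 0,
            parent_agent: String::new(),
        }
    }

    /// A mention written by `mentioning_agent` while it runs under `self`:
    /// same chain id, one level deeper, with that agent recorded as parent.
    pub fn nested(&self, mentioning_agent: &str) -> Self {
        Self {
            chain_id: self.chain_id.clone(),
            depth: self.depth.saturating_add(1),
            parent_agent: mentioning_agent.to_string(),
        }
    }
}
```

Because every nested link shares the root chain id, run records from the whole chain can be grouped by `mention_chain_id` afterwards.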