feat(0.9.1): Wire 1 — risk-sensitive action annotation (Stage 4) by dennys246 · Pull Request #257 · dennys246/Maxim

dennys246 · 2026-05-17T17:47:55Z

Summary

Wire 1 of release_0_9_1.md (Stage 4) — risk-sensitive action annotation. Substrate-acquired outcome variance reaches the LLM through experience-voice tool-description annotations. Hybrid bio-system + LLM design preserved; a pure substrate-primary pre-filter ranker is the post-1.0 cleaner path.
Two pre-merge two-lens reviews (executor + bio) raised 22 findings — folded in commit 2; cross-confirmed findings prioritised per feedback_cross_confirmed_review_findings.md.

Wires

NAc._event_outcome_welford: per-(agent_id, event_signature) Welford online variance over the binary reward signal. Updated once per outcome in _record_outcome_impl under self._lock. The plan originally placed variance on CausalLink; per-link variance is structurally 0 for binary outcomes because _generate_link_id keys on outcome_signature (which embeds valence) — so the accumulator moved up one level. New CLAUDE.md lesson "Key-embedded values produce structurally-degenerate statistics" generalises the pattern.
NAc.get_action_risk_profile(event_sig=None, *, agent_id, min_observations=5) returns {event_signature → variance} per agent. Empty agent_id raises ValueError (CC4 rule).
OutcomePrediction.uncertainty_interval populated in both _predict_impl and predict_all_outcomes via the shared NAc._uncertainty_for helper (single source of truth; no sibling-method silent-no-op). Sentinel (0.0, 0.0) on cold-start, missing agent_id, n<2, or variance=0.
agent_loop tool annotation hook runs after Wire 3. Felt-experience phrasing: (unpredictable from prior experience) / (reliable from prior experience) — distinct register from Wire 3's somatic (feels strained) / (feels weakened). Idempotent under repeated observations; strips stale annotation when variance drifts to the neutral band.
WIRE_1_ANNOTATION sim_log event mirrors Wire 3's WIRE_3_FILTER for Roy-3 measurability. Carries agent_id, high_variance_tools, reliable_tools, felt_phrases (exact LLM-visible strings), annotated_variances (numeric floats), and middle_band_variances (counterfactual — substrate-variance present but no annotation reached the prompt).
MAXIM_DISABLE_VARIANCE_ANNOTATION env-var ablation gate reuses Wire-A's canonical annotation_disabled_via_env parser. Conftest autouse scrub fixture clears it between tests.
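
The idempotent strip-then-annotate behaviour of the agent_loop hook can be sketched roughly as below. This is a minimal illustration, not the shipped code: the function name `annotate_description` and the band thresholds `HIGH_VARIANCE` / `LOW_VARIANCE` are assumptions (the PR does not state the real threshold values); only the two annotation phrases are taken from the description above.

```python
import re
from typing import Optional

# Hypothetical band thresholds; the PR does not state the real values.
HIGH_VARIANCE = 0.20
LOW_VARIANCE = 0.05

UNPREDICTABLE = "(unpredictable from prior experience)"
RELIABLE = "(reliable from prior experience)"

# Matches either Wire 1 phrase (plus leading whitespace) at the end of a description.
_WIRE1_SUFFIX = re.compile(
    r"\s*\((?:unpredictable|reliable) from prior experience\)\s*$"
)


def annotate_description(description: str, variance: Optional[float]) -> str:
    """Return the description carrying at most one Wire 1 suffix."""
    # Strip any earlier Wire 1 annotation first so repeated calls never stack.
    base = _WIRE1_SUFFIX.sub("", description)
    if variance is None:
        return base
    if variance >= HIGH_VARIANCE:
        return f"{base} {UNPREDICTABLE}"
    if variance <= LOW_VARIANCE:
        return f"{base} {RELIABLE}"
    # Neutral band: the stale annotation stays stripped.
    return base
```

Because the regex only targets the Wire 1 phrases, a Wire 3 suffix such as (feels strained) earlier in the description is left untouched, which is one way the two wires could coexist on the same tool description.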

Plan deviation: variance lives on NAc, not CausalLink

The plan placed variance_estimate on CausalLink. Implementation found per-link variance is structurally 0 for binary outcomes because _generate_link_id keys on outcome_signature which embeds valence — each (event_sig, outcome_valence) pair allocates a separate link, so the per-link reward stream is constant-valued. The fix moves variance one level up to NAc._event_outcome_welford keyed by (agent_id, event_signature), the level where the cross-outcome distribution actually lives. This is the root-cause architecture, not a workaround — per the no-band-aid rule. The new CLAUDE.md lesson generalises the pattern for bandit per-arm estimates, goal-conditioned success rates, and other shapes where the statistic accumulator's key embeds the dimension to vary over.
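
The degenerate-statistic shape can be reproduced in a few lines. The sketch below uses a made-up outcome stream and plain dictionaries rather than the real CausalLink or NAc types, and it uses population variance purely for illustration; it only shows why a key that embeds valence yields a constant-valued stream per key.

```python
from collections import defaultdict
from statistics import pvariance

# Hypothetical outcome stream for one tool: (event_signature, valence, reward).
outcomes = [
    ("tool:swing", "positive", 1.0),
    ("tool:swing", "negative", 0.0),
    ("tool:swing", "positive", 1.0),
    ("tool:swing", "negative", 0.0),
]

per_link = defaultdict(list)   # key embeds valence, like a link id
per_event = defaultdict(list)  # key is the event signature only

for event_sig, valence, reward in outcomes:
    per_link[(event_sig, valence)].append(reward)
    per_event[event_sig].append(reward)

# Each per-link stream is constant, so its variance is always 0.
link_variances = {key: pvariance(values) for key, values in per_link.items()}

# The per-event stream mixes successes and failures, so variance is informative.
event_variances = {key: pvariance(values) for key, values in per_event.items()}
```

Here every entry of `link_variances` is 0.0 while `event_variances["tool:swing"]` is 0.25, the maximum for an evenly split binary signal.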

Honest scope caveat (preserved)

Wire 1's behavioural effect goes through the LLM (it reads the annotations and adjusts). It is hybrid bio + LLM, not pure substrate-driven. A cleaner post-1.0 design adds a real risk-weighted action ranker that pre-filters tools before the LLM sees them. The hybrid version ships in 0.9.1 to keep scope tight. Roy-3's three-arm comparison will reveal whether the hybrid annotation carries enough substrate signal or whether the post-1.0 pre-filter ranker is needed.

The caveat is documented in: PR body (here), get_action_risk_profile docstring, OutcomePrediction.uncertainty_interval docstring, agent_loop annotation block comment, and the CausalLink class docstring. Cross-surface documentation per feedback_interim_contamination.md so the caveat cannot erode under refactor pressure.

Context-averaging thesis caveat (pre-merge fold)

The Welford accumulator key is (agent_id, event_signature) — NOT (agent_id, event_signature, context_hash). This averages variance across all contexts an agent has used a tool in. A substrate-faithful version would condition variance on context so a tool that is reliable against straw dummies but erratic against armored knights surfaces as two distinct entries the LLM reads separately. Wire 1 ships the averaged version to keep 0.9.1 scope tight; the context-conditioned version is post-1.0 cleanup if Roy-3 finds the averaged surface insufficient. The caveat is elevated to a thesis caveat (not just a scope caveat) in the accumulator's init docstring so a future refactor cannot silently entrench the averaging.
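
To make the trade-off concrete, the sketch below compares the averaged key against a context-conditioned key on invented data. The context names and reward lists are illustrative assumptions, and population variance stands in for whatever estimator the accumulator actually uses.

```python
from statistics import pvariance

# Hypothetical rewards for one (agent_id, event_signature) pair, split by context.
rewards_by_context = {
    "straw_dummy": [1.0, 1.0, 1.0, 1.0],
    "armored_knight": [1.0, 0.0, 1.0, 0.0],
}

# Averaged key (agent_id, event_signature): every context is pooled together.
pooled = [r for rewards in rewards_by_context.values() for r in rewards]
averaged_variance = pvariance(pooled)

# Context-conditioned key (agent_id, event_signature, context_hash):
# one variance per context.
conditioned_variance = {
    context: pvariance(rewards) for context, rewards in rewards_by_context.items()
}
```

The pooled figure (0.1875) sits between the two per-context values (0.0 and 0.25), so a single averaged entry can hide the fact that the tool is fully reliable in one context and maximally erratic in another.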

Welford correctness

The online algorithm is numerically stable: it avoids the catastrophic cancellation that the naive (Σx² − (Σx)²/n) / n formula suffers at low variance and high n. The accumulator fires exactly once per outcome in _record_outcome_impl, NOT once per eligibility-trace credit-distribution event (distribute_reward touches _reward_bias, not the Welford state). Zero-observation, n<2, and zero-variance cases return the (0.0, 0.0) uncertainty sentinel without divide-by-zero. Verified by 58 Wire 1 unit tests covering Welford correctness across N, fire-once-per-outcome, persistence round-trip, OutcomePrediction.uncertainty_interval helper parity, get_action_risk_profile, agent_loop annotation assembly, env ablation gate, observe()-skip divergence, and end-to-end pin.
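
For readers unfamiliar with the algorithm, a minimal Welford sketch with the sentinel handling described above might look like this. The state field names (`mean`, `m2`), the choice of the unbiased sample variance, and the one-standard-deviation interval width are all assumptions made for illustration; the PR only confirms that `n` is stored as a float and that the sentinel is (0.0, 0.0).

```python
import math
from typing import Dict, Optional, Tuple


def welford_update(state: Dict[str, float], x: float) -> None:
    """Fold one observation into {n, mean, m2} in place."""
    state["n"] += 1.0
    delta = x - state["mean"]
    state["mean"] += delta / state["n"]
    # Uses the updated mean; this is what keeps the recurrence stable.
    state["m2"] += delta * (x - state["mean"])


def sample_variance(state: Dict[str, float]) -> float:
    """Unbiased variance; 0.0 when fewer than two observations exist."""
    if state["n"] < 2:
        return 0.0
    return state["m2"] / (state["n"] - 1.0)


def uncertainty_interval(
    state: Optional[Dict[str, float]], predicted_value: float
) -> Tuple[float, float]:
    """(lower, upper) around the prediction, or the (0.0, 0.0) sentinel."""
    if state is None:  # cold start: no state for this (agent_id, event_sig)
        return (0.0, 0.0)
    variance = sample_variance(state)
    if variance == 0.0:  # covers n < 2 and constant-valued streams
        return (0.0, 0.0)
    half_width = math.sqrt(variance)  # assumption: one standard deviation
    return (predicted_value - half_width, predicted_value + half_width)
```

Because the update never forms Σx² or (Σx)² explicitly, there is no subtraction of two large, nearly equal quantities, which is the failure mode of the naive formula.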

Persistence

_event_outcome_welford round-trips through NAc dump() / load_state() with composite key joined by \x1f (ASCII unit separator). _NAC_FORMAT_VERSION bumped 1.1 → 1.2 (Wire 2 introduced 1.0→1.1). Backward-compat reader handles missing field on pre-Wire-1 dumps as empty dict; first new outcome bootstraps state cleanly. Corrupt entries are skipped without crashing load. Wire 2's version-pin tests updated to semver-style ratchet (>= 1.1) so future bumps don't regress them.
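
The round-trip contract can be sketched as follows. The helper names `dump_welford` / `load_welford` and the inner state fields are hypothetical; the \x1f separator, the `event_outcome_welford` top-level key, the empty-dict fallback, and the skip-on-corrupt behaviour follow the description above.

```python
from typing import Any, Dict, Tuple

_SEP = "\x1f"  # ASCII unit separator; safe for event signatures containing ":"

WelfordState = Dict[str, float]
Key = Tuple[str, str]  # (agent_id, event_signature)


def dump_welford(table: Dict[Key, WelfordState]) -> Dict[str, WelfordState]:
    """Flatten tuple keys into strings so the table is JSON-serialisable."""
    return {
        f"{agent_id}{_SEP}{event_sig}": dict(state)
        for (agent_id, event_sig), state in table.items()
    }


def load_welford(payload: Dict[str, Any]) -> Dict[Key, WelfordState]:
    """Rebuild the table; a missing field loads as empty, corrupt entries are skipped."""
    raw = payload.get("event_outcome_welford") or {}
    table: Dict[Key, WelfordState] = {}
    if not isinstance(raw, dict):
        return table
    for joined, state in raw.items():
        try:
            agent_id, event_sig = joined.split(_SEP, 1)
            table[(agent_id, event_sig)] = {
                "n": float(state["n"]),
                "mean": float(state["mean"]),
                "m2": float(state["m2"]),
            }
        except (AttributeError, KeyError, TypeError, ValueError):
            continue  # corrupt entry: skip it instead of failing the whole load
    return table
```

Splitting with a maximum of one split keeps the agent_id intact even if an event signature were ever to contain the separator itself.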

Validation

58 Wire 1 unit tests pass
Full fast suite: 6808 passed, 9 skipped, 40 deselected (initial run before fold)
Post-fold wires + persistence regression: 308 passed
mypy public API surface: clean
ruff format + lint: clean

Reference

Spec: docs/plans/bio_emergent_persona_foundations.md § Wire 1 and docs/plans/release_0_9_1.md § Stage 4
Prior wires in this release: #253 (Wire-A, cluster-bias annotation prompt section), #254 (Stages 0b + 0c, action JSONL telemetry + recommend_action emission), #255 (Wire 3, embodiment-state → tool filter), #256 (Wire 2, Pavlovian percept aversion)
Next: Roy-3 validation (Stage 5) closes 0.9.1

Test plan

58 Wire 1 unit tests pass
Wires + persistence regression suite passes (308 tests)
mypy public API clean
ruff format + lint clean
Full fast suite green post-fold (rerun in flight at time of PR open)

🤖 Generated with Claude Code

Lifted from docs/plans/bio_emergent_persona_foundations.md § Wire 1 and docs/plans/release_0_9_1.md § Stage 4. Substrate-acquired outcome variance reaches the LLM through felt-sensation tool-description annotations. The hybrid bio-system + LLM design is preserved (the post-1.0 cleaner pre-filter ranker is documented as the future direction).

## What ships

- NAc._event_outcome_welford: per-(agent_id, event_signature) Welford online variance state on the binary reward signal across outcomes. Updated once per outcome in _record_outcome_impl under self._lock. CC4 per-agent stash discipline (required agent_id derived from event_context["agent_id"]; outcomes without the tag silently skip the accumulator; documented contract + regression test).
- NAc.get_action_risk_profile(event_sig=None, *, agent_id, min_observations=5): returns {event_signature → variance} filtered by agent_id and min observations. Empty agent_id raises ValueError per the CC4 stash rule.
- OutcomePrediction.uncertainty_interval populated in _predict_impl from the NAc-level Welford state. Reserved field contract (PR #216) preserved: (lower, upper) tuple, sentinel (0.0, 0.0) on cold-start or when context lacks agent_id.
- agent_loop.py tool annotation hook runs after Wire 3's integrity-band annotation. Felt-sensation phrasing (feels unpredictable) / (feels predictable) matches Wire 3's register (Wire-A's bracketed style is reserved for its own prompt-section surface). Idempotent under repeated observations; strips stale annotation when variance drifts back into the neutral band.
- WIRE_1_ANNOTATION sim_log event mirrors Wire 3's WIRE_3_FILTER shape so Roy-3 can measure annotation effect without behavioral inference. Carries high_variance_tools / reliable_tools / annotated_variances for post-hoc analysis.
- MAXIM_DISABLE_VARIANCE_ANNOTATION env-var ablation gate mirrors Wire-A's pattern (default OFF / annotation ON). Conftest autouse scrub fixture in tests/conftest.py per feedback_opt_in_env_in_hot_paths.md.

## Plan deviation: variance lives on NAc, not CausalLink

The plan originally placed variance_estimate on CausalLink. During implementation we found _generate_link_id keys on outcome_signature (which embeds valence), so each (event_sig, outcome_valence) pair becomes a separate link. Per-link Welford variance on binary success/failure is then structurally 0, which is useless as the "is this tool reliable" signal Wire 1 needs. Moving the accumulator one level up to NAc._event_outcome_welford keyed by (agent_id, event_signature) captures the cross-link outcome heterogeneity that drives meaningful annotation. The deviation is documented in the CausalLink docstring and the test file's module docstring.

## Welford correctness

The online algorithm is numerically stable: no (Σ(x²) − (Σx)²/n) / n catastrophic cancellation at low-variance + high-n. The accumulator fires exactly once per outcome in _record_outcome_impl (NOT per eligibility-trace credit-distribution event; that path goes through distribute_reward, which touches _reward_bias, not the Welford state). Zero-observation and n < 2 return the (0.0, 0.0) uncertainty sentinel without divide-by-zero. All three pre-merge correctness concerns covered by unit tests.

## Honest scope caveat (preserved)

Wire 1's behavioral effect goes through the LLM (it reads the annotations and adjusts). It is hybrid bio-system + LLM, not pure substrate-driven. A cleaner post-1.0 design adds a real risk-weighted action ranker that pre-filters tools before the LLM sees them. The hybrid version ships in 0.9.1 to keep scope tight. Roy-3's three-arm results will reveal whether this hybrid wiring carries enough substrate signal or whether the post-1.0 pre-filter ranker is needed.

## Persistence

_event_outcome_welford round-trips through NAc dump() / load_state(). Composite key joined with \x1f (ASCII unit separator) to handle event_signatures containing :. Backward-compat: missing field on pre-0.9.1 dumps loads as empty dict; first new outcome bootstraps state cleanly. Corrupt entries are skipped without crashing load.

## Test coverage (49 tests)

- Welford correctness across N (one, two-identical, alternating, small-n naive cross-check, numerical stability at 1000 samples, monotone-grows-with-alternation).
- Fires once per outcome (record_outcome increment, distribute_reward no-op, no-agent_id skip).
- Persistence round-trip (state preserved, backward-compat empty load, corrupt entries skipped).
- OutcomePrediction.uncertainty_interval (cold-start sentinel, high variance widens, low variance narrows, no-agent_id sentinel).
- get_action_risk_profile (empty-agent_id ValueError, cold-start empty, below-threshold filter, threshold-inclusive, per-agent isolation on shared NAc, event_sig filter, agent_id-less links skipped).
- agent_loop annotation assembly (high/low/middle band, idempotency, band transition stripping, Wire 1 / Wire 3 coexistence, tool:use:* skipped).
- Env ablation gate parser shape.
- End-to-end pin: outcomes → get_action_risk_profile → band classification.

## Roy-3 measurability

WIRE_1_ANNOTATION sim_log events make post-hoc analysis structurally possible: they distinguish "annotation reached LLM" from "LLM ignored annotation" without behavioral inference. Roy-3 can count annotation-on vs annotation-off divergence on tool-family choice distributions via the new env-var gate.

## Fast-suite green

6808 passed, 9 skipped, 40 deselected (7m45s). Zero regressions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two parallel pre-merge reviews (architecture lens + bio-fidelity lens) raised 22 findings across 4 SEVERE + 10 SIGNIFICANT + 8 NIT bands. The cross-confirmed findings (called out by both lenses independently) landed first per `feedback_cross_confirmed_review_findings.md`:

- Honest scope caveat at consumer site: Bio S3 + Exec G6
- WIRE_1_ANNOTATION measurability gap: Bio S4 + Exec N1
- Multi-attribution / agent_id discipline: Bio S2 + Exec G6/G1/N4

## Architecture-lens folds

- **Exec S1**: Fixed orphan `variance_estimate` docstring references on `CausalLink` and `OutcomePrediction.uncertainty_interval`; both now point at `NAc._event_outcome_welford[(agent_id, event_sig)]`, consistent with the moved accumulator.
- **Exec S2**: Bumped `_NAC_FORMAT_VERSION` from "1.1" → "1.2". Wire 2 introduced 1.0→1.1 (percept_valences); Wire 1 adds the `event_outcome_welford` top-level key so the version ratchets forward per the CLAUDE.md "Persistence-format contract" rule. Backward-compat reader handles missing keys on 1.0 and 1.1 payloads as empty dicts. Updated `test_wire_2_percept_aversion.py` tests to pin `>= 1.1` (semver-style ratchet) rather than the exact 1.1 string so future bumps don't regress this test.
- **Exec S3 + G6**: Extracted `NAc._uncertainty_for(event_signature, predicted_value, context)` helper used by both `_predict_impl` AND `predict_all_outcomes`. The sibling-method silent-no-op (predict populated uncertainty_interval; predict_all_outcomes did not) is now structurally impossible. Helper covers the four sentinel conditions in one place: no agent_id in context, no Welford state for the pair, n < 2, variance == 0. Added regression tests.
- **Exec G2**: Replaced inlined `_WIRE1_TRUTHY` frozenset with a call to Wire-A's canonical `annotation_disabled_via_env` parser from `prompts/cluster_bias_annotation.py`. Single source of truth across 0.9.1's two annotation gates: a future change to the truthy set flows to both wires.
- **Exec G3**: Documented the `observe()` Welford-skip divergence in `NAc.observe()` docstring + added regression test `TestObservePathSkipsWelford` (two tests). The asymmetry is preserved by design (post-1.0 unification cleanup); pinning the contract surfaces the divergence at test time for any future consumer that relies on `get_action_risk_profile` after only calling `observe()`.
- **Exec G5**: Added concurrent-mutation safety note to the `_event_outcome_welford` init docstring. The dict value is mutated in-place under `self._lock`; a future refactor that replaces `state["n"] += 1.0` with `_event_outcome_welford[key] = state.copy()` would silently lose updates if the lock isn't held across the full read-modify-write. The docstring now names the trap.
- **Exec G6** (covered by S3 helper).
- **Exec N1**: Enriched `WIRE_1_ANNOTATION` sim_log payload with `agent_id` (multi-agent attribution), `felt_phrases` (exact strings the LLM saw), and `middle_band_variances` (counterfactual surface for Roy-3 ablation analysis: tools that have substrate variance but fell in the no-annotation band).
- **Exec N3**: Added exclusive-boundary test for `min_observations` (n=4 returns empty profile under default min=5).
- **Exec N4**: Added agent_id silently-skip note to `record_outcome` docstring.
- **Exec N5**: Documented lifetime-cumulative behaviour and the per-tick decay hook location in `_event_outcome_welford` init.
- **Exec N6**: Added `test_multiple_tool_use_compound_signatures_skipped` test pinning the skip loop's behaviour with two tool:use:X entries.

## Bio-fidelity-lens folds

- **Bio S1 (SEVERE, phrasing register)**: Shifted Wire 1 phrasing from "(feels unpredictable)" / "(feels predictable)" to **"(unpredictable from prior experience)"** / **"(reliable from prior experience)"**. The original "feels X" phrasing collapsed Wire 1's metacognitive signal into Wire 3's somatic register: both surfaces used the same "feels X" stem with the same parenthesization, so the LLM could not separate "I will fail because the body is broken" (Wire 3, proprioceptive) from "I will fail because the outcome is stochastic" (Wire 1, experience-acquired). The new experience-voice phrasing aligns with Wire-A's "[... from prior experience]" register, keeping the two experience-acquired signals coherent across wires while Wire 3 owns the somatic surface alone. Updated regex, constants, annotation block, and all relevant tests.
- **Bio S2 (SEVERE, context-grain caveat)**: Elevated the context-averaged variance trade-off from a scope caveat to a **THESIS CAVEAT** in the `_event_outcome_welford` init docstring. The key is `(agent_id, event_signature)`, NOT `(agent_id, event_signature, context_hash)`. The substrate-faithful version would condition variance on context so a tool that's reliable against straw dummies but erratic against armored knights surfaces as two distinct entries the LLM can read separately. Wire 1 ships the averaged version to keep 0.9.1 scope tight; the context-conditioned version is post-1.0 cleanup if Roy-3 finds the averaged surface insufficient. Caveat is now load-bearing in the docstring so a future refactor cannot silently entrench the averaging.
- **Bio S3**: Added scope caveat paragraph to `get_action_risk_profile` docstring AND `OutcomePrediction.uncertainty_interval` docstring naming the hybrid bio + LLM design. Future readers inspecting either surface alone will see the caveat (previously documented only in the commit body, where it could erode under refactor pressure per `feedback_interim_contamination.md`).
- **Bio S4**: WIRE_1_ANNOTATION payload now carries `felt_phrases` (the LLM-visible text per tool) and `middle_band_variances` (the counterfactual: tools with substrate variance but no annotation). Roy-3 post-hoc analysis can now distinguish (a) substrate produced variance, (b) annotation reached prompt, (c) LLM chose differently, from each other without behavioural inference.
- **Bio S6**: Reframed the CausalLink plan-deviation docstring from apology to architectural decision. The moved accumulator IS the root-cause fix per the no-band-aid rule; the new wording leads with the architecture, not the deviation, and references the new CLAUDE.md lesson.
- **Bio N1**: Added "Key-embedded values produce structurally-degenerate statistics" lesson to CLAUDE.md `## Lessons learned`. The rule generalises: if your statistic accumulator's key embeds the dimension you want to vary over, the per-key statistic is structurally 0. Applies beyond variance to bandit per-arm reward estimates, goal-conditioned success rates, etc. Cites this Wire 1 finding alongside the existing `_context_similarity` denominator lesson, which belongs to the same family of silent degenerate-statistic shapes.

## Findings explicitly deferred to post-1.0

- **Bio S5**: NAc state taxonomy (first-moment vs second-moment). Conceptual refactor; out of 0.9.1 scope.
- **Bio N2**: Single `felt_suffix.py` module for shared Wire-3 / Wire-1 regex composition. Cosmetic; flagged in agent_loop.py comment for post-1.0 cleanup.
- **Exec G1**: Multi-attribution variance asymmetry. Production callers ship one event per outcome; the punt is documented in the accumulator init docstring. Pre-emptive fix awaits a real multi-attribution caller.
- **Exec G4**: Integration test through PredictionContext (planner surface). Added a basic pin via `test_predict_helper_with_prediction_context`; the deeper integration awaits the planner-side consumer.
- **Exec N2**: `state["n"]` float-typing rationale. Documented accurately ("uniform numeric type avoids isinstance branching in load_state") in the init docstring.

## Validation

- 58 Wire 1 tests pass (+ 9 from this fold round).
- Full Wires + persistence regression suite: 308 tests pass.
- mypy public API surface: clean.
- ruff format + lint: clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

dennys246 and others added 2 commits May 17, 2026 11:21

dennys246 merged commit 6610566 into main May 17, 2026
5 checks passed

dennys246 deleted the feat/0-9-1-wire-1-risk-annotation branch May 17, 2026 17:59

dennys246 mentioned this pull request May 23, 2026

roy(0.9.1): Roy-3 — annotation-pattern validation (Stage 5) #258

Merged

7 tasks
