aloekun · May 11, 2026
diff --git a/docs/local-llm-offload-analysis.md b/docs/local-llm-offload-analysis.md
@@ -209,10 +209,21 @@ cargo test -p cli-finding-classifier --test lint_screen_evals -- \
 
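 After implementing Phase A, we replayed PR #141 (P-3, a 187-line mixed diff) and **confirmed on a real machine that `prompt_eval_count: 8192 (vs num_ctx: 8192)` reaches 100%**. **The root cause is confirmed as num_ctx truncation.** The mistral prompt was truncated at the context cap, so the JSON output was never completed, which produced the missing `screen_decision` field. Of the two candidate hypotheses (num_ctx truncation / mistral output collapse), the former is decisively the root cause.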
 
-##### 🔧 Phase C: Root cause fix (1 PR, XS-S, the next action): C-1 route confirmed
+##### 🔧 Phase C: Root cause fix ✅ **Done (this PR, 2026-05-11)**
 
-- **C-1 (confirmed route)**: increase `DEFAULT_NUM_CTX = 8192 → 16384 (or 32768)`. The theoretical max of mistral:7b is 32K; with this repo's prompt size (~4-5K) plus a large diff (~3-4K), 16384 is reasonable including a safety margin. Confirm via a RAM impact assessment, a regression test of the lint-screen evals (whether the 15 fixtures keep passing), and a smoke dogfood showing that the fallback rate drops
-- **(For reference, not adopted)** ~~C-2 (if caused by mistral output collapse)~~: out of scope because Phase B confirmed num_ctx
+`DEFAULT_NUM_CTX = 8192 → 16384` (initial) → 100% overflow was observed again even at 16384 → increased again to `DEFAULT_NUM_CTX = 16384 → 32768` (mistral:7b theoretical max). All 17 `lib_ollama_client` lint tests pass and all 20 `cli-finding-classifier` evals pass.
+
+**Phase C smoke dogfood** (replaying 3 PRs at 32768):
+
+| PR | Lines | Latency | Old (8192) | **New (32768)** |
+|---|---|---|---|---|
+| P-1 (#139) | 414 | 48s | fallback (truncation) | ✅ `auto_fix` (real classification) |
+| P-2 (#140) | 275 | 50s | fallback (truncation) | ⚠️ fallback (`invalid severity: error` = contract violation; a separate mistral semantic-accuracy issue, not num_ctx) |
+| P-3 (#141) | 487 | 55s | fallback (truncation) | ✅ `auto_fix` (real classification) |
+
+- **Fallbacks caused by num_ctx truncation: 3/3 → 0/3 (100% resolved)** ← Phase C's main goal achieved
+- **Overall fallback rate: 3/3 (100%) → 1/3 (33%)** ← **meets the Phase D criterion (<50%)**
+- The remaining 1/3 is a contract violation in mistral's output (a semantic-accuracy issue explainable by the 75% Phase b' agreement; to be handled in a separate phase, out of Phase D scope)
 
 ##### ✅ Phase D: Clean dogfood validation (2-3 regular PRs, via the real pipeline)
 

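The fallback accounting in the Phase C table above (truncation fallbacks 3/3 → 0/3, overall 3/3 → 1/3, Phase D criterion <50%) can be sketched as a small tally. This is a minimal illustration, not code from the repository: the `Outcome` enum and all function names are hypothetical, and only the counts and the 50% criterion come from the analysis doc.

```rust
/// Result of one lint-screen replay.
/// Hypothetical type for illustration; not taken from the repository.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    /// The model produced a real classification (e.g. `auto_fix`).
    Classified,
    /// Fallback because the prompt was truncated at the context cap.
    FallbackTruncation,
    /// Fallback because the output violated the contract
    /// (e.g. `invalid severity: error`).
    FallbackContractViolation,
}

/// Fraction of runs that ended in any kind of fallback (0.0 for no runs).
pub fn fallback_rate(outcomes: &[Outcome]) -> f64 {
    if outcomes.is_empty() {
        return 0.0;
    }
    let fallbacks = outcomes
        .iter()
        .filter(|o| **o != Outcome::Classified)
        .count();
    fallbacks as f64 / outcomes.len() as f64
}

/// Number of runs whose fallback was caused by context truncation.
pub fn truncation_fallbacks(outcomes: &[Outcome]) -> usize {
    outcomes
        .iter()
        .filter(|o| **o == Outcome::FallbackTruncation)
        .count()
}

/// Phase D criterion as stated in the doc: overall fallback rate below 50%.
pub fn meets_phase_d_criterion(outcomes: &[Outcome]) -> bool {
    fallback_rate(outcomes) < 0.5
}
```

With the three replays from the table, the 8192 runs are three `FallbackTruncation` values and the 32768 runs are `Classified`, `FallbackContractViolation`, `Classified`. Keeping the cause in the enum separates the "truncation resolved" claim from the overall rate.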
diff --git a/docs/todo-summary.md b/docs/todo-summary.md
@@ -75,6 +75,11 @@
 | 106 | 🔧 Tier 2 | **Add a `path.exists()` guard + extensions assertion to the self-exclusion test (adopted from PR #141 T2-#1)** | todo6.md | S | None (false-green guard for the `no_ephemeral_todo_self_exclusion_invariant_holds_on_deployed_toml` test landed in PR #141; prevents the silent pass where `run_custom_rules` returns an empty Vec when the path is missing, and prevents silent degradation when "toml" is removed from the extensions list; Medium Severity, flagged independently by 3 sources) |
 | 107 | 💎 Tier 3 | **Reinforce `development-workflow.md` with the PR #125→#141 anti-pattern case (adopted from PR #141 T3-#2)** | todo6.md | XS | None (add the case "todo.md entry lingered for N days after merge → discovered manually in a later phase" to the "task completion deletion procedure" in `~/.claude/rules/common/development-workflow.md`; reflect memory `feedback_verify_task_not_already_done` in the central rule as well; together with `feedback_todo_no_history`, emphasize the "merge → delete immediately" cycle) |
 | 108 | 💎 Tier 3 | **Add a "Tier 2 disguise detection + rejection patterns" table to CLAUDE.md (adopted from PR #141 T3-#3)** | todo6.md | S | None (publish the policy of memory `feedback_no_unenforced_rules` as a user-visible table in `~/.claude/CLAUDE.md`; a structure that lets new sessions consistently reject mandatory-rule proposals labeled as Tier 2; compensates for memory files being closed to the user) |
+| 109 | 🔧 Tier 2 | **Smoke test for ingesting cli-finding-classifier stderr into `.takt/lint-screen-report.md` via push-runner (adopted from PR #142 T2-#1)** | todo6.md | M | None (whether the Phase A diagnostic log becomes visible via the real pipeline is unverified; prerequisite for the Phase C/D dogfood; Severity High; cover the stderr handling path in `src/cli-push-runner/src/stages/lint_screen.rs` with a test) |
+| 110 | 💎 Tier 3 | **Add a pure function test pattern template to `testing.md` (adopted from PR #142 T2-#3)** | todo6.md | S | None (using Phase A's `overflow_hint()` as the model example, add a test template for the 3 patterns "boundary value / None / below threshold" to `~/.claude/rules/common/testing.md`; promotes side-effect isolation; reusable across Rust libs in general) |
+| 111 | 💎 Tier 3 | **Document the todo5/todo6 routing rule in `docs-governance.md` (adopted from PR #142 T3-#1)** | todo6.md | S | None (add routing rules such as Phase/bundle-related → todo6 and global rules/lint → todo5 to `~/.claude/rules/common/docs-governance.md`; structural prevention of the file pointer bifurcation demonstrated in PR #142; same root as CR Minor #2) |
+| 112 | 💎 Tier 3 | **Add eprintln scope + 90% threshold rationale to ADR-038 (adopted from PR #142 T3-#3)** | todo6.md | XS | None ((a) eprintln assumes a CLI; extending to a lib requires migrating to structured logging; (b) the 90% threshold is a conservative setting to be tuned based on Phase C/D dogfood data; prevents premature changes without evidence) |
+| 113 | 💎 Tier 3 | **Add metrics override decision criteria to ADR-027 (adopted from PR #142 T3-#4)** | todo6.md | XS | None (codify in ADR-027 the line between incidental change = PR side effect / cargo fmt formatting and responsibility change = the fix itself, plus the override notation format; keeps simplicity-review operation consistent) |
 
 **Strategy**: Clear Tier 1 in 2-3 sessions → in Tier 2, advance the ADR-032 prerequisites + rate-limit + convergence cost reduction → in Tier 3, land ADR-032 + tidy up the documentation. Tier 4-5 are cleanup / external rollout, with little direct effect on daily efficiency.
 

diff --git a/docs/todo6.md b/docs/todo6.md
@@ -578,3 +578,102 @@ config.rs + push-runner-config.toml + review-simplicity.md + ADR で family_tag
 - A new-session AI can look up the Tier 2 disguise judgment in reverse via the path CLAUDE.md → link → table
 - No `feedback_claude_md_link_only` violation
 
+---
+
+### Smoke test for ingesting cli-finding-classifier stderr into `.takt/lint-screen-report.md` via push-runner (adopted from PR #142 T2-#1)
+
+> **Motivation**: The diagnostic warn log implemented in Phase A (PR #142) appears on stderr when invoked manually, but via the real pipeline (`src/cli-push-runner/src/stages/lint_screen.rs:147-151`) it is included in the Err message only when the classifier exits non-zero, so it is discarded on exit 0 (graceful fallback). This is a prerequisite for Phase C/D dogfood validation.
+>
+> **Position of this task**: Adopted from PR #142 post-merge-feedback Tier 2 #1 (Severity High / Frequency Medium / Effort M / Adoption Risk: takt test infra not yet investigated).
+>
+> **References**: `.claude/feedback-reports/142.md` Tier 2 #1, `src/cli-push-runner/src/stages/lint_screen.rs`, PR #142 PR body OBS-2
+
+#### Work plan
+
+- [ ] Rework the stderr handling path in lint_screen.rs so that stderr is copied into the `## Diagnostic` section of `.takt/lint-screen-report.md` even on graceful fallback (exit 0)
+- [ ] Smoke test (new `src/cli-push-runner/tests/` or `#[cfg(test)]` integration): with a stub Ollama returning a truncated response, assert that the stderr warn log is ingested into the report
+- [ ] Confirm manually with a real Ollama dogfood
+- [ ] Delete this entry + delete the todo-summary.md row
+
+#### Completion criteria
+
+- In the Phase C/D dogfood, the warn log is visible in `.takt/lint-screen-report.md` even via the real pipeline
+
+---
+
+### Add a pure function test pattern template to `testing.md` (adopted from PR #142 T2-#3)
+
+> **Motivation**: `overflow_hint()` from Phase A (PR #142) is a side-effect-free pure function, structured so that it can be tested with 3 patterns: boundary value (90%) / None (missing metadata) / below threshold (under 90%). Turning this pattern into a template in `~/.claude/rules/common/testing.md` promotes side-effect isolation and testability across Rust libs in general.
+>
+> **Position of this task**: Adopted from PR #142 post-merge-feedback Tier 2 #3 (Severity Low / Frequency Medium / Effort S / Adoption Risk None).
+>
+> **References**: `.claude/feedback-reports/142.md` Tier 2 #3, `~/.claude/rules/common/testing.md`, `overflow_hint()` in `src/lib-ollama-client/src/lib.rs` (PR #142)
+
+#### Work plan
+
+- [ ] Add a "Pure function test pattern" section to `~/.claude/rules/common/testing.md` (examples of the 3 patterns: boundary value / None / below threshold)
+- [ ] Cite `overflow_hint()` (PR #142) as the model example
+- [ ] Delete this entry + delete the todo-summary.md row
+
+#### Completion criteria
+
+- The template is recorded in testing.md and can be referenced the next time side effects are isolated in a Rust lib
+
+---
+
+### Document the todo5/todo6 routing rule in `docs-governance.md` (adopted from PR #142 T3-#1)
+
+> **Motivation**: In PR #142, CR Minor #2 flagged a bifurcation ("todo-summary.md ranks 106-108 point to todo5.md, but the intro policy says todo6.md"), which was fixed within that PR. However, the routing rule is not documented, so the same kind of bifurcation risks recurring. Documenting in docs-governance.md the routing rule "new details go to todo6.md" plus the policy for exceeding 50KB provides structural prevention.
+>
+> **Position of this task**: Adopted from PR #142 post-merge-feedback Tier 3 #1 (Severity Low / Frequency Medium / Effort S / Adoption Risk None).
+>
+> **References**: `.claude/feedback-reports/142.md` Tier 3 #1, `~/.claude/rules/common/docs-governance.md`, PR #142 CR Minor #2
+
+#### Work plan
+
+- [ ] Add a "routing rule for new todo*.md details" section to `~/.claude/rules/common/docs-governance.md`: new details go to the latest todoN.md (currently todo6.md); when it exceeds 50KB, create todo(N+1).md
+- [ ] Check consistency with the preambles of the existing todo*.md files (whether the rule contradicts the opening text of todo5.md / todo6.md)
+- [ ] Delete this entry + delete the todo-summary.md row
+
+#### Completion criteria
+
+- The next time a todo*.md exceeds 50KB, the routing decision is clear, and a bifurcation of the same kind as CR Minor #2 does not recur
+
+---
+
+### Add eprintln scope + 90% threshold rationale to ADR-038 (adopted from PR #142 T3-#3)
+
+> **Motivation**: The diagnostic log implemented in PR #142 (a) writes to stderr via `eprintln!`, a design that assumes a CLI; if the lib is called from another process, migration to structured logging (log/tracing) is required. (b) The 90% threshold is a conservative setting and should be tuned based on Phase C/D dogfood data. Unless both points are added to ADR-038, they invite premature changes without evidence and design drift in the future.
+>
+> **Position of this task**: Adopted from PR #142 post-merge-feedback Tier 3 #3 (Severity Low / Frequency Low / Effort XS / Adoption Risk None).
+>
+> **References**: `.claude/feedback-reports/142.md` Tier 3 #3, `docs/adr/adr-038-local-llm-finding-classification.md`
+
+#### Work plan
+
+- [ ] Add 2 points to ADR-038: (a) eprintln scope / conditions for migrating to structured logging, (b) 90% threshold rationale + tuning policy after the Phase C/D dogfood
+- [ ] Delete this entry + delete the todo-summary.md row
+
+#### Completion criteria
+
+- The 2 points are recorded in ADR-038 as a permanent record and serve as a decision prior for future lib extension / threshold changes
+
+---
+
+### Add metrics override decision criteria to ADR-027 (adopted from PR #142 T3-#4)
+
+> **Motivation**: In the simplicity-review of PR #142, a 1-2 line diff increase from `cargo fmt` formatting became subject to a metrics override judgment, and the line between incidental change (PR side effects / cargo fmt, etc.) and responsibility change (the fix itself) is unclear. Codifying the decision criteria and the override notation format in ADR-027 ensures consistency.
+>
+> **Position of this task**: Adopted from PR #142 post-merge-feedback Tier 3 #4 (Severity Low / Frequency Low / Effort XS / Adoption Risk None).
+>
+> **References**: `.claude/feedback-reports/142.md` Tier 3 #4, `docs/adr/adr-027-push-review-simplicity-focus.md`
+
+#### Work plan
+
+- [ ] Add a "metrics override decision criteria" section to ADR-027: the line between incidental vs responsibility + an example of the override notation format
+- [ ] Delete this entry + delete the todo-summary.md row
+
+#### Completion criteria
+
+- Consistency and transparency of override decisions are ensured in simplicity-review operation
+
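The "boundary value / None / below threshold" test pattern described in the `testing.md` entry above can be sketched as follows. This is a minimal sketch under stated assumptions: the diff does not show the signature or body of the real `overflow_hint()`, so `context_usage_hint`, its parameters, and its message text are hypothetical stand-ins. Only the 90% threshold and the three test patterns come from the todo entry.

```rust
/// Percentage of the context window at or above which a hint is emitted.
/// 90 mirrors the threshold named in the todo entry.
pub const OVERFLOW_THRESHOLD_PERCENT: u64 = 90;

/// Hypothetical stand-in for `overflow_hint()` (the real signature is not shown
/// in this diff). Pure function: no I/O, the result depends only on the inputs.
///
/// Returns `None` when the metadata is missing, when `num_ctx` is zero, or when
/// usage is below the threshold.
pub fn context_usage_hint(prompt_eval_count: Option<u32>, num_ctx: u32) -> Option<String> {
    let used = u64::from(prompt_eval_count?);
    let cap = u64::from(num_ctx);
    if cap == 0 {
        return None;
    }
    // Integer comparison avoids floating-point rounding at the exact boundary.
    if used * 100 >= cap * OVERFLOW_THRESHOLD_PERCENT {
        let percent = used * 100 / cap;
        Some(format!(
            "prompt_eval_count {used} is {percent}% of num_ctx {cap}; the prompt may have been truncated"
        ))
    } else {
        None
    }
}

// Pattern 1: boundary value (exactly 90% must trigger the hint).
#[test]
fn hint_at_exact_threshold() {
    assert!(context_usage_hint(Some(900), 1000).is_some());
}

// Pattern 2: None (missing metadata must not produce a hint).
#[test]
fn no_hint_when_metadata_missing() {
    assert!(context_usage_hint(None, 1000).is_none());
}

// Pattern 3: below threshold (just under 90% must not trigger the hint).
#[test]
fn no_hint_just_below_threshold() {
    assert!(context_usage_hint(Some(899), 1000).is_none());
}
```

Because the function only returns a value, the caller decides whether to print it with `eprintln!` or route it to structured logging, which is the side-effect separation the entry aims for.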
diff --git a/push-runner-config.toml b/push-runner-config.toml
@@ -5,9 +5,10 @@
 
 [quality_gate]
 parallel = true
-# step_timeout = 180s (expanded because `cargo test -- --ignored` in the Phase b' end-to-end test
-# hits the 120s boundary with 12 invocations of mistral:7b; demonstrated in PR #132)
-step_timeout = 180
+# step_timeout history: 120s (initial Phase b) → 180s (PR #132; 12 mistral invocations in Phase b' hit the 120s boundary) →
+# 600s (Phase C; raising num_ctx 8192 → 32768 increased per-invocation latency from 5-20s to 30-90s, and `cargo test -- --ignored`
+# took 269s locally overall, exceeding 180s. Setting 32768 resolves the context overflow, but per-invocation latency is the cost)
+step_timeout = 600
 
 [[quality_gate.groups]]
 name = "lint"

diff --git a/src/lib-ollama-client/src/lib.rs b/src/lib-ollama-client/src/lib.rs
@@ -126,10 +126,17 @@ fn emit_overflow_diagnostic(parse_error: &serde_json::Error, raw: &str, metadata
 }
 
 /// Ollama's default `num_ctx` (2048) is insufficient for this repository's lint-screen prompt
-/// (~4000-5000 tokens), so the prompt is silently truncated
-/// (the PR #135 dogfood demonstrated hitting the `prompt_eval_count: 4096` limit on eval13/15).
-/// mistral:7b theoretically supports up to 32K, but 8192 is the default as a trade-off between safety margin and inference cost.
-pub const DEFAULT_NUM_CTX: u32 = 8192;
+/// (~7700 chars = ~3000 tokens) + diff (24KB+ = ~10K+ tokens in real PRs),
+/// so the prompt is silently truncated (on overflow, Ollama reports prompt_eval_count clamped to num_ctx).
+///
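+/// Dogfood progression:
+/// - 2048 (Ollama default) → evaluation not possible; raised to 8192 in PR #135
+/// - 8192 → PR #141 (P-3) confirmed `prompt_eval_count: 8192` = 100% reached
+/// - **16384** → raised after PR #142 (Phase A diagnostic log implementation) + Phase B root-cause confirmation, but the dogfood re-measurement still reached 100%
+/// - **32768 (current value)** → theoretical max of mistral:7b; raised in Phase C
+///
+/// If the prompt still overflows at 32768, proceed to the next Phase, which implements diff truncation on the classifier side.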
+pub const DEFAULT_NUM_CTX: u32 = 32768;
 
 /// Ollama client configuration
 #[derive(Debug, Clone)]
@@ -436,7 +443,7 @@ mod tests {
         let mock = server
             .mock("POST", "/api/generate")
             .match_body(mockito::Matcher::PartialJsonString(
-                r#"{"options":{"num_ctx":8192}}"#.to_string(),
+                r#"{"options":{"num_ctx":32768}}"#.to_string(),
             ))
             .with_status(200)
             .with_header("content-type", "application/json")