Add plan-regression and parameter-sensitivity triage detectors (Lite) #975

Open
erikdarlingdata wants to merge 1 commit into dev from feature/plan-regression-param-sensitivity-triage

@erikdarlingdata
Owner

Summary

Adds two automated-triage detectors to the Lite analysis engine, closing a gap where parameter sensitivity and plan regression were not detected at all — even though every column needed was already collected.

  • PARAMETER_SENSITIVITY — detects a single cached plan whose per-execution worker time varies wildly (classic parameter sniffing), sourced from v_query_stats.
  • PLAN_REGRESSION — detects a query whose current plan is materially worse than a better plan it previously used, sourced from Query Store (v_query_store_stats).
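
As a rough illustration of the two checks described above, here is a minimal Python sketch. It is not the code in this PR (the Lite engine is a .NET project), and everything concrete in it is an assumption made for the example: the field names are hypothetical stand-ins rather than the actual v_query_stats or v_query_store_stats columns, and the thresholds (minimum executions, 100x spread, 2x regression) are placeholders, since the description does not state the real rules.

```python
from dataclasses import dataclass


@dataclass
class CachedPlanStats:
    """One row per cached plan. Field names are hypothetical, not the real view schema."""
    query_hash: str
    execution_count: int
    min_worker_time_us: int
    max_worker_time_us: int


@dataclass
class PlanRuntime:
    """One row per (query, plan) from Query Store. Field names are hypothetical."""
    query_id: int
    plan_id: int
    avg_cpu_time_us: float
    is_current: bool


def is_parameter_sensitive(row, min_executions=10, min_ratio=100.0):
    """Flag a single plan whose per-execution worker time spans a wide range."""
    if row.execution_count < min_executions:
        return False  # too few executions to trust the spread
    floor = max(row.min_worker_time_us, 1)  # guard against dividing by zero
    return row.max_worker_time_us / floor >= min_ratio


def regression_ratio(plans):
    """Return current-plan cost divided by the best earlier plan's cost, or None."""
    current = [p for p in plans if p.is_current]
    previous = [p for p in plans if not p.is_current]
    if len(current) != 1 or not previous:
        return None  # need exactly one current plan and at least one earlier plan
    best = min(p.avg_cpu_time_us for p in previous)
    if best <= 0:
        return None  # no meaningful baseline to compare against
    return current[0].avg_cpu_time_us / best


def is_plan_regression(plans, min_ratio=2.0):
    """Flag a query whose current plan is materially slower than a previous plan."""
    ratio = regression_ratio(plans)
    return ratio is not None and ratio >= min_ratio
```

The key distinction the sketch tries to capture: parameter sensitivity looks at the spread within one plan, while plan regression compares the current plan against a different, better plan the same query used earlier.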

Both emit one aggregate fact scored by magnitude (a lone catastrophic offender scores high on its own), join the inference graph with forward + reverse edges so CPU/memory stories reach them as leaf causes, and have drill-down enrichment plus TestDataSeeder scenarios.
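
To make "scored by magnitude" and "forward + reverse edges" concrete, the following Python sketch shows one possible reading. It is an assumption-laden illustration, not the engine's implementation: the log-scale formula, the `cap` value, the `InferenceGraph` class, and the node name `CPU_STORY` are all hypothetical. Only the fact name PLAN_REGRESSION comes from this PR.

```python
import math
from collections import defaultdict


def magnitude_score(ratios, cap=1000.0):
    """Score one aggregate fact from the worst offender, on a log scale in [0, 1].

    Using max() rather than a count or a mean means a single catastrophic
    offender scores high even when no other query is affected.
    """
    if not ratios:
        return 0.0
    worst = max(ratios)
    if worst <= 1.0:
        return 0.0  # nothing is worse than its baseline
    return min(1.0, math.log10(worst) / math.log10(cap))


class InferenceGraph:
    """Tiny directed graph; a hypothetical stand-in for the engine's inference graph."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add_edge(self, source, target):
        self.edges[source].add(target)

    def add_forward_and_reverse(self, a, b):
        """Register both directions so traversal can start from either fact."""
        self.add_edge(a, b)
        self.add_edge(b, a)

    def reachable(self, start):
        """Return every node reachable from start, excluding start itself."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in self.edges.get(node, ()):
                if nxt != start and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen
```

Under this reading, a story that starts at a CPU symptom can walk to PLAN_REGRESSION as a cause, and a story that starts at the detector can walk back to the symptom it explains.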

Purely additive — new switch arms, new AddEdge calls, new collector methods. No existing code modified, no schema changes.

This is stage 1 of 2 (detect + triage). Wiring AnalysisFindings into the notification channels is the committed follow-on workstream.

Plan & review

Designed in a plan file and revised across two adversarial code-grounded review rounds (3 blocking fixes in round 1, 0 blocking in round 2).

Test plan

  • dotnet build Lite: clean, 0 warnings / 0 errors.
  • dotnet test Lite.Tests: 266/266 pass (260 pre-existing, no regressions; 6 new end-to-end scenario tests covering both detectors through the full pipeline).

🤖 Generated with Claude Code

Adds two automated-triage facts to the Lite analysis engine:

- PARAMETER_SENSITIVITY: detects a single cached plan whose per-execution
  worker time varies wildly (classic parameter sniffing), sourced from
  v_query_stats.
- PLAN_REGRESSION: detects a query whose current plan is materially worse
  than a better plan it previously used, sourced from Query Store
  (v_query_store_stats).

Both emit one aggregate fact scored by magnitude (a lone catastrophic
offender scores high on its own), join the inference graph with forward
and reverse edges so CPU/memory stories reach them as leaf causes, and
have drill-down enrichment plus TestDataSeeder scenarios.

Stage 1 of 2 (detect + triage). Wiring AnalysisFindings into the
notification channels is the committed follow-on workstream.

266/266 Lite tests pass (260 existing + 6 new scenario tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>