Enhanced measurement by kenjudy · Pull Request #16 · stride-nyc/code-quality-metrics

kenjudy · 2026-03-26T21:22:37Z

No description provided.

Implements DORA capability coverage for version control practices and small-batch working via zero-dependency hand-rolled statistics:

- computeStatistics(): quantile (p50/p90/p95), stddev, linear regression trend (growing/stable/shrinking), outlier detection (2σ)
- computeVelocity(): commits/day, velocity trend (accelerating/stable/decelerating) via first-half vs second-half rate comparison
- scoreMessageQuality(): conventional commit regex OR ≥10-word threshold
- classifyDoraArchetype(): priority-ordered team archetype classification (harmonious-high-achiever, foundational-challenges, legacy-bottleneck, mixed-signals)

Adds 4 new CONFIG keys: MESSAGE_QUALITY_MIN_WORDS, AI_ANALYSIS_MAX_COMMITS, AI_DIFF_MAX_CHARS, AI_RISK_ADDITIONS_RATIO (groundwork for D3 Claude integration).

Summary JSON gains 14 new fields: p50/p90/p95/stddev_lines_changed, p50/p90_files_changed, commit_size_trend, velocity_commits_per_day, velocity_trend, additions_ratio_median/p90, message_quality_pct, dora_archetype.

Updates measuring-ai-code-drift-using-github-metrics.md as attributed blog article with DORA 2025 AI Amplifier Effect research, paradox numbers, and Option 3 (Claude API diff-level analysis).

Adds metrics-specification.md as technical reference with full DORA capability coverage map, metric formulas, thresholds, gaps, and output format documentation.

89 tests passing, typecheck clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
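For readers who want a feel for what "hand-rolled statistics" can look like, here is a minimal sketch covering quantiles, standard deviation, a regression-slope trend, and 2σ outlier detection. It is not the repository's computeStatistics(): the helper names, the linear-interpolation quantile rule, the use of population standard deviation, and the `slopeTolerance` cutoff are all assumptions made for illustration.

```javascript
// Hypothetical sketch, not the repository's computeStatistics().

// Linear-interpolated quantile over an ascending-sorted array (assumed rule).
function quantile(sorted, q) {
  if (sorted.length === 0) return 0;
  const pos = (sorted.length - 1) * q;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Arithmetic mean; 0 for an empty input.
function mean(values) {
  return values.length ? values.reduce((a, b) => a + b, 0) / values.length : 0;
}

// Population standard deviation (assumed; a sample estimator is equally plausible).
function stddev(values) {
  if (values.length === 0) return 0;
  const m = mean(values);
  return Math.sqrt(mean(values.map((v) => (v - m) ** 2)));
}

// Least-squares slope of the values against their index 0, 1, 2, ...
function slope(values) {
  const n = values.length;
  if (n < 2) return 0;
  const xMean = (n - 1) / 2;
  const yMean = mean(values);
  let num = 0;
  let den = 0;
  values.forEach((y, x) => {
    num += (x - xMean) * (y - yMean);
    den += (x - xMean) ** 2;
  });
  return num / den;
}

// slopeTolerance is a hypothetical cutoff: lines changed per commit step.
function summarize(values, slopeTolerance = 1) {
  const sorted = [...values].sort((a, b) => a - b);
  const m = mean(values);
  const sd = stddev(values);
  const s = slope(values);
  return {
    p50: quantile(sorted, 0.5),
    p90: quantile(sorted, 0.9),
    p95: quantile(sorted, 0.95),
    stddev: sd,
    trend: s > slopeTolerance ? 'growing' : s < -slopeTolerance ? 'shrinking' : 'stable',
    // Values more than two standard deviations from the mean.
    outliers: values.filter((v) => Math.abs(v - m) > 2 * sd),
  };
}
```

Passing commit sizes in chronological order matters for the trend, since the slope is computed against array position.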

@ts-check

Implements optional Claude analysis for high-risk commits. It runs only when ANTHROPIC_API_KEY is set and degrades gracefully otherwise.

New functions in local-code-metrics.js:

- getCommitDiff(): fetches git show --stat + full diff, truncated to AI_DIFF_MAX_CHARS for API cost control
- selectClaudeCommits(): pre-filters to large commits with additions > deletions × AI_RISK_ADDITIONS_RATIO, sorted by churn, capped at AI_ANALYSIS_MAX_COMMITS (5)
- analyzeWithClaude(): sequential API calls using claude-sonnet-4-6, structured JSON output (ai_confidence, risk_score, patterns, architectural_concerns, summary), per-commit error isolation
- getAnthropicClient(): conditional require() wrapped in try/catch; returns null when key absent, warns if SDK missing
- CLAUDE_SYSTEM_PROMPT: module-level constant for AI pattern and architectural concern detection

Integration in collectLocalMetrics():

- Annotates CommitMetric objects in-place with Claude fields
- Writes local_claude_analysis.json when results exist
- Adds CLAUDE AI ANALYSIS console section after RECOMMENDATIONS
- Logs skip message when ANTHROPIC_API_KEY not set

Tests use jest.mock with { virtual: true }; no npm install required. CommitStats typedef extended with optional Claude fields to satisfy @ts-check.

113 tests passing, typecheck clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
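The pre-filter described for selectClaudeCommits() can be sketched roughly as follows. This is an illustration only: the function name `selectRiskyCommits`, the `additions`/`deletions` field names, the 100-line "large" cutoff, and the ratio of 3 are assumptions; only the cap of 5 and the overall filter/sort/cap shape come from the commit message.

```javascript
// Hypothetical sketch of the pre-filter; not the repository's selectClaudeCommits().
const SKETCH_CONFIG = {
  LARGE_COMMIT_LINES: 100,     // assumed threshold for a "large" commit
  AI_RISK_ADDITIONS_RATIO: 3,  // assumed value; the real default is not shown in this PR page
  AI_ANALYSIS_MAX_COMMITS: 5,  // cap stated in the commit message
};

function selectRiskyCommits(commits, config = SKETCH_CONFIG) {
  const churn = (c) => c.additions + c.deletions;
  return commits
    // Keep only large commits.
    .filter((c) => churn(c) > config.LARGE_COMMIT_LINES)
    // Keep commits that add far more than they delete.
    .filter((c) => c.additions > c.deletions * config.AI_RISK_ADDITIONS_RATIO)
    // Highest churn first; filter() already returned a copy, so the input is untouched.
    .sort((a, b) => churn(b) - churn(a))
    // Cap the number of commits sent for analysis to bound API cost.
    .slice(0, config.AI_ANALYSIS_MAX_COMMITS);
}
```

Filtering before any API call keeps the expensive step proportional to the cap rather than to the size of the history.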

- Adds message quality %, additions ratio, and statistical distribution fields to the Key Metrics table
- Documents DORA archetype classification and its four archetypes
- Documents optional Claude API integration: pre-filter logic, output file, graceful degradation when ANTHROPIC_API_KEY absent
- Expands Configuration section to a table covering all new CONFIG keys
- Notes Node ≥18 requirement and local_claude_analysis.json output file

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
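As a rough illustration of the message quality % metric (a message counts as good if it matches a conventional-commit pattern or has at least 10 words), consider the sketch below. The regular expression, the type list inside it, and the helper names are assumptions; the repository's CONVENTIONAL_COMMIT_RE and scoreMessageQuality() may differ.

```javascript
// Hypothetical sketch; the pattern below is an assumed conventional-commit regex.
const SKETCH_COMMIT_RE =
  /^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\([^)]+\))?!?: .+/;
const SKETCH_MIN_WORDS = 10; // mirrors the stated ≥10-word threshold

// A message passes if it is conventional OR long enough to be descriptive.
function isQualityMessage(message) {
  const text = message.trim();
  if (SKETCH_COMMIT_RE.test(text)) return true;
  return text.split(/\s+/).filter(Boolean).length >= SKETCH_MIN_WORDS;
}

// Percentage of passing messages, rounded to a whole number.
function messageQualityPct(messages) {
  if (messages.length === 0) return 0;
  const good = messages.filter(isQualityMessage).length;
  return Math.round((good / messages.length) * 100);
}
```

With 9 passing messages out of 14, this rounding yields 64, which is consistent with the figure the workflow reports for this pull request.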

Strips all emoji from README.md, CLAUDE.md, measuring-ai-code-drift-using-github-metrics.md, and metrics-specification.md.

Replaces em-dash interjections with traditional punctuation throughout:

- Prose parentheticals: commas or parentheses
- Trailing elaborations: colon or new sentence
- Definition separators in code blocks and tables: colon
- "A -- B" connectors: semicolon or period

Also updates README.md with correct filenames (local-code-metrics.js, code-metrics.yml, pr-metrics.yml) and adds Node 18+ requirement, message quality and additions ratio metrics, DORA archetype table, and reference to metrics-specification.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

code-metrics.yml:

- Inline scoreMessageQuality(), computeStats(), computeVelocityTrend(), classifyDoraArchetype() helpers (mirrors local-code-metrics.js logic)
- Per-commit message_quality flag added to metrics array
- Summary gains: p50/p90/p95/stddev_lines_changed, p50/p90_files_changed, velocity_trend, additions_ratio_median/p90, message_quality_pct, dora_archetype
- Issue body redesigned as a metric table with Status column, adds commit size distribution table and DORA archetype section with per-archetype description; no emoji

pr-metrics.yml:

- Per-commit additions_ratio and message_quality fields added
- Aggregates: largePct, sprawlingPct, testFirstPct, msgQualityPct, medianAdditionsRatio computed for DORA assessment
- New concerns: additions ratio >3.0, message quality <40%
- New strengths: message quality, test-only commits
- DORA Capability Assessment section with archetype classification and per-metric table
- PR comment redesigned as a metrics table; no emoji shortcodes
- Removed PDCA Framework Alignment section

113 tests passing, typecheck clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

CONVENTIONAL_COMMIT_RE is a module-level const defined after collectLocalMetrics(). Because async functions with only execSync calls run synchronously, the entry point at line ~564 triggered the function before CONVENTIONAL_COMMIT_RE was initialized, causing a TDZ error. Moving if (require.main === module) to after module.exports ensures all constants and helpers are fully initialized before any execution path reaches scoreMessageQuality(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
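The failure mode described above can be reproduced in a few lines. The sketch below is a reduced, hypothetical version (the names `collectMetrics`, `COMMIT_RE`, and `observed` are invented for illustration): an async function with no `await` runs its whole body synchronously at call time, so calling it above a `const` declaration hits the temporal dead zone.

```javascript
// Hypothetical reduction of the bug; names are illustrative, not the repository's.
const observed = [];

async function collectMetrics() {
  // There is no await in this body, so it executes synchronously when called.
  try {
    observed.push(COMMIT_RE.test('fix: sort dates') ? 'matched' : 'no match');
  } catch (err) {
    observed.push(err.name);
  }
}

// Entry point placed too early: COMMIT_RE is declared but not yet initialized.
collectMetrics();

const COMMIT_RE = /^(feat|fix|docs): /;

// Entry point placed after all module-level constants: works as intended.
collectMetrics();

// observed is now ['ReferenceError', 'matched']
```

Function declarations are hoisted with their bodies, which is why the call itself succeeds; only the `const` binding is unavailable until its declaration line has executed.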

computeVelocity did not sort input dates; git log outputs newest-first, producing a negative time span and negative commits_per_day. Sort ms array ascending before computing span. Add regression test covering newest-first date order. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
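A minimal sketch of the fix, assuming dates arrive as ISO strings in arbitrary order. The function name `commitsPerDay` and the zero-span fallback are assumptions rather than the repository's actual computeVelocity().

```javascript
// Hypothetical sketch of the sorted-span calculation.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function commitsPerDay(isoDates) {
  // Sort ascending so the span is non-negative regardless of input order;
  // git log emits newest-first by default.
  const ms = isoDates.map((d) => new Date(d).getTime()).sort((a, b) => a - b);
  if (ms.length < 2) return 0;
  const spanDays = (ms[ms.length - 1] - ms[0]) / MS_PER_DAY;
  // Zero-span fallback is an assumption for commits sharing one timestamp.
  return spanDays > 0 ? ms.length / spanDays : 0;
}
```

The numeric comparator matters: the default `sort()` compares values as strings, which happens to work for same-length timestamps but is not something to rely on.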

large_commits_pct, sprawling_commits_pct, and test_first_pct were computed inline in the summary object literal and then recomputed identically to pass to classifyDoraArchetype. Extract as local variables computed once before the summary object is built. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@ts-nocheck

Extract five focused modules, each under 150 lines:

- lib/config.js: CONFIG object (single source of truth for thresholds)
- lib/git.js: runGitCommand, parseGitLog, isTestFile, analyzeCommit, getCommitDiff
- lib/statistics.js: computeStatistics, computeVelocity
- lib/metrics.js: scoreMessageQuality, classifyDoraArchetype, generateInsights
- lib/claude.js: CLAUDE_SYSTEM_PROMPT, getAnthropicClient, selectClaudeCommits, analyzeWithClaude

local-code-metrics.js becomes the orchestration entry point (372 lines, down from 802). All public exports are re-exported from the entry point so all existing test imports remain unchanged.

The three-component architecture (local script, code-metrics workflow, pr-metrics workflow) is unchanged; lib/ is internal to the local script only.

Add // @ts-nocheck to lib files and exclude lib/ in tsconfig.json; TypeScript follows require() transitively from local-code-metrics.js and the lib files use loose object types intentionally.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add jest.mock('../lib/claude') to collectLocalMetrics.test.js so the Claude integration path (getAnthropicClient returns a client, results annotated back into metrics, local_claude_analysis.json written) can be tested without a real API key or installed SDK.

Two new tests:

- annotates metrics and writes local_claude_analysis.json when Claude returns results
- logs Claude analysis section to console when metrics are annotated

Default beforeEach sets getAnthropicClient.mockResolvedValue(null) so all existing tests remain unaffected.

Line coverage: 91.3% → 95.6%
Function coverage: 92.18% → 96.42%

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Document the lib/ internal module structure under architecture section. Update configuration section to point to lib/config.js as the single source of truth for thresholds and TEST_FILE_PATTERNS. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add dotenv so ANTHROPIC_API_KEY can be set in a .env file rather than requiring a shell export before each run. dotenv.config() is called at startup; if no .env file exists it fails silently. Also upgrades eslint and eslint-plugin-jest to versions compatible with ESLint v9 flat config (npm install dotenv had downgraded eslint to v4 via --legacy-peer-deps, breaking the flat config format). Add .env and .env.local to .gitignore. Add .env.example with usage. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Add quiet: true to dotenv.config() to suppress [dotenv@17] console output during test runs
- Restore jest from ^25.0.0 to ^29.7.0 (inadvertently downgraded during ESLint repair via --legacy-peer-deps)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions · 2026-03-26T21:22:55Z

PR Analysis

Size: extra-large (based on production code)
Production Code: 3527 lines (18 files)
Test Code: 564 lines (7 files)
Total: 4091 lines (25 files)
Test-to-Production Ratio: 0.16:1

Concerns

Very large production changes - consider breaking into smaller PRs
Many production files changed - possible scope creep
8/14 commits exceed 100 production lines

Strengths

Includes refactoring or cleanup work
1 test-only commits
Message quality 64% meets discipline threshold

Commit Analysis

Total Commits: 14
Average Commit Size: 1191 production lines
Average Files per Commit: 2.9

Metric	Value
Large commits (>100 prod lines)	8/14 (57%)
Sprawling commits (>5 files)	2/14 (14%)
Test-first discipline	3/14 (21%)
Message quality	9/14 (64%)
Median additions ratio	2.94
Test-only commits	1
Production-only commits	10

Test Coverage

Test Adequacy: needs-improvement

Low test coverage ratio - consider adding more tests

Target ratio: 0.5-2.0 test lines per production line
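The ratio reported above is plain arithmetic: 564 test lines over 3527 production lines is about 0.16, which sits below the 0.5 floor. A small sketch of that check follows; the function name and the `within-target` and `above-target` labels are hypothetical, and only `needs-improvement` and the 0.5-2.0 band appear in the workflow output.

```javascript
// Hypothetical sketch; only 'needs-improvement' is a label seen in the bot output.
function testAdequacy(testLines, prodLines) {
  // Guard for PRs with no production changes (assumed behavior).
  if (prodLines === 0) return { ratio: null, label: 'no-production-code' };
  const ratio = testLines / prodLines;
  let label = 'within-target';
  if (ratio < 0.5) label = 'needs-improvement';
  else if (ratio > 2.0) label = 'above-target';
  // Round to two decimals for display, matching the 0.16:1 style.
  return { ratio: Math.round(ratio * 100) / 100, label };
}
```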

DORA Capability Assessment

Archetype: foundational-challenges
Weak testing or batch discipline detected. Consider strengthening practices before scaling AI usage.

Capability	Metric	Value	Target
Small Batches	Large commit %	57%	<20%
Small Batches	Sprawling commit %	14%	<10%
Version Control	Test-first discipline	21%	>50%
Version Control	Message quality	64%	>60%
AI Risk Signal	Additions ratio (median)	2.94	<3.0
Automated by Code Metrics Workflow
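To make the table easier to read at a glance, the sketch below compares each reported value with its target and lists the metrics that miss. The rows are copied from the table; the data shape and the `meetsTarget` helper are assumptions, and this is not the workflow's archetype classifier, whose rules are not shown on this page.

```javascript
// Rows transcribed from the DORA Capability Assessment table above.
const assessmentRows = [
  { metric: 'Large commit %', value: 57, op: '<', target: 20 },
  { metric: 'Sprawling commit %', value: 14, op: '<', target: 10 },
  { metric: 'Test-first discipline', value: 21, op: '>', target: 50 },
  { metric: 'Message quality', value: 64, op: '>', target: 60 },
  { metric: 'Additions ratio (median)', value: 2.94, op: '<', target: 3.0 },
];

// Hypothetical helper: true when the value satisfies its target comparison.
function meetsTarget(row) {
  return row.op === '<' ? row.value < row.target : row.value > row.target;
}

const missedTargets = assessmentRows.filter((row) => !meetsTarget(row)).map((row) => row.metric);
// missedTargets: ['Large commit %', 'Sprawling commit %', 'Test-first discipline']
```

Both batch-size metrics and the test-first metric miss their targets, which is consistent with the "weak testing or batch discipline" wording attached to the reported archetype.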

kenjudy and others added 14 commits March 26, 2026 10:41

add html representation of metrics coverage

a5246b7

kenjudy merged commit 267fa1d into main Mar 26, 2026
1 check passed

kenjudy merged 14 commits into main from enhanced-measurement
