feat(autoresearch): standalone experiment runner + result store (#2597) by mrveiss · Pull Request #2615 · mrveiss/AutoBot-AI

mrveiss · 2026-03-27T20:03:23Z

Summary

Milestone 1 of #1440 (AutoResearch integration). Implements the standalone experiment runner and result store foundation.

Components

models.py — Experiment, ExperimentResult, HyperParams, ExperimentStats dataclasses with full serialization
config.py — AutoResearchConfig with env-var overrides (AUTOBOT_AUTORESEARCH_*)
parser.py — Extracts val_bpb, train/val loss, tokens/sec from autoresearch training output
store.py — Dual persistence: Redis (timeline queries, state indices) + ChromaDB (semantic search over findings)
runner.py — Subprocess-isolated experiment execution with timeout, auto-evaluation (keep/discard based on improvement threshold)
routes.py — REST API: GET /experiments, GET /experiments/{id}, GET /experiments/stats, POST /experiments, POST /experiments/baseline, GET /status, POST /cancel
Router registered in feature_routers.py at /api/autoresearch
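
As an illustration of the parser.py component, here is a minimal sketch of regex-based metric extraction. The log line format (`val_bpb: 1.15`, `train_loss: 2.9`, `tok/sec: 1200`), the `parse_metrics` name, and the returned dict keys are assumptions made for this example; the real autoresearch output and the parser in this PR may differ.

```python
import re
from typing import Dict, Optional

# Hypothetical patterns: the real autoresearch log format may differ.
_FLOAT = r"([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
_PATTERNS = {
    "val_bpb": re.compile(r"val_bpb[:=]\s*" + _FLOAT),
    "train_loss": re.compile(r"train_loss[:=]\s*" + _FLOAT),
    "val_loss": re.compile(r"val_loss[:=]\s*" + _FLOAT),
    "tokens_per_sec": re.compile(r"tok(?:ens)?/sec[:=]\s*" + _FLOAT),
}


def parse_metrics(output: str) -> Dict[str, Optional[float]]:
    """Return the last reported value of each metric, or None if absent."""
    metrics: Dict[str, Optional[float]] = {}
    for name, pattern in _PATTERNS.items():
        matches = pattern.findall(output)  # list of captured number strings
        metrics[name] = float(matches[-1]) if matches else None
    return metrics
```

Taking the last match per metric means a long training log yields the final reported values, and a crashed run that printed nothing simply produces `None` entries instead of raising.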

Architecture

Follows existing service patterns (composition, lazy init, dependency injection)
Uses canonical get_redis_client() and get_async_chromadb_client()
No hardcoded IPs or values — all config via env vars + SSOT
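
The lazy-init and dependency-injection pattern named above can be sketched roughly as follows. `ExperimentStoreSketch` and `client_factory` are hypothetical names for this example; in the PR the injected callable would presumably be something like the project's `get_redis_client()`, whose exact signature is not shown here.

```python
from typing import Any, Callable, Optional


class ExperimentStoreSketch:
    """Hypothetical store that creates its client only on first use."""

    def __init__(self, client_factory: Callable[[], Any]) -> None:
        self._client_factory = client_factory  # injected dependency
        self._client: Optional[Any] = None

    @property
    def client(self) -> Any:
        if self._client is None:  # lazy init: connect on first access
            self._client = self._client_factory()
        return self._client
```

Injecting the factory lets unit tests pass a fake client, and deferring the call keeps module import free of network side effects.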

Test plan

28 unit tests passing (parser, models, config, store)
Manual: curl -sk https://localhost:8443/api/autoresearch/status after deploy
Integration: Run actual training on .20 GPU node (requires autoresearch repo clone)

Related issues

Closes AutoResearch M1: Standalone experiment runner + result store #2597
Part of Feature: AutoResearch integration — self-improving experiment loop with web search #1440
M2: AutoResearch M2: AutoBot-orchestrated loop + web search #2599 (depends on this)
M3: AutoResearch M3: Self-improvement + frontend dashboard #2600 (depends on M2)

github-actions · 2026-03-27T20:11:07Z

⚠️ SSOT Configuration Compliance: Violations Found

Metric	Count
Total Violations	2
SSOT Violations (high priority)	1
Other Violations	1

⚠️ 1 value has an SSOT config equivalent!

These should be replaced with SSOT config imports:

Python:

from src.config.ssot_config import config
# Use: config.vm.main, config.port.backend, config.backend_url

TypeScript:

import config from '@/config/ssot-config'
// Use: config.vm.main, config.port.backend, config.backendUrl

📖 See SSOT_CONFIG_GUIDE.md for documentation.

…#2597)

Milestone 1 of #1440 (AutoResearch integration). Implements:
- Data models (Experiment, ExperimentResult, HyperParams, ExperimentStats)
- Config module with env-var overrides
- Output parser for autoresearch training metrics (val_bpb, loss, tokens/sec)
- Dual persistence store (Redis for timeline, ChromaDB for semantic search)
- Subprocess-isolated experiment runner with timeout and auto-evaluation
- REST API: list/get/create experiments, stats, baseline, cancel
- 28 unit tests covering parser, models, config, and store
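
A rough sketch of the "subprocess-isolated runner with timeout and auto-evaluation" idea, using only the standard library. The function names, the percentage-based threshold, and the assumption that a lower val_bpb counts as an improvement are illustrative; the PR's runner.py may be structured differently (for example, asynchronously).

```python
import subprocess
import sys
from typing import List, Optional


def run_with_timeout(cmd: List[str], timeout_s: float) -> Optional[str]:
    """Run cmd in a child process; return stdout, or None on timeout."""
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return None  # subprocess.run kills the child before re-raising
    return proc.stdout


def should_keep(baseline_bpb: float, new_bpb: float, threshold_pct: float) -> bool:
    """Keep a run only if val_bpb dropped by at least threshold_pct percent."""
    improvement_pct = (baseline_bpb - new_bpb) / baseline_bpb * 100.0
    return improvement_pct >= threshold_pct


if __name__ == "__main__":
    # Stand-in for a real training command.
    out = run_with_timeout([sys.executable, "-c", "print('val_bpb: 0.98')"], 30)
    print(out, should_keep(1.00, 0.98, threshold_pct=1.0))
```

Running the training command as a separate process means a hung or crashing experiment cannot take the API worker down with it; the caller only sees captured output or a timeout.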

Critical fixes from code review:
- Remove erroneous await on sync get_redis_client() (would crash at runtime)
- Add input validation for hp.extra keys (prevent flag injection via allowlist)
- Add check_admin_permission auth to all routes

High-priority fixes:
- Make POST /experiments non-blocking via BackgroundTasks
- Fix state index inconsistency: pass old_state to save_experiment for cleanup
- Add asyncio.Lock for runner concurrency safety

Medium fix:
- Guard improvement_pct None in _build_document to prevent TypeError
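
To show why an allowlist on hp.extra keys blocks flag injection, here is a minimal sketch. The key names in `ALLOWED_EXTRA_KEYS` and both function names are hypothetical; the PR's actual allowlist and validator are not reproduced here.

```python
from typing import Any, Dict, List

# Hypothetical allowlist; the PR's real set of permitted keys is not shown.
ALLOWED_EXTRA_KEYS = frozenset({"warmup_steps", "weight_decay", "grad_clip"})


def validate_extra_params(extra: Dict[str, Any]) -> Dict[str, Any]:
    """Reject any key that is not explicitly allowlisted."""
    unknown = sorted(set(extra) - ALLOWED_EXTRA_KEYS)
    if unknown:
        raise ValueError(f"Unsupported extra hyperparameters: {unknown}")
    return dict(extra)


def build_cli_args(extra: Dict[str, Any]) -> List[str]:
    """Turn validated params into argv entries (never a shell string)."""
    args: List[str] = []
    for key, value in validate_extra_params(extra).items():
        args.extend([f"--{key}", str(value)])
    return args
```

Because keys become `--key` flags on the training command line, an unvalidated key such as `output_dir=/tmp --evil` could smuggle in extra flags; checking membership in a fixed set rejects it before any argument list is built.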

- Fix state-tracking race: remove duplicate update_experiment_state calls from _evaluate_result, rely on single save in finally block
- Add string value sanitization in _validate_extra_params: reject strings >256 chars or containing '--' to prevent flag injection
- Add Pydantic request models (CreateExperimentRequest, SetBaselineRequest) with field length constraints for POST endpoints
- Fix config.py env-var timing: move os.getenv calls into field(default_factory=...) for testability
- Fix list_experiments state-filtered ordering: use timeline sorted set scores for chronological order instead of lexicographic UUID sort
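
The env-var timing fix is easiest to see in a small example. The sketch below assumes a hypothetical `timeout_s` field and a hypothetical `AUTOBOT_AUTORESEARCH_TIMEOUT` variable (only the `AUTOBOT_AUTORESEARCH_` prefix comes from the PR description); the real AutoResearchConfig fields are not shown here.

```python
import os
from dataclasses import dataclass, field


def _env_int(name: str, default: int) -> int:
    """Read an integer env var at call time, falling back to default."""
    return int(os.getenv(name, str(default)))


@dataclass
class AutoResearchConfig:
    # A plain default would be evaluated once, when the class body runs:
    #   timeout_s: int = int(os.getenv("AUTOBOT_AUTORESEARCH_TIMEOUT", "300"))
    # With default_factory the env var is read on every instantiation,
    # so tests can patch the environment after import.
    timeout_s: int = field(
        default_factory=lambda: _env_int("AUTOBOT_AUTORESEARCH_TIMEOUT", 300)
    )
```

With this shape, a test can set the variable (for example via `monkeypatch.setenv` in pytest) and construct a fresh config without reloading the module.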

mrveiss mentioned this pull request Mar 27, 2026

Testing: AutoResearch runner and route integration tests missing (#2597) #2637

Open

mrveiss force-pushed the feature/autoresearch-m1 branch from b0f08af to 11e68f4 on March 28, 2026 12:13

mrveiss mentioned this pull request Mar 28, 2026

fix(ci): add pyproject.toml isort config and fix celery_app imports (#2667) #2680

Merged

2 tasks

mrveiss added 2 commits March 28, 2026 19:55

mrveiss force-pushed the feature/autoresearch-m1 branch from 11e68f4 to 5be03eb on March 28, 2026 17:55

mrveiss merged commit 6fc44af into Dev_new_gui Mar 28, 2026
3 of 4 checks passed

mrveiss mentioned this pull request Mar 28, 2026

AutoResearch M1: Standalone experiment runner + result store #2597

Closed

6 tasks

mrveiss deleted the feature/autoresearch-m1 branch March 28, 2026 18:28

This was referenced Mar 28, 2026

Bug: CI code-quality workflow never passes — isort Python 3.10 vs 3.12 categorization mismatch #2683

Open

Performance: AutoResearch get_stats N+1 query pattern #2684

Open

mrveiss merged 3 commits into Dev_new_gui from feature/autoresearch-m1