Refactor: scanner.py is 3300+ lines — decompose into focused modules #2013

@mrveiss

Problem

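`autobot-backend/api/codebase_analytics/scanner.py` is 3300+ lines. Given that individual functions are limited to 65 lines, a single file of this size is a strong sign of a god module: it bundles several unrelated responsibilities (see the proposed split under Fix).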

This makes #1712 (empty analytics sections) extremely hard to debug because the data flow spans the entire file.

Discovered During

Investigation of #1712 (analytics empty sections after restructure)

Fix

Extract into focused modules within `api/codebase_analytics/`:

  • `file_counter.py` — scannable file counting
  • `ast_analyzer.py` — per-language AST analysis
  • `problem_detector.py` — hardcode/security/smell detection
  • `chromadb_storage.py` — batch storage and verification
  • `progress_tracker.py` — Redis progress state management
  • Keep `scanner.py` as the orchestrator (<100 lines)
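
A minimal sketch of what the slimmed-down orchestrator could look like, assuming each proposed module exposes one small entry point. All function names here (`find_scannable_files`, `analyze_file`, `detect_problems`, `scan`) and the `ScanResult` type are hypothetical and not taken from the current `scanner.py`. The stages are inlined as stand-ins so the example runs on its own, and the ChromaDB and Redis pieces are passed in as plain callables rather than real clients.

```python
from __future__ import annotations

import ast
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable


@dataclass
class ScanResult:
    """Hypothetical summary returned by the orchestrator."""
    files_scanned: int = 0
    problems: list[dict] = field(default_factory=list)
    stored: int = 0


def find_scannable_files(root: Path, extensions: tuple[str, ...] = (".py",)) -> list[Path]:
    """Stand-in for file_counter.py: collect files with a scannable extension."""
    return sorted(p for p in root.rglob("*") if p.is_file() and p.suffix in extensions)


def analyze_file(path: Path) -> dict:
    """Stand-in for ast_analyzer.py: record each function and its length in lines."""
    tree = ast.parse(path.read_text(encoding="utf-8"))
    functions = [
        {"name": node.name, "lines": node.end_lineno - node.lineno + 1}
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    return {"path": str(path), "functions": functions}


def detect_problems(analysis: dict, max_function_lines: int = 65) -> list[dict]:
    """Stand-in for problem_detector.py: flag functions over the line limit."""
    return [
        {"path": analysis["path"], "function": fn["name"],
         "lines": fn["lines"], "kind": "long_function"}
        for fn in analysis["functions"]
        if fn["lines"] > max_function_lines
    ]


def scan(
    root: Path,
    store: Callable[[list[dict]], int],           # stand-in for chromadb_storage.py
    report_progress: Callable[[int, int], None],  # stand-in for progress_tracker.py
) -> ScanResult:
    """Orchestrator: wires the stages together and holds no analysis logic itself."""
    files = find_scannable_files(root)
    result = ScanResult()
    for index, path in enumerate(files, start=1):
        result.problems.extend(detect_problems(analyze_file(path)))
        result.files_scanned = index
        report_progress(index, len(files))
    result.stored = store(result.problems)
    return result
```

Injecting storage and progress reporting as callables is one possible design choice: it would let each stage be tested without ChromaDB or Redis running, which may help when tracing where data goes missing for #1712. Error handling (for example, files that fail to parse) is left out of the sketch.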

Impact

Severity: medium — Maintenance burden, blocks effective debugging of #1712.
