feat: accept document uploads via unified FileHandler branch#193

Open
timurvafin wants to merge 1 commit into RichardAtCT:main from timurvafin:feat/unified-document-handler

timurvafin commented Apr 24, 2026

Summary

Accept a family of document uploads (PDF, Office, OpenDocument, and common text-table formats) by adding a new "document" type to FileHandler. Binary documents are persisted to <current_dir>/.uploads/<timestamp>-<name> and Claude receives a generic prompt with the absolute path — the agent picks whichever tool fits the format. Both agentic (agentic_document) and classic (handle_document) paths go through the same FileHandler method.

Design

Single document branch, one prompt

The _process_document_file method is format-agnostic. The prompt just tells Claude where the file is:

User uploaded a file:

Path: {absolute_path}

Read the file using the appropriate tool for its format and answer the user's question.

Claude infers the format from the extension and chooses Read (for PDF/text/notebooks) or Bash with an appropriate converter (pandoc, python-docx, openpyxl, unzip, etc.) for Office/OpenDocument. This means adding a new format is one line in document_extensions, plus the matching entry in ALLOWED_EXTENSIONS.
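As a rough illustration of the format-agnostic prompt, here is a minimal sketch. `PROMPT_TEMPLATE` and `build_document_prompt` are hypothetical names chosen for this example; the actual `_process_document_file` in the diff may be structured differently.

```python
from pathlib import Path

# Prompt text taken from the PR description; the constant name is an assumption.
PROMPT_TEMPLATE = (
    "User uploaded a file:\n\n"
    "Path: {absolute_path}\n\n"
    "Read the file using the appropriate tool for its format "
    "and answer the user's question."
)


def build_document_prompt(saved_path: Path) -> str:
    """Hypothetical helper: fill the generic prompt with the absolute path.

    No per-format logic lives here; the agent decides which tool to use.
    """
    return PROMPT_TEMPLATE.format(absolute_path=saved_path.resolve())
```

Because the template never mentions a format, the same function serves a PDF and a DOCX alike.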

Storage

.uploads/ sits inside APPROVED_DIRECTORY, so Claude's tools reach it without tripping ToolMonitor path boundaries. The file is not deleted after the call, so follow-up turns can still read or reference it without re-uploading. A millisecond-resolution timestamp prefix (YYYYMMDD-HHMMSS-fff) makes collisions from rapid uploads unlikely.
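The storage scheme can be sketched roughly as follows. `persist_upload` and its parameters are hypothetical names for this example, and stripping directory components from the filename is an assumption about sensible behaviour, not something the PR states.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def persist_upload(
    current_dir: Path,
    filename: str,
    data: bytes,
    now: Optional[datetime] = None,
) -> Path:
    """Hypothetical helper: save an upload under <current_dir>/.uploads/."""
    now = now or datetime.now()
    # YYYYMMDD-HHMMSS-fff: whole seconds from strftime, then milliseconds.
    stamp = now.strftime("%Y%m%d-%H%M%S-") + f"{now.microsecond // 1000:03d}"

    uploads_dir = current_dir / ".uploads"
    uploads_dir.mkdir(parents=True, exist_ok=True)

    # Assumption: keep only the final path component of the supplied name.
    safe_name = Path(filename).name
    target = uploads_dir / f"{stamp}-{safe_name}"
    target.write_bytes(data)
    return target.resolve()
```

With a clock reading of 14:56:08.997 on 2026-04-24, a file named contract.docx would be stored as 20260424-145608-997-contract.docx, matching the name seen in the DOCX smoke test.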

The existing archive/code/text branches are untouched.

Formats supported

Binary (persisted, document branch):

  • .pdf — Read tool (native parser)
  • .docx, .xlsx, .pptx — Bash with pandoc / python-docx / openpyxl
  • .odt, .ods, .odp — Bash with pandoc / LibreOffice
  • .rtf — Bash with pandoc

UTF-8 text-compatible (existing text branch, inline in prompt):

  • .csv, .tsv, .log, .ics, .eml
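The split between the two groups above might look something like this sketch. The set names and `detect_file_type` are assumptions for the example; the real `_detect_file_type` also handles the archive and code branches and may inspect file content, which this sketch omits.

```python
from pathlib import Path

# Extension groups as listed in the PR; the constant names are assumptions.
DOCUMENT_EXTENSIONS = {
    ".pdf", ".docx", ".xlsx", ".pptx", ".odt", ".ods", ".odp", ".rtf",
}
TEXT_EXTENSIONS = {".csv", ".tsv", ".log", ".ics", ".eml"}


def detect_file_type(filename: str) -> str:
    """Hypothetical, extension-only classifier for the two new groups."""
    suffix = Path(filename).suffix.lower()  # final extension only
    if suffix in DOCUMENT_EXTENSIONS:
        return "document"  # persisted to .uploads/, path passed to Claude
    if suffix in TEXT_EXTENSIONS:
        return "text"  # decoded as UTF-8 and inlined in the prompt
    return "binary"  # anything unrecognised keeps its previous handling
```

Under this sketch, supporting another binary format means adding one entry to `DOCUMENT_EXTENSIONS`.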

Changes

  • src/security/validators.py — extend ALLOWED_EXTENSIONS with the two groups above
  • src/bot/features/file_handler.py — new document_extensions set, _detect_file_type returns "document" for them, new _process_document_file persists and builds the generic prompt, handle_document_upload accepts optional current_dir (defaults to approved_directory)
  • src/bot/orchestrator.py (agentic) — resolve current_dir before calling handle_document_upload, pass it through
  • src/bot/handlers/message.py (classic) — same plumbing; .pdf added to supported-formats help text
  • .gitignore — ignore .uploads/
  • Tests — new tests/unit/test_bot/test_file_handler.py (6 tests: parametrized detection for all binary formats, CSV → text branch, unknown binary stays binary, persist + prompt shape, end-to-end PDF, fallback-to-approved-directory); validator whitelist extended + .pdf.exe regression

Known limitations

  • Document conversion relies on whatever tools are available in Claude's execution environment. If pandoc, python-docx, or openpyxl is missing, Claude's first attempt may try to install it (for example with pip install). Failure is graceful: Claude will report back if it can't convert a given format.
  • .uploads/ has no automatic cleanup. Manual housekeeping for now — a follow-up PR could add a /new hook or cron-based LRU.
  • .doc (legacy MS Word binary) and .ppt (legacy PowerPoint) intentionally excluded — they're harder to convert reliably and less common today.

Relationship to #192

This PR supersedes #192, which was a narrow patch that branched on .pdf directly in agentic_document. #192 has been closed. This PR:

  • puts the logic in FileHandler where other document types already live
  • gives classic mode document support as a side effect
  • extends the whitelist to the full Office family
  • removes the PDF-specific helper from the orchestrator
  • uses a generic, format-agnostic prompt so adding more formats is trivial

Test plan

  • make test — 542 passed (10 new tests)
  • black/isort/flake8 clean
  • Live smoke PDF: SMOKE-TOKEN-42 uploaded, saved, Claude read via Read and answered correctly
  • Live smoke DOCX: SMOKE-DOCX-99 uploaded as .docx, saved to .uploads/20260424-145608-997-contract.docx, Claude converted via Bash and answered correctly
  • Regression: .exe still blocked, .pdf.exe trap still blocked, files > 10 MB still rejected
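For readers unfamiliar with the .pdf.exe trap, the regression checks above correspond to a rule along these lines. This is only a sketch: `is_upload_allowed`, the constant names, and the interpretation of 10 MB as 10 * 1024 * 1024 bytes are assumptions, and the actual logic in src/security/validators.py may differ.

```python
from pathlib import Path

# Allow-list limited to the extensions this PR adds; the real set is larger.
ALLOWED_EXTENSIONS = {
    ".pdf", ".docx", ".xlsx", ".pptx", ".odt", ".ods", ".odp", ".rtf",
    ".csv", ".tsv", ".log", ".ics", ".eml",
}
MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024  # assumed meaning of "10 MB"


def is_upload_allowed(filename: str, size_bytes: int) -> bool:
    """Hypothetical check: size limit plus allow-list on the final suffix."""
    if size_bytes > MAX_FILE_SIZE_BYTES:
        return False
    # Only the last extension counts, so "invoice.pdf.exe" is judged as ".exe".
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS
```

Checking only the final suffix is what keeps a double-extension name from passing as a PDF.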

🤖 Generated with Claude Code

Add a new `"document"` type to FileHandler for binary document formats.
Matching files are persisted to `<current_dir>/.uploads/<timestamp>-<name>`
and Claude receives a generic prompt with the absolute path, letting the
agent pick whichever tool fits the format (Read for PDF/text; Bash with
pandoc, python-docx, openpyxl, unzip, etc. for Office/OpenDocument).

Covered formats (added to ALLOWED_EXTENSIONS):
- Binary (document branch): .pdf, .docx, .xlsx, .pptx, .odt, .ods, .odp, .rtf
- UTF-8 text-compatible (existing text branch): .csv, .tsv, .log, .ics, .eml

Both agentic (`agentic_document`) and classic (`handle_document`) handlers
pass `current_dir` into `FileHandler.handle_document_upload`, so classic
mode gains document support for free.

The `.uploads/` directory sits inside APPROVED_DIRECTORY, so Claude's
tools can reach it without tool_monitor boundary issues. The file
survives the upload call so follow-up turns can still read or
reference it.

Supersedes RichardAtCT#192 (narrow PDF-only patch that branched directly in
`agentic_document`). This PR routes the same capability through the
proper abstraction and extends it to the whole Office family in one
step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
timurvafin force-pushed the feat/unified-document-handler branch from 220496e to 3f32a51 on April 24, 2026 14:57
timurvafin changed the title from "feat: accept PDF uploads via unified FileHandler document branch" to "feat: accept document uploads via unified FileHandler branch" on Apr 24, 2026