feat: accept PDF uploads in agentic mode #192

Closed
timurvafin wants to merge 1 commit into RichardAtCT:main from timurvafin:feat/accept-pdf-uploads

@timurvafin

Summary

Add .pdf to the allowed-extensions whitelist so the security middleware lets PDF files through. In agentic_document, branch early for PDFs: save the file to <approved_directory>/.uploads/<timestamp>-<name> and pass Claude an absolute path plus an instruction to use the Read tool. Claude reads PDFs natively via its built-in Read tool.
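
The save-and-prompt step described above could look roughly like the following. This is a minimal sketch, not the PR's actual `_save_pdf_and_build_prompt`: the function signature, the `caption` parameter, the integer timestamp format, and the prompt wording are all assumptions made for illustration.

```python
import time
from pathlib import Path


def save_pdf_and_build_prompt(
    data: bytes,
    filename: str,
    approved_directory: Path,
    caption: str = "",
) -> str:
    """Hypothetical stand-in for the helper named in this PR."""
    uploads = approved_directory / ".uploads"
    uploads.mkdir(parents=True, exist_ok=True)

    # Keep only the final path component so a crafted name such as
    # "../secrets.pdf" cannot escape the .uploads/ directory.
    safe_name = Path(filename).name or "upload.pdf"

    # <timestamp>-<name>, resolved to an absolute path for the prompt.
    target = (uploads / f"{int(time.time())}-{safe_name}").resolve()
    target.write_bytes(data)

    prompt = (
        f"The user uploaded a PDF saved at {target}. "
        "Use the Read tool to open it."
    )
    if caption:
        prompt += f"\n\nUser message: {caption}"
    return prompt
```

Handing Claude an absolute path rather than the file contents avoids the UTF-8 decode step entirely, which is the point of branching early for PDFs.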

Motivation

Users reported that sending a PDF (ticket, receipt, invoice) to the bot produced 🛡️ File Upload Blocked — File type not allowed: .pdf. Adding only the whitelist entry wasn't enough — the downstream handler then tries to UTF-8 decode the bytes and fails with Unsupported file format. This PR makes both paths work for PDFs.

Changes

  • src/security/validators.py: add .pdf to ALLOWED_EXTENSIONS
  • src/bot/orchestrator.py: new _save_pdf_and_build_prompt helper + early branch in agentic_document for PDFs
  • .gitignore: ignore .uploads/
  • Tests: whitelist coverage, .pdf.exe regression, full agentic_document PDF flow (tmp_path based)

Known limitations

  • Classic mode (AGENTIC_MODE=false) is intentionally untouched. PDFs there would still hit the UTF-8 decode branch. Can be addressed in a follow-up PR.
  • .uploads/ has no automatic cleanup. Housekeeping is manual for now; a separate PR could add a cleanup hook on /new or cron-based LRU eviction.
  • Only .pdf is added. .docx/.xlsx require different UX (Bash + converters) and are out of scope here.
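
For the missing cleanup, one possible shape for a follow-up is sketched below. It is not part of this PR: `prune_uploads` and its `keep` parameter are hypothetical names, and modification time is used as a simple stand-in for true least-recently-used tracking.

```python
from pathlib import Path


def prune_uploads(uploads_dir: Path, keep: int = 20) -> list[Path]:
    """Delete all but the `keep` most recently modified files.

    Returns the paths that were removed. Hypothetical helper, not in the PR.
    """
    if not uploads_dir.is_dir():
        return []

    # Newest first, so everything past index `keep` is the oldest tail.
    files = sorted(
        (p for p in uploads_dir.iterdir() if p.is_file()),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    removed = files[keep:]
    for path in removed:
        path.unlink()
    return removed
```

A function like this could be called from a cron job or from whatever handler resets the session; either way, files that a follow-up turn might still reference would need a generous `keep` value.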

Test plan

  • make test — 532 passed
  • black, isort, flake8 clean on changed files
  • Live smoke on a local bot: sent a PDF containing SMOKE-TOKEN-42, verified (a) no block message, (b) file saved to .uploads/<timestamp>-ticket.pdf with matching size, (c) Claude's response included the token via Read tool
  • Regression unit tests: .exe still blocked, files > 10 MB still rejected, .pdf.exe trap name still blocked
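
The regression cases above can be illustrated with a small stand-alone check. This is a sketch under assumptions, not the code in src/security/validators.py: the extension set is an illustrative subset, and `is_upload_allowed` and `MAX_FILE_SIZE` are hypothetical names.

```python
from pathlib import Path

# Illustrative subset only; the project's real ALLOWED_EXTENSIONS is larger.
ALLOWED_EXTENSIONS = {".txt", ".md", ".py", ".pdf"}
MAX_FILE_SIZE = 10 * 1024 * 1024  # 10 MB, the limit named in the test plan


def is_upload_allowed(filename: str, size: int) -> bool:
    """Allow a file only if its final extension is listed and it fits the limit."""
    # Path.suffix returns only the last extension, so "ticket.pdf.exe"
    # is judged as ".exe" and the double-extension trap stays blocked.
    suffix = Path(filename).suffix.lower()
    return suffix in ALLOWED_EXTENSIONS and size <= MAX_FILE_SIZE


print(is_upload_allowed("ticket.pdf", 1024))      # True
print(is_upload_allowed("ticket.pdf.exe", 1024))  # False
print(is_upload_allowed("tool.exe", 1024))        # False
print(is_upload_allowed("big.pdf", MAX_FILE_SIZE + 1))  # False
```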

🤖 Generated with Claude Code

Add `.pdf` to the allowed-extensions whitelist so the security middleware
lets PDF files through. In `agentic_document`, branch early for PDFs:
save the file to `<approved_directory>/.uploads/<timestamp>-<name>` and
pass Claude an absolute path plus an instruction to use the Read tool.

Classic mode is intentionally untouched — PDFs in that path would still
hit the UTF-8 decode branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@timurvafin
Author

Superseded by #193, which implements the same capability through the proper FileHandler abstraction. #193:

  • routes PDFs through the existing FileHandler (new "document" type) instead of a PDF-specific branch in agentic_document
  • gives classic mode PDF support for free
  • opens a clean extension point for future binary formats

Please review #193 instead.

timurvafin closed this on Apr 24, 2026

timurvafin added a commit to timurvafin/claude-code-telegram that referenced this pull request on Apr 24, 2026:

Add a new `"document"` type to FileHandler for binary document formats.
Matching files are persisted to `<current_dir>/.uploads/<timestamp>-<name>`
and Claude receives a generic prompt with the absolute path, letting the
agent pick whichever tool fits the format (Read for PDF/text; Bash with
pandoc, python-docx, openpyxl, unzip, etc. for Office/OpenDocument).

Covered formats (added to ALLOWED_EXTENSIONS):
- Binary (document branch): .pdf, .docx, .xlsx, .pptx, .odt, .ods, .odp, .rtf
- UTF-8 text-compatible (existing text branch): .csv, .tsv, .log, .ics, .eml

Both agentic (`agentic_document`) and classic (`handle_document`) handlers
pass `current_dir` into `FileHandler.handle_document_upload`, so classic
mode gains document support for free.
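
A rough sketch of the extension-based routing described above, assuming a simple set lookup. The two sets mirror the format lists in this commit message; `classify_upload`, the `"text"` and `"unsupported"` labels, and the set names are hypothetical and not taken from FileHandler. The real text branch also handles extensions that were already allowed before this change.

```python
from pathlib import Path

# Formats listed in the commit message above.
DOCUMENT_EXTENSIONS = {
    ".pdf", ".docx", ".xlsx", ".pptx", ".odt", ".ods", ".odp", ".rtf",
}
TEXT_EXTENSIONS = {".csv", ".tsv", ".log", ".ics", ".eml"}


def classify_upload(filename: str) -> str:
    """Pick a handling branch from the file's final extension."""
    suffix = Path(filename).suffix.lower()
    if suffix in DOCUMENT_EXTENSIONS:
        # Binary: persist to .uploads/ and give Claude the absolute path.
        return "document"
    if suffix in TEXT_EXTENSIONS:
        # UTF-8 compatible: decode and inline as before.
        return "text"
    return "unsupported"
```

Keeping the decision in one place is what lets both `agentic_document` and `handle_document` share it.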

The `.uploads/` directory sits inside APPROVED_DIRECTORY, so Claude's
tools can reach it without tool_monitor boundary issues. The file
survives the upload call so follow-up turns can still read or
reference it.

Supersedes RichardAtCT#192 (narrow PDF-only patch that branched directly in
`agentic_document`). This PR routes the same capability through the
proper abstraction and extends it to the whole Office family in one
step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>