fix: buffer Telegram photo-album messages into a single Claude request by IliyaBrook · Pull Request #188 · RichardAtCT/claude-code-telegram

IliyaBrook · 2026-04-18T15:59:14Z

Follow-up to the discussion in #186 — @MatveyF flagged that the same
"one logical message, N responses" problem happens with images:

if I attach images to my message I get a response per image.
E.g. if I send a piece of text with 3 images it will respond 3 times —
one for the text and the first image, second for the second image,
and third for the third image

Root cause

When a user sends a Telegram album (text + N photos), Telegram delivers
it as N separate Updates that share a common message.media_group_id;
the caption lives on only one of them. The agentic photo handler fires
once per Update, so Claude is invoked N times independently instead of
seeing the album as a single message.
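
To make the delivery pattern concrete, here is a minimal sketch of how album messages could be grouped by their shared `media_group_id`. `FakePhotoMessage`, `group_album` and `album_caption` are hypothetical names used only for illustration; they are not part of the PR or of any Telegram library, and the real handler works on Telegram `Update` objects rather than this stand-in.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FakePhotoMessage:
    """Hypothetical stand-in for the few photo-message fields used here."""
    message_id: int
    media_group_id: Optional[str]
    caption: Optional[str]


def group_album(messages: List[FakePhotoMessage]) -> Dict[str, List[FakePhotoMessage]]:
    """Group messages by media_group_id; a standalone photo forms its own group."""
    groups: Dict[str, List[FakePhotoMessage]] = defaultdict(list)
    for msg in messages:
        # Standalone photos have no media_group_id, so give each a unique key.
        key = msg.media_group_id or f"single:{msg.message_id}"
        groups[key].append(msg)
    return dict(groups)


def album_caption(messages: List[FakePhotoMessage]) -> Optional[str]:
    """Return the first non-empty caption in the album, or None."""
    return next((m.caption for m in messages if m.caption), None)


# Three messages from one album: only the first carries the caption.
album = [
    FakePhotoMessage(101, "g1", "What is wrong with these screenshots?"),
    FakePhotoMessage(102, "g1", None),
    FakePhotoMessage(103, "g1", None),
]
groups = group_album(album)
```

Handling each message on arrival corresponds to three separate Claude calls; grouping first yields one batch of three photos plus the single caption.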

Fix

- New MediaGroupBuffer (src/bot/utils/media_group_buffer.py):
  debounces photos keyed by (user_id, chat_id, thread_id, media_group_id).
  Any photo carrying a media_group_id is buffered; a short timer
  (default 1.0s, configurable via MEDIA_GROUP_BUFFER_TIMEOUT, range
  0.3 – 5.0) fires after the last photo and flushes all photos +
  caption as a single payload.
- agentic_photo refactored into a dispatcher: standalone photos take
  the existing fast path; album photos go through the buffer.
- Shared work extracted to _process_photo_batch: the first photo is
  processed with the caption (so the prompt template keeps user intent),
  the rest are processed for image data only, and a single
  claude_integration.run_command is issued with all images attached.
- The Stop button cancels any pending media-group buffers for the user.
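
The debounce behaviour described above can be sketched roughly as follows. This is not the PR's MediaGroupBuffer: the class name `DebounceBuffer`, its methods, the `clamp_timeout` helper and the flush-callback signature are all assumptions made for illustration. Only the key shape, the 1.0s default and the 0.3 – 5.0 range come from the description.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Optional, Tuple

# (user_id, chat_id, thread_id, media_group_id), as in the PR description.
Key = Tuple[int, int, Optional[int], str]
# Hypothetical callback shape: receives all photos plus the album caption.
FlushCallback = Callable[[List[bytes], Optional[str]], Awaitable[None]]


def clamp_timeout(value: float, low: float = 0.3, high: float = 5.0) -> float:
    """Keep the configured timeout inside the documented 0.3 - 5.0 range."""
    return max(low, min(high, value))


class DebounceBuffer:
    """Illustrative debounce buffer; must be used inside a running event loop."""

    def __init__(self, on_flush: FlushCallback, timeout: float = 1.0) -> None:
        self._on_flush = on_flush
        self._timeout = clamp_timeout(timeout)
        self._photos: Dict[Key, List[bytes]] = {}
        self._captions: Dict[Key, Optional[str]] = {}
        self._timers: Dict[Key, asyncio.Task] = {}

    def add(self, key: Key, photo: bytes, caption: Optional[str] = None) -> None:
        """Buffer one photo and restart the timer for its album."""
        self._photos.setdefault(key, []).append(photo)
        if caption:
            self._captions[key] = caption
        pending = self._timers.pop(key, None)
        if pending is not None:
            pending.cancel()  # a newer photo arrived: push the flush back
        self._timers[key] = asyncio.create_task(self._flush_later(key))

    async def _flush_later(self, key: Key) -> None:
        await asyncio.sleep(self._timeout)
        # Timer expired without a newer photo: hand over the whole album once.
        self._timers.pop(key, None)
        photos = self._photos.pop(key, [])
        caption = self._captions.pop(key, None)
        await self._on_flush(photos, caption)

    def cancel_user(self, user_id: int) -> int:
        """Drop every pending album for a user (e.g. on Stop); return the count."""
        keys = [k for k in self._timers if k[0] == user_id]
        for k in keys:
            self._timers.pop(k).cancel()
            self._photos.pop(k, None)
            self._captions.pop(k, None)
        return len(keys)
```

Each call to `add` cancels the previous timer for the same key, so the flush callback runs once, after the album has been quiet for the full timeout. Including user_id in the key is what lets `cancel_user` discard only that user's pending albums.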

…oup batching

Three-piece hybrid that preserves upstream's native multimodal SDK content blocks while adding Cortex-specific Obsidian vault integration.

## 1. .media.telegram/ persistence layer

New MediaArchive helper saves every Telegram-uploaded image, document and voice file to a vault-relative archive directory before forwarding to Claude:

    .media.telegram/images/<chat_id>_<message_id>.<ext>
    .media.telegram/pdfs/<chat_id>_<message_id>_<original_name>
    .media.telegram/documents/<chat_id>_<message_id>_<original_name>
    .media.telegram/audios/<timestamp>/{received,sent}.{ogg,txt}

Image bytes still flow into the SDK as base64 content blocks (upstream's v1.6.0 native multimodal behaviour), but the prompt is augmented with the saved path so Claude can reference the file via Obsidian ``![[name]]`` wiki-links if a note is being written.

Voice handler now takes an optional media_archive and persists received audio + transcript into a fresh paired-audio dir whose path it returns on ProcessedVoice; the TTS reply later writes its sent.* peer into the same dir, keeping voice exchanges grouped on disk.

Settings: MEDIA_ARCHIVE_ENABLED, MEDIA_ARCHIVE_DIR.

## 2. Post-turn ![[...]] reference detector → 5-Attachments promotion

After every Claude turn (text, document, photo, voice paths), the orchestrator collects .md file paths Claude touched via Edit/Write/MultiEdit calls and scans them for embed references (``![[name]]``). Any match whose target lives in the archive is copied into

    5-Attachments/<type>/YYYY-MM/<name>

Both copies remain: the archive copy stays as the raw record (gitignored), the attachments copy becomes what Obsidian renders. Copy is content-hash idempotent: re-running on the same note never duplicates work. Type subdirs map archive layout (images / pdfs / documents / audios) with a fallback by file extension.

Settings: ATTACHMENT_PROMOTE_ENABLED, ATTACHMENT_DIR.

## 3. Media-group batching

Telegram delivers a multi-file selection (media group) as N separate Update events sharing the same media_group_id. Without batching, the default agentic handlers fire one Claude session and one reply per item, so N "Working..." messages race on the same session id.

New MediaGroupBuffer keyed on (chat_id, media_group_id) downloads each item via the existing flow, debounces for TELEGRAM_MEDIA_GROUP_WINDOW_SECONDS (default 2.5s), then runs Claude once with a combined prompt listing every saved path and replies once. Single-file uploads (media_group_id is None) keep the existing fast path. The buffer stays UI-agnostic: the orchestrator supplies the flush callback that builds SDK content blocks and dispatches through ``_handle_agentic_media_message``.

Settings: MEDIA_GROUP_WINDOW_SECONDS, MEDIA_GROUP_MAX_FILES.

## Upstream-PR posture

Pieces 1 and 3 are generic enough to upstream against RichardAtCT/claude-code-telegram (PR RichardAtCT#188 already proposes media-group batching with a different shape). Piece 2 is Cortex-specific because it assumes an Obsidian vault layout, so it stays local.

## Tests

- tests/unit/test_bot/test_media_archive.py: save_image / save_document / pair_dir uniqueness / sanitize / transcript optional
- tests/unit/test_bot/test_attachment_promoter.py: collect_modified_md_paths filters / multi-input-key support / promote happy path / disabled / idempotent / pdf kind
- tests/unit/test_media_group_buffer.py: buffer happy path, debounce, cap, cancellation
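
For piece 2 of the commit message above, a content-hash idempotent promotion step might look roughly like the sketch below. The function names (`find_embeds`, `file_digest`, `promote`), the regular expression and the choice of SHA-256 are assumptions for illustration, not the commit's actual code; the YYYY-MM subdirectory and the type-based routing are left out to keep the example short.

```python
import hashlib
import re
import shutil
from pathlib import Path
from typing import List

# Matches the target of an Obsidian embed such as ![[photo.jpg]] or
# ![[photo.jpg|200]]; plain [[links]] without the leading "!" are ignored.
EMBED_RE = re.compile(r"!\[\[([^\]|#]+)")


def find_embeds(markdown: str) -> List[str]:
    """Return the embed targets found in a note's text, in order."""
    return [name.strip() for name in EMBED_RE.findall(markdown)]


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents (hash choice is an assumption)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def promote(src: Path, dest_dir: Path) -> bool:
    """Copy src into dest_dir unless an identical file is already there.

    Returns True when a copy was made, False when it was skipped.
    """
    dest = dest_dir / src.name
    if dest.exists() and file_digest(dest) == file_digest(src):
        return False  # same content already promoted: nothing to do
    dest_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dest)
    return True
```

Calling `promote` twice on the same file copies once and then skips, which is the property that keeps repeated scans of the same note from duplicating work.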

fix: buffer Telegram photo-album messages into a single Claude request

c585a36

IliyaBrook mentioned this pull request Apr 18, 2026

Bug: Concat messages that were chunked by Telegram #186

Open

IliyaBrook wants to merge 1 commit into RichardAtCT:main from IliyaBrook:fix/186-buffer-photo-album
