Resource upload lacks a staging buffer — the fully automatic pipeline has no idempotency guarantee #2560

@xiaoxinova


Version: v0.3.24

Problem

When a resource is uploaded to OpenViking by any method (WebUI drag-and-drop, REST API, MCP tool, or CLI), the system immediately starts the full async pipeline: Parse → VLM summarization (LLM) → Embedding → Vector store. There is no buffer between "resource is ready" and "token consumption starts" in which the user can intervene.

This gap leads to two real-world issues:

Issue 1: Users cannot control token consumption
No matter how the resource is uploaded, LLM + Embedding tasks fire immediately. Users have no chance to verify whether the resource is correct, whether it needs processing, or whether the cost is acceptable. For large files or bulk uploads, there is no way to estimate token cost upfront.

Issue 2: Lack of idempotency causes duplicate resources for API callers (a derivative of Issue 1)
When POST /api/v1/resources is called with the default wait=False, the request returns immediately while the actual processing runs asynchronously in the background. API callers (agents, scripts, CI) that see no immediate result may retry. Each retry creates a fully independent copy (project-docs, project-docs_1, project-docs_2), and each copy runs the full LLM + Embedding pipeline on its own. WebUI manual uploads do not hit this particular issue, but they suffer from the same lack of token control.

Suggested Design

Insert a controllable staging layer between "upload complete" and "LLM/Embedding pipeline":

Current (all upload methods):
  Upload ──→ [Parse ──→ VLM ──→ Embedding ──→ Vector Store]
                 No intervention possible, no idempotency

Expected:
  Upload ──→ [Staging Area] ──→ [Parse ──→ VLM ──→ Embedding ──→ Vector Store]
               ↑                       ↑
          Controllable,           Controllable
          idempotent              (pause/resume/cancel)
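
The staged flow in the diagram above could be modeled as a small transition table. This is only a sketch: the `staged`, `processing`, `completed`, and `failed` states come from this issue, while the `paused` and `cancelled` state names, `TaskState`, and `transition()` are hypothetical and do not exist in task_tracker.py today.

```python
from enum import Enum


class TaskState(Enum):
    STAGED = "staged"
    PROCESSING = "processing"
    PAUSED = "paused"        # assumed name for the pause/resume capability
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"  # assumed name for the cancel capability


# Allowed transitions; any move not listed here is rejected.
ALLOWED = {
    TaskState.STAGED: {TaskState.PROCESSING, TaskState.CANCELLED},
    TaskState.PROCESSING: {
        TaskState.PAUSED,
        TaskState.COMPLETED,
        TaskState.FAILED,
        TaskState.CANCELLED,
    },
    TaskState.PAUSED: {TaskState.PROCESSING, TaskState.CANCELLED},
    TaskState.COMPLETED: set(),
    TaskState.FAILED: set(),
    TaskState.CANCELLED: set(),
}


def transition(current: TaskState, target: TaskState) -> TaskState:
    """Return the new state, or raise ValueError for an illegal move."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the terminal states with empty transition sets makes it explicit that a completed, failed, or cancelled task cannot re-enter the pipeline and consume tokens again.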

Required Capabilities

| Capability | Description |
| --- | --- |
| Staging & trigger separation | After upload, the resource enters the staged state and does NOT start LLM/Embedding. Two trigger modes: ① a configurable pipeline.auto_trigger_seconds countdown (e.g., 30s), or ② an explicit user/caller trigger |
| Idempotency (API scenario) | Repeating a request with the same source + scope while the resource is staged or processing returns the same task_id + root_uri without creating a duplicate |
| Per-stage control | Users can pause/resume/cancel tasks at any stage before tokens are consumed |
| Queryable state | Clear state machine: staged → processing → completed / failed, visible across all upload methods |
| WebUI resource management | List view with resource status, batch trigger/delete for staged resources, queue backlog visibility |
| Token estimation | Show the estimated cost (based on parsed text volume) before triggering LLM processing; require user confirmation |
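
As an illustration of the idempotency capability, a minimal in-memory sketch is shown below. Everything in it is an assumption made for illustration: `StagingRegistry`, `Task`, the SHA-256 key over source + scope, and the plain-string states are hypothetical, and a real implementation would need persistence and locking.

```python
import hashlib
import uuid
from dataclasses import dataclass


@dataclass
class Task:
    task_id: str
    root_uri: str
    state: str = "staged"


class StagingRegistry:
    """In-memory sketch of idempotent submission (hypothetical, not OpenViking code)."""

    # A repeat request is deduplicated only while the task is in one of these states.
    ACTIVE = {"staged", "processing"}

    def __init__(self) -> None:
        self._by_key: dict[str, Task] = {}

    @staticmethod
    def _key(source: str, scope: str) -> str:
        # Assumed idempotency key: hash of scope and source, NUL-separated.
        return hashlib.sha256(f"{scope}\x00{source}".encode("utf-8")).hexdigest()

    def submit(self, source: str, scope: str, root_uri: str) -> Task:
        key = self._key(source, scope)
        existing = self._by_key.get(key)
        if existing is not None and existing.state in self.ACTIVE:
            # Retry of an in-flight upload: same task_id + root_uri, no new copy.
            return existing
        task = Task(task_id=uuid.uuid4().hex, root_uri=root_uri)
        self._by_key[key] = task
        return task
```

With this shape, a retry that would otherwise have been allocated project-docs_1 gets back the original task and its original root_uri, while a re-upload after the first task has completed or failed starts a fresh task.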

Backward Compatibility

Add a config option pipeline.auto_trigger_seconds. The default of 0 preserves existing behavior (process on upload). Setting it to a value greater than 0 enables staging and idempotency. Existing integrations are unaffected.
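
One possible reading of the option's semantics is sketched below. The helper names `initial_state()` and `is_due()` are hypothetical, and timestamps are plain floats (seconds) so the logic can be exercised without a real clock.

```python
from typing import Optional, Tuple


def initial_state(auto_trigger_seconds: float, now: float) -> Tuple[str, Optional[float]]:
    """Return (state, trigger_deadline) for a freshly uploaded resource."""
    if auto_trigger_seconds <= 0:
        # Default (0): existing behavior, process immediately on upload.
        return "processing", None
    # Staging enabled: hold until the countdown expires or a caller triggers it.
    return "staged", now + auto_trigger_seconds


def is_due(state: str, deadline: Optional[float], now: float) -> bool:
    """True when a staged resource's countdown has expired."""
    return state == "staged" and deadline is not None and now >= deadline
```

A scheduler loop could call `is_due()` periodically and move expired staged resources to processing, while an explicit trigger would bypass the deadline entirely.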

Acceptance Criteria

  • WebUI upload with staging enabled: resource shows as staged, does NOT trigger LLM/Embedding
  • API POST /api/v1/resources upload with staging enabled: resource shows as staged
  • Countdown expiry or manual trigger moves resource to processing
  • staged or processing resources can be cancelled/deleted
  • Same resource re-uploaded within staged/processing period does NOT create a duplicate copy (API scenario)
  • Pipeline can be paused/resumed (queue-level control)
  • Token estimate shown before LLM processing begins
  • Default behavior unchanged (auto_trigger_seconds=0)
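
The token-estimate criterion could be approximated from parsed text volume, as in the rough sketch below. The ratio of about 4 characters per token is an assumption that varies by tokenizer and language, and `estimate_tokens()` and `trigger_if_confirmed()` are hypothetical names, not existing OpenViking functions.

```python
import math
from typing import Callable

# Assumption: rough average for English text; real tokenizers differ.
CHARS_PER_TOKEN = 4


def estimate_tokens(parsed_text: str) -> int:
    """Rough token estimate based only on the length of the parsed text."""
    return math.ceil(len(parsed_text) / CHARS_PER_TOKEN)


def trigger_if_confirmed(parsed_text: str, confirm: Callable[[int], bool]) -> bool:
    """Pass the estimate to `confirm`; proceed only if it returns True."""
    return confirm(estimate_tokens(parsed_text))
```

The `confirm` callback stands in for whatever surface asks the user: a WebUI dialog, a CLI prompt, or an API caller applying its own budget threshold.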

Affected Code

| File | Current State |
| --- | --- |
| routers/resources.py / web_studio/ | All upload entry points directly enter processing, no staging |
| resource_processor.py:reserve_unique_candidate() | Allocates new URI on name conflict, never checks for duplicate content |
| routers/tasks.py | Read-only (GET only), no control endpoints |
| task_tracker.py | State machine only supports create/start/complete/fail; no staged/cancel/pause/resume |
| embedding_queue.py | No deduplication |
| semantic_queue.py | 45s dedup window only for context_type == "memory", not applicable to resources |
