Resource upload lacks a staging buffer — the fully automatic pipeline has no idempotency guarantee
Version: v0.3.24
Problem
After a resource is uploaded to OpenViking via any method (WebUI drag-and-drop, REST API, MCP tool, or CLI), the system immediately enters the full async pipeline: Parse → VLM summarization (LLM) → Embedding → Vector store. There is no buffer between "resource is ready" and "token consumption starts" where the user can intervene.
This gap leads to two real-world issues:
Issue 1: Users cannot control token consumption
No matter how the resource is uploaded, LLM + Embedding tasks fire immediately. Users have no chance to verify whether the resource is correct, whether it needs processing, or whether the cost is acceptable. For large files or bulk uploads, there is no way to estimate token cost upfront.
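For a sense of what an upfront estimate could look like, here is a minimal sketch that derives a token count from text volume. The function name estimate_tokens and the 4-characters-per-token ratio are assumptions for illustration (a common rule of thumb for English text), not anything OpenViking provides today. Real tokenizers and multi-pass pipelines will differ.

```python
import math

# Assumption: a rough rule of thumb for English text, not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(texts: list[str], chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Estimate the token count for a batch of parsed text chunks."""
    total_chars = sum(len(t) for t in texts)
    return math.ceil(total_chars / chars_per_token)


# Two hypothetical parsed documents: 1000 and 2500 characters.
docs = ["a" * 1000, "b" * 2500]
print(estimate_tokens(docs))  # 875
```

Even a coarse figure like this would let a user decide whether a bulk upload is worth processing before any LLM or embedding call is made.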
Issue 2: Lack of idempotency causes duplicate resources for API callers (a derivative of Issue 1)
When POST /api/v1/resources is called with the default wait=False, the request returns instantly while actual processing happens asynchronously in the background. API callers (agents, scripts, CI) that don't see immediate results may retry. Each retry creates a fully independent copy (project-docs/ → project-docs_1/ → project-docs_2/), each running the full LLM + Embedding pipeline independently. WebUI manual uploads don't hit this particular issue, but they suffer the same lack of token control.
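The duplication pattern can be reproduced with a toy model of suffix-based name allocation. This is a sketch of the observed behavior only. The helper reserve_unique_name is hypothetical and is not the actual reserve_unique_candidate() implementation.

```python
def reserve_unique_name(existing: set[str], base: str) -> str:
    """Return base if it is free, otherwise base_1, base_2, ... (hypothetical helper)."""
    if base not in existing:
        candidate = base
    else:
        n = 1
        while f"{base}_{n}" in existing:
            n += 1
        candidate = f"{base}_{n}"
    existing.add(candidate)
    return candidate


# One original request plus two retries: nothing recognizes them as the same upload.
resources: set[str] = set()
created = [reserve_unique_name(resources, "project-docs") for _ in range(3)]
print(created)  # ['project-docs', 'project-docs_1', 'project-docs_2']
```

Because allocation only looks at name conflicts, every retry is treated as a new resource and pays the full pipeline cost again.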
Suggested Design
Insert a controllable staging layer between "upload complete" and "LLM/Embedding pipeline":
    Current (all upload methods):
      Upload ──→ [Parse ──→ VLM ──→ Embedding ──→ Vector Store]
                 No intervention possible, no idempotency

    Expected:
      Upload ──→ [Staging Area] ──→ [Parse ──→ VLM ──→ Embedding ──→ Vector Store]
                       ↑                               ↑
                 Controllable,                    Controllable
                 idempotent                   (pause/resume/cancel)
Required Capabilities
| Capability | Description |
| --- | --- |
| Staging & trigger separation | After upload, the resource enters the staged state instead of going straight to LLM/Embedding. Two trigger modes: ① a configurable pipeline.auto_trigger_seconds countdown (e.g., 30s), or ② an explicit user/caller trigger |
| Idempotency (API scenario) | Repeating the same source + scope request while the task is staged/processing returns the same task_id + root_uri without creating a duplicate |
| Per-stage control | Users can pause/resume/cancel tasks at any stage before tokens are consumed |
| Queryable state | Clear state machine: staged → processing → completed / failed, visible across all upload methods |
| WebUI resource management | List view with resource status, batch trigger/delete for staged resources, queue backlog visibility |
| Token estimation | Show estimated cost (based on parsed text volume) before triggering LLM processing; require user confirmation |
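The staging, idempotency, and state-machine rows above could fit together roughly as follows. This is a sketch under stated assumptions, not a proposed implementation: StagingRegistry, Task, the cancelled state, the hashed source + scope key, and the root_uri format are all hypothetical. Pause/resume is left out for brevity.

```python
import hashlib
import uuid
from dataclasses import dataclass
from enum import Enum


class State(str, Enum):
    STAGED = "staged"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"  # assumption: the issue names cancel as an action, not a state


# Allowed transitions; terminal states have no outgoing edges.
TRANSITIONS = {
    State.STAGED: {State.PROCESSING, State.CANCELLED},
    State.PROCESSING: {State.COMPLETED, State.FAILED, State.CANCELLED},
}


@dataclass
class Task:
    task_id: str
    root_uri: str
    state: State = State.STAGED


class StagingRegistry:
    """In-memory registry keyed by a hash of scope + source (hypothetical)."""

    def __init__(self) -> None:
        self._tasks: dict[str, Task] = {}

    @staticmethod
    def _key(source: str, scope: str) -> str:
        return hashlib.sha256(f"{scope}\n{source}".encode()).hexdigest()

    def submit(self, source: str, scope: str) -> Task:
        key = self._key(source, scope)
        task = self._tasks.get(key)
        if task is not None and task.state in (State.STAGED, State.PROCESSING):
            return task  # a retry gets the same task_id and root_uri back
        # Hypothetical URI format, for illustration only.
        task = Task(task_id=uuid.uuid4().hex, root_uri=f"{scope}/{source}")
        self._tasks[key] = task
        return task

    def transition(self, task: Task, new_state: State) -> None:
        if new_state not in TRANSITIONS.get(task.state, set()):
            raise ValueError(f"illegal transition {task.state.value} -> {new_state.value}")
        task.state = new_state


registry = StagingRegistry()
first = registry.submit("project-docs", "team-a")
retry = registry.submit("project-docs", "team-a")
print(retry is first, first.state.value)  # True staged
```

Once a task reaches a terminal state, a later submit with the same source + scope creates a fresh task, so deliberate re-uploads still work while in-flight retries are collapsed.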
Backward Compatibility
Add a config option pipeline.auto_trigger_seconds. The default, 0, preserves existing behavior (process on upload); a value greater than 0 enables staging and idempotency. Existing integrations are unaffected.
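A minimal sketch of how the option might gate the initial state. Only the name auto_trigger_seconds and the meaning of 0 come from this issue; PipelineConfig, initial_state, is_due, and the handling of negative values are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PipelineConfig:
    auto_trigger_seconds: float = 0  # 0 keeps the current process-on-upload behavior


def initial_state(cfg: PipelineConfig, now: float) -> tuple[str, Optional[float]]:
    """Return (state, auto_trigger_deadline) for a freshly uploaded resource."""
    if cfg.auto_trigger_seconds <= 0:  # assumption: negative values behave like 0
        return "processing", None
    return "staged", now + cfg.auto_trigger_seconds


def is_due(deadline: Optional[float], now: float) -> bool:
    """True once a staged resource's countdown has elapsed."""
    return deadline is not None and now >= deadline


legacy = initial_state(PipelineConfig(), now=100.0)
staged = initial_state(PipelineConfig(auto_trigger_seconds=30), now=100.0)
print(legacy)  # ('processing', None)
print(staged)  # ('staged', 130.0)
```

Passing the clock in as an argument keeps the countdown logic easy to test without sleeping.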
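Acceptance Criteria
- After upload, the resource shows as staged and does NOT trigger LLM/Embedding
- POST /api/v1/resources upload: resource shows as staged
- After the countdown or an explicit trigger, the resource moves to processing
- staged or processing resources can be cancelled/deleted
- Repeating the same request during the staged/processing period does NOT create a duplicate copy (API scenario)
- Existing behavior is preserved by default (auto_trigger_seconds=0)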
Affected Code
| File | Current State |
| --- | --- |
| routers/resources.py / web_studio/ | All upload entry points directly enter processing, no staging |
| resource_processor.py:reserve_unique_candidate() | Allocates new URI on name conflict, never checks for duplicate content |
| routers/tasks.py | Read-only (GET only), no control endpoints |
| task_tracker.py | State machine only supports create/start/complete/fail; no staged/cancel/pause/resume |
| embedding_queue.py | No deduplication |
| semantic_queue.py | 45s dedup window only for context_type == "memory", not applicable to resources |