Bug Description
When OpenViking is deployed with multiple instances behind a load balancer, account and user information created through the admin API on one instance is not propagated to other instances. Subsequent requests routed to a different instance fail authentication because the in-memory cache of that instance does not contain the newly created account/user.
Steps to Reproduce
- Deploy two OpenViking instances (Instance A and Instance B) behind a round-robin load balancer sharing the same AGFS/VikingFS storage
- Call POST /api/v1/admin/accounts or POST /api/v1/admin/accounts/{account_id}/users — the request lands on Instance A
- Instance A writes the new account/user to AGFS and updates its own in-memory APIKeyManager._accounts
- Send an authenticated request using the new API key — the request is routed to Instance B
- Instance B has no knowledge of the new account/user (its cache was loaded at startup and never refreshed)
Expected Behavior
All instances share a consistent view of accounts and users. A newly created account/user should be immediately usable regardless of which instance handles the next request.
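One way the expected behavior could be approximated is to let each instance re-read the shared storage when it sees an unknown key, and to expire its snapshot after a short TTL so that revoked keys stop working. The sketch below is a toy model only: `RefreshingKeyCache`, `load_all`, and `ttl_seconds` are hypothetical names, not part of OpenViking, and the real `APIKeyManager` may be structured differently.

```python
import time
from typing import Callable, Dict, Optional


class RefreshingKeyCache:
    """Hypothetical API-key cache (key -> user id) backed by shared storage."""

    def __init__(
        self,
        load_all: Callable[[], Dict[str, str]],  # assumed: reads every key from shared storage
        ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load_all = load_all
        self._ttl = ttl_seconds
        self._clock = clock
        self._keys: Dict[str, str] = {}
        self._loaded_at = 0.0
        self._reload()

    def _reload(self) -> None:
        # Take a fresh snapshot of the shared storage.
        self._keys = dict(self._load_all())
        self._loaded_at = self._clock()

    def resolve(self, api_key: str) -> Optional[str]:
        # Expire the snapshot after the TTL so deleted or regenerated keys
        # are rejected after at most ttl_seconds.
        if self._clock() - self._loaded_at >= self._ttl:
            self._reload()
        # Reload once on a miss so keys created on another instance
        # are usable right away.
        if api_key not in self._keys:
            self._reload()
        return self._keys.get(api_key)


# Usage: a plain dict stands in for the shared storage.
shared_storage = {"key-alice": "alice"}
cache = RefreshingKeyCache(lambda: shared_storage, ttl_seconds=5.0)
shared_storage["key-bob"] = "bob"      # created "on another instance"
resolved = cache.resolve("key-bob")    # found via reload-on-miss
```

Reloading on every miss is the simplest option but lets a client with invalid keys trigger repeated storage reads, so a real fix would likely rate-limit the reload or rely on an invalidation signal between instances instead.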
Actual Behavior
Instance B returns an authentication error (401) for the new API key because APIKeyManager._accounts on that instance is stale. The account/user only exists in AGFS (persisted) and in Instance A's memory.
Minimal Reproducible Example
import requests
# Instance A: create a new user
resp = requests.post("http://instance-a/api/v1/admin/accounts/acct1/users", ...)
api_key = resp.json()["api_key"]
# Next request hits Instance B via load balancer
resp = requests.get("http://load-balancer/api/v1/...", headers={"Authorization": f"Bearer {api_key}"})
# → 401 Unauthorized
Error Logs
# Instance B logs
AuthenticationError: API key not found
OpenViking Version
v0.3.22
Python Version
3.12
Operating System
Linux
Model Backend
None
Additional Context
Three failure modes exist:
┌─────────────────────┬──────────────────────────────────────────────────┐
│ Operation           │ Effect on other instances                        │
├─────────────────────┼──────────────────────────────────────────────────┤
│ Create account/user │ New credentials rejected (401)                   │
├─────────────────────┼──────────────────────────────────────────────────┤
│ Regenerate API key  │ Old key still accepted (security risk)           │
├─────────────────────┼──────────────────────────────────────────────────┤
│ Delete user         │ Deleted user still authenticated (security risk) │
└─────────────────────┴──────────────────────────────────────────────────┘
Workaround: Route all admin API requests to a single designated instance, and restart other instances after any account change.
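The three failure modes in the table can be reproduced without a deployment using a toy model in which each instance copies the shared storage once at startup and afterwards updates only its own copy. `SharedStore` and `Instance` are hypothetical stand-ins for AGFS and an OpenViking process; they are not the project's actual classes.

```python
class SharedStore:
    """Stand-in for the persisted storage shared by all instances."""

    def __init__(self) -> None:
        self.keys = {}  # api_key -> user id


class Instance:
    """Stand-in for one server process with a startup-only cache."""

    def __init__(self, store: SharedStore) -> None:
        self.store = store
        self.cache = dict(store.keys)  # loaded once, never refreshed

    def create_user(self, user: str, key: str) -> None:
        self.store.keys[key] = user
        self.cache[key] = user  # only this instance's cache is updated

    def regenerate_key(self, old_key: str, new_key: str) -> None:
        user = self.store.keys.pop(old_key)
        self.store.keys[new_key] = user
        self.cache.pop(old_key, None)
        self.cache[new_key] = user

    def delete_user(self, key: str) -> None:
        self.store.keys.pop(key, None)
        self.cache.pop(key, None)

    def authenticate(self, key: str) -> bool:
        return key in self.cache


store = SharedStore()
store.keys.update({"key-carol": "carol", "key-dave": "dave"})
a, b = Instance(store), Instance(store)

a.create_user("erin", "key-erin")
a.regenerate_key("key-carol", "key-carol-2")
a.delete_user("key-dave")

results = {
    "new key accepted by B": b.authenticate("key-erin"),        # False
    "old key accepted by B": b.authenticate("key-carol"),       # True
    "deleted user accepted by B": b.authenticate("key-dave"),   # True
}
```

In this model Instance A always gives the correct answer, which matches the workaround: pinning admin requests to one instance keeps that instance consistent, while the others stay stale until restarted.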