Default semantic_embedding_cache_dir to a persistent platform-aware path #681

@cderv

Description

Problem

semantic_embedding_cache_dir defaults to None, which means fastembed falls back to its own default: a subdirectory of the system temp folder (%TEMP%\fastembed_cache on Windows, /tmp/fastembed_cache on Linux). Temp directories are not persistent: OS-level cleanup can corrupt or remove the cached ONNX model, breaking semantic search until the user manually clears the cache and re-downloads the model.

The fastembed cache uses symlinks internally (snapshot files pointing to blobs), which makes partial temp cleanup particularly destructive — the cache directory can survive while the model inside it becomes unloadable.
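
To check whether a cache directory is in this half-cleaned state, a minimal sketch like the following can list symlinks whose targets are gone. The function name find_dangling_symlinks is hypothetical and is not part of fastembed or basic-memory. It relies only on the symlink behavior described above, not on fastembed's exact directory layout.

```python
import os


def find_dangling_symlinks(cache_dir):
    """Return sorted paths of symlinks under cache_dir whose targets no longer exist.

    Hypothetical helper for diagnosis only; not a fastembed or basic-memory API.
    """
    dangling = []
    # os.walk does not follow symlinked directories by default, so links
    # show up in dirnames or filenames and can be inspected directly.
    for dirpath, dirnames, filenames in os.walk(cache_dir):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() is true for the link itself; exists() follows the
            # link and is false when the target has been removed.
            if os.path.islink(path) and not os.path.exists(path):
                dangling.append(path)
    return sorted(dangling)
```

A non-empty result would suggest the directory survived cleanup while the blobs it points to did not, which matches the failure mode described here.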

This is tracked upstream as qdrant/fastembed#569.

Suggestion

basic-memory could default semantic_embedding_cache_dir to a platform-appropriate persistent path instead of None:

  • Windows: %LOCALAPPDATA%\fastembed_cache
  • Linux/macOS: ~/.cache/fastembed

Something like platformdirs.user_cache_dir("fastembed") would handle this cross-platform. This would insulate basic-memory users from fastembed's temp-dir default regardless of when (or if) the upstream issue is fixed.

fastembed also supports a FASTEMBED_CACHE_PATH environment variable as an override, so users who set that would still be covered.
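
A rough sketch of how such a default could be resolved, using only the standard library. The function name default_embedding_cache_dir and the precedence order (explicit setting, then FASTEMBED_CACHE_PATH, then a platform default) are assumptions for illustration, not basic-memory's actual behavior. The hand-written platform branch mirrors the two paths listed above. Swapping in platformdirs.user_cache_dir("fastembed") would be the alternative, though its results may differ from those paths (on macOS it uses ~/Library/Caches, for example).

```python
import os
import sys
from pathlib import Path


def default_embedding_cache_dir(configured=None, env=None):
    """Pick a persistent cache directory for the embedding model.

    Hypothetical helper; the precedence order is an assumption:
    1. an explicitly configured semantic_embedding_cache_dir
    2. the FASTEMBED_CACHE_PATH environment variable
    3. a per-user, platform-specific cache path
    """
    env = os.environ if env is None else env

    if configured:
        return Path(configured).expanduser()

    override = env.get("FASTEMBED_CACHE_PATH")
    if override:
        return Path(override).expanduser()

    if sys.platform == "win32":
        # Fall back to the conventional location if LOCALAPPDATA is unset.
        base = env.get("LOCALAPPDATA") or str(Path.home() / "AppData" / "Local")
        return Path(base) / "fastembed_cache"

    return Path.home() / ".cache" / "fastembed"
```

Checking the environment variable explicitly matters if basic-memory always passes a cache directory to fastembed, since an explicit argument would presumably take priority over the variable inside fastembed itself.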

Environment

  • OS: Windows 11 Pro
  • Basic Memory version: 0.19.2
  • Installation method: uv tool
