A generic RAG (Retrieval-Augmented Generation) API that uses RushDB for record vectorization and vector search.
- Generic Record Processing: Index any text field from any record type in RushDB
- Vector Embeddings: Use sentence transformers to create embeddings for semantic search
- RushDB Integration: Add embedding properties directly to existing records
- Vector Search: Search for relevant records using cosine similarity
- FastAPI Interface: RESTful API for easy integration
- Auto-Configuration: Automatic initialization from environment variables
This project uses UV for dependency management. Make sure you have UV installed:
# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone the repository and navigate to the project
cd python-books-rag
# Install dependencies
uv sync
- Copy the example environment file:
cp .env.example .env
- Edit .env and add your RushDB API token:
# Get your API token from https://app.rushdb.com/
RUSHDB_API_TOKEN=your_actual_token_here
- (Optional) Customize other settings in .env:
EMBEDDING_MODEL=all-MiniLM-L6-v2
- Run the application:
uv run python run_app.py
- Or start the API server directly:
uv run uvicorn src.api:app --host 0.0.0.0 --port 8000 --reload
The application will automatically initialize from your .env configuration. The API will be available at http://localhost:8000 with interactive docs at http://localhost:8000/docs.
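As a rough illustration of the automatic initialization described above, the sketch below reads the two settings shown in .env from the process environment. `Settings` and `load_settings` are hypothetical names and not the project's actual src/config.py; the sketch also assumes the variables are already present in the environment (for example, exported by the runner after reading .env).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container (not the project's actual config class)."""
    rushdb_api_token: str
    embedding_model: str


def load_settings(env=None):
    """Read settings from an environment mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    token = env.get("RUSHDB_API_TOKEN", "").strip()
    if not token:
        # Fail early so a missing token is reported at startup, not on first request.
        raise RuntimeError("RUSHDB_API_TOKEN is not set; add it to your .env file")
    # Default model name taken from the .env example above.
    model = env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2")
    return Settings(rushdb_api_token=token, embedding_model=model)
```

Passing a plain dict instead of `os.environ` makes the function easy to exercise without touching the real environment.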
# Navigate to the project directory
cd /path/to/project
# Install dependencies with UV
uv sync
You'll need a RushDB API token. You can get one from:
- RushDB Cloud Dashboard (for cloud instance)
- Your self-hosted RushDB instance
The application provides a RESTful API for record indexing and search. All configuration is handled through environment variables - no manual initialization required.
- Check API status and configuration:
curl http://localhost:8000/
- Health check:
curl http://localhost:8000/health
- Index records:
curl -X POST "http://localhost:8000/index" \
-H "Content-Type: application/json" \
-d '{
"labels": ["Article"],
"field": "content",
"vector_dimension": 384
}'
You can also use more complex search queries for indexing:
curl -X POST "http://localhost:8000/index" \
-H "Content-Type: application/json" \
-d '{
"labels": ["Article"],
"where": {"category": "technology"},
"field": "content",
"vector_dimension": 384,
"limit": 500
}'
- Search records (basic search):
curl -X POST "http://localhost:8000/search" \
-H "Content-Type: application/json" \
-d '{
"labels": ["Article"],
"query": "What is RushDB?",
"limit": 5
}'
- Advanced search with filtering:
curl -X POST "http://localhost:8000/search" \
-H "Content-Type: application/json" \
-d '{
"labels": ["Article"],
"query": "What is RushDB?",
"limit": 5,
"vector_dimension": 384,
"min_score": 0.7,
"offset": 0
}'
All endpoints return JSON responses. The API automatically initializes from your .env configuration on startup.
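The same search request can be sent from Python. The following is a minimal sketch using only the standard library; `build_search_payload` and `post_json` are illustrative helper names, and the field names mirror the curl examples above rather than a verified schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # address used in the curl examples above


def build_search_payload(labels, query, limit=5, min_score=None,
                         offset=0, vector_dimension=None):
    """Assemble the JSON body for POST /search, omitting unset optional fields."""
    payload = {"labels": list(labels), "query": query,
               "limit": limit, "offset": offset}
    if min_score is not None:
        if not 0.0 <= min_score <= 1.0:
            raise ValueError("min_score must be between 0 and 1")
        payload["min_score"] = min_score
    if vector_dimension is not None:
        payload["vector_dimension"] = vector_dimension
    return payload


def post_json(path, payload, base_url=BASE_URL):
    """POST a JSON body and return the decoded JSON response."""
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running, `post_json("/search", build_search_payload(["Article"], "What is RushDB?", min_score=0.7, vector_dimension=384))` should send the same body as the advanced curl example.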
The application adds embedding properties directly to existing records:
Record (e.g., Article)
{
"title": "Sample Article",
"content": "This is the article content...",
"embedding": [0.1, 0.2, 0.3, ...],
// ... other properties
}
- Record Selection: Records are retrieved from RushDB using the provided search query
- Content Extraction: Text from the specified field is extracted
- Vectorization: The content is converted to a vector embedding using sentence transformers
- Storage: The embedding is added as a property to the existing record
- Search: Vector similarity search is performed directly on the records
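The extraction, vectorization, and storage steps above can be sketched on plain dictionaries. This is only an illustration: `toy_embed` is a hash-based stand-in so the example runs offline (it is not a sentence transformer), `index_records` is a hypothetical helper, and in the application the records would be read from and written back to RushDB rather than held in a list.

```python
import hashlib
import math


def toy_embed(text, dimension=8):
    """Stand-in embedder: hashes tokens into a fixed-size, L2-normalized vector.

    This is NOT a sentence transformer; it only keeps the sketch self-contained.
    """
    vector = [0.0] * dimension
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vector[digest[0] % dimension] += 1.0
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector


def index_records(records, field, embed=toy_embed):
    """Add an "embedding" property to every record that has text in `field`."""
    indexed = 0
    for record in records:
        text = record.get(field)
        if not isinstance(text, str) or not text.strip():
            continue  # nothing to vectorize for this record
        record["embedding"] = embed(text)
        indexed += 1
    return indexed
```

Swapping `embed` for a real model's encode function would leave the rest of the loop unchanged, which is the main reason the embedder is passed in as a parameter.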
The application relies on RushDB's vector search, which provides the following features:
- Label-based filtering: Target specific record types
- Vector similarity: Calculate cosine similarity between query and stored embeddings
- Minimum score threshold: Filter out low-relevance results (optional)
- Sorting: Order results by similarity score
- Pagination: Control the number of results returned
Search parameters:
- labels: Labels of records to search
- query: Text query to find similar content
- limit: Maximum number of results to return
- min_score: Minimum similarity threshold (0-1)
- offset: Number of results to skip (for pagination)
- vector_dimension: Control embedding size/quality tradeoff
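To make the effect of `limit`, `min_score`, and `offset` concrete, here is a small local sketch of cosine scoring followed by thresholding, sorting, and pagination. In the application this work is delegated to RushDB; `cosine_similarity` and `rank_records` are hypothetical helpers written for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_records(records, query_vector, limit=5, min_score=None, offset=0):
    """Score, filter, sort, and paginate records that carry an "embedding"."""
    scored = []
    for record in records:
        embedding = record.get("embedding")
        if embedding is None:
            continue  # record has not been indexed yet
        score = cosine_similarity(embedding, query_vector)
        if min_score is None or score >= min_score:
            scored.append((score, record))
    # Highest similarity first, then apply offset/limit as a page window.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[offset:offset + limit]
```

Note that the threshold is applied before pagination, so `offset` counts only records that passed `min_score`.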
# Basic vector search query
# (db is an initialized RushDB client; query_vector is the embedded query text)
results = db.records.find({
"labels": ["Article"],
"aggregate": {
"score": {
"alias": "$record",
"field": "embedding",
"fn": "gds.similarity.cosine",
"query": query_vector
}
},
"orderBy": { "score": "desc" },
"limit": limit
})
- src/rag_engine.py: Core RAG implementation with text processing and RushDB operations
- src/api.py: FastAPI application with REST endpoints
- src/config.py: Configuration management and environment variable handling
- run_app.py: Application runner with testing and server startup
- pyproject.toml: Project configuration and dependencies
- TextProcessor: Handles text vectorization
- RagService: Manages RushDB operations for indexing and search
- FastAPI App: RESTful API with automatic configuration from environment
- Embedding Model: Change the EMBEDDING_MODEL in .env to use different sentence transformer models
- Vector Dimensions: Use the vector_dimension parameter in API requests to specify the embedding dimension:
  - 384: Uses the all-MiniLM-L6-v2 model (faster, smaller embeddings)
  - 768: Uses the all-mpnet-base-v2 model (slower, more accurate embeddings)
- Search Configuration: Modify similarity scoring in the search aggregation
- Record Selection: Specify different record labels and fields to process
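One way to picture the `vector_dimension` option is as a lookup from dimension to model name. The sketch below encodes only the two pairings listed above; `MODEL_BY_DIMENSION` and `model_for_dimension` are hypothetical names, and how the project actually resolves the model may differ.

```python
# Pairings taken from the customization notes above.
MODEL_BY_DIMENSION = {
    384: "all-MiniLM-L6-v2",
    768: "all-mpnet-base-v2",
}


def model_for_dimension(vector_dimension):
    """Return the model name for a supported dimension, or raise ValueError."""
    try:
        return MODEL_BY_DIMENSION[vector_dimension]
    except KeyError:
        supported = ", ".join(str(d) for d in sorted(MODEL_BY_DIMENSION))
        raise ValueError(
            f"unsupported vector_dimension {vector_dimension}; "
            f"expected one of: {supported}"
        ) from None
```

Rejecting unknown dimensions up front avoids mixing embeddings of different sizes in the same field, which would make similarity scores between them meaningless.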
- fastapi: Web framework for the API
- rushdb: RushDB Python SDK
- sentence-transformers: For text embeddings
- uvicorn: ASGI server
- pydantic: Data validation