The Vector Volume Discovery Pipeline is a multimodal retrieval system for searching document images using dense vector similarity. It encodes each page into ColBERT-style multi-vector embeddings and stores them in Qdrant for fast, semantic search. The system supports both image and text queries and routes top-K results through vision-language models for context-aware generation. The project also benchmarks four similarity functions (cosine, dot product, Euclidean, and Manhattan) to evaluate retrieval quality on real-world textbook queries. This repository includes code to host the models, generate embeddings, and run a FastAPI backend server for retrieval and generation tasks.
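To illustrate the ColBERT-style late-interaction scoring the pipeline relies on, the sketch below computes a MaxSim score between a query's token embeddings and a page's patch embeddings: each query token is matched to its most similar page patch, and the maxima are summed. The shapes, the `maxsim_score` helper, and the toy data are illustrative assumptions, not code from this repository.

```python
import numpy as np

def maxsim_score(query_vecs, page_vecs):
    """Late-interaction (MaxSim) score between one query and one page.

    query_vecs: (num_query_tokens, dim) query token embeddings
    page_vecs:  (num_page_patches, dim) page patch embeddings
    """
    # Normalise rows so the dot product below is cosine similarity.
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    p = page_vecs / np.linalg.norm(page_vecs, axis=1, keepdims=True)
    sim = q @ p.T  # (num_query_tokens, num_page_patches)
    # Each query token keeps its best-matching patch; the maxima are summed.
    return float(sim.max(axis=1).sum())

# Toy usage: rank two candidate pages for one query.
rng = np.random.default_rng(0)
query = rng.normal(size=(16, 128))
pages = [rng.normal(size=(1024, 128)) for _ in range(2)]
best_page = max(range(len(pages)), key=lambda i: maxsim_score(query, pages[i]))
```

The other benchmarked similarity functions fit the same pattern by swapping the per-token comparison (taking the minimum rather than the maximum for Euclidean or Manhattan distance).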
- colpali_standalone/ – Scripts to host ColPali model and embedding functions
- llm_vision_models/ – Hosting scripts for LLaMA 3.2 Vision and Paligemma for response generation
- backend/ – FastAPI server that connects to all models and Qdrant for inference, search, and generation
- Colpali_Image_Embeddings_v1.ipynb – Colab/Notebook version for testing embedding workflows
- evaluation_scores.ipynb – Computes Precision@K, Recall@K, F1@K, AvgPrecision, and MRR for multiple similarity functions
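As a rough reference for the metrics the evaluation notebook reports, per-query values can be computed along the following lines; the `retrieved` ranked list and `relevant` ID set are hypothetical inputs for illustration, not the notebook's actual data structures.

```python
def precision_recall_f1_at_k(retrieved, relevant, k):
    """Precision@K, Recall@K, and F1@K for a single query.

    retrieved: ranked list of document IDs returned by the search
    relevant:  set of document IDs judged relevant for the query
    """
    top_k = retrieved[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

def average_precision(retrieved, relevant):
    """Average precision over the full ranked list for a single query."""
    hits, running_sum = 0, 0.0
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            hits += 1
            running_sum += hits / rank
    return running_sum / len(relevant) if relevant else 0.0

def reciprocal_rank(retrieved, relevant):
    """Reciprocal rank of the first relevant hit; MRR averages this over queries."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0
```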
Prerequisites:

- Python 3.8+
- Pip
- Qdrant running locally or remotely
- Access to a GPU (recommended) for model inference
git clone <repository-url>
cd vector-volume-discovery-pipeline
pip install -r requirements.txt

Navigate to the respective directories and run the hosting scripts. Each example below binds to port 8000, so adjust the port (or host) if you serve more than one model on the same machine.
cd colpali_standalone
python -m uvicorn colpali_host_script:app --host 0.0.0.0 --port 8000 --reload

cd llm_vision_models
python -m uvicorn llama_3_2_vision_host_script:app --host 0.0.0.0 --port 8000 --reload

cd llm_vision_models
python -m uvicorn paligemma_host_script:app --host 0.0.0.0 --port 8000 --reload

The backend module powers inference workflows by connecting to the hosted models and Qdrant.
backend/
├── api_router.py # Central FastAPI router for model & Qdrant interaction
├── colpali_client.py # API client to interact with hosted ColPali model
├── llama_client.py # API client for LLaMA 3.2 Vision
├── paligemma_client.py # API client for Paligemma model
├── qdrant_client.py # Functions to upsert, search, and manage vectors in Qdrant
├── utils.py # Image preprocessing, base64 conversion, and helper functions
└── app.py # FastAPI entrypoint for backend server
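The image preprocessing and base64 conversion handled by utils.py typically look something like the snippet below; the function names and the use of Pillow are illustrative assumptions rather than the repository's actual helpers.

```python
import base64
from io import BytesIO

from PIL import Image  # assumes Pillow is installed

def image_to_base64(path, max_side=1024):
    """Load a page image, downscale it, and return a base64 string for JSON payloads."""
    img = Image.open(path).convert("RGB")
    img.thumbnail((max_side, max_side))  # keep payloads small for the hosted models
    buf = BytesIO()
    img.save(buf, format="PNG")
    return base64.b64encode(buf.getvalue()).decode("utf-8")

def base64_to_image(data):
    """Inverse of image_to_base64: decode a base64 string back to a PIL image."""
    return Image.open(BytesIO(base64.b64decode(data))).convert("RGB")
```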
To run the backend server:

cd backend
uvicorn app:app --host 0.0.0.0 --port 8080 --reload

The backend exposes two main operations.

Embedding a document page:
- Sends a document image to ColPali
- Returns the multi-vector embedding and optionally inserts it into Qdrant

Searching the collection:
- Accepts a query vector or image
- Returns the top-K most relevant document pages from Qdrant
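A client-side call against the backend might look like the following; the route names (`/embed`, `/search`) and payload fields are assumptions for illustration, so check api_router.py for the actual contract.

```python
import requests

BACKEND_URL = "http://localhost:8080"  # matches the uvicorn command above

# Embed a document page and optionally upsert it into Qdrant (hypothetical route and fields).
with open("page_001.png", "rb") as f:
    resp = requests.post(
        f"{BACKEND_URL}/embed",
        files={"image": f},
        data={"upsert": "true"},
        timeout=120,
    )
resp.raise_for_status()
embedding = resp.json()

# Retrieve the top-K most relevant pages for a text query (hypothetical route and fields).
resp = requests.post(
    f"{BACKEND_URL}/search",
    json={"query": "Explain eigenvalues with a diagram", "top_k": 5},
    timeout=120,
)
resp.raise_for_status()
results = resp.json()
```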
Optional .env file configuration:
COLPALI_API_URL=http://localhost:8000
LLAMA_API_URL=http://localhost:8000
PALIGEMMA_API_URL=http://localhost:8000
QDRANT_URL=http://localhost:6333
QDRANT_COLLECTION=vector_volume_pages
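If you need to create the collection yourself, a minimal setup with a recent qdrant-client release could look like the sketch below; the per-token vector size and the MAX_SIM multivector comparator are assumptions based on the ColBERT-style embeddings, so match them to the actual ColPali output.

```python
import os

from qdrant_client import QdrantClient, models

client = QdrantClient(url=os.getenv("QDRANT_URL", "http://localhost:6333"))
collection = os.getenv("QDRANT_COLLECTION", "vector_volume_pages")

# Create the multi-vector collection once, if it does not already exist.
if not client.collection_exists(collection):
    client.create_collection(
        collection_name=collection,
        vectors_config=models.VectorParams(
            size=128,  # assumed per-token embedding dimension
            distance=models.Distance.COSINE,
            multivector_config=models.MultiVectorConfig(
                comparator=models.MultiVectorComparator.MAX_SIM
            ),
        ),
    )
```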