A demonstration of RAG (Retrieval-Augmented Generation) technology using Spring AI with Ollama and PGVector.
This project implements a RAG system that lets users ask questions about PDF documents and receive contextualized answers generated by an AI model. The system uses a vector database to store and search for relevant information within the documents.
- Analyzed Document: "La Fortune des Rougon" by Émile Zola
- Chat Model: Llama 3.1 8B via Ollama
- Embedding Model: nomic-embed-text
- Vector Database: PGVector (PostgreSQL)
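Retrieval works by comparing embedding vectors: the chunks whose embeddings lie closest to the question's embedding are returned as context. As a minimal, self-contained illustration of the similarity measure involved (this is not project code; PGVector performs the equivalent computation internally), cosine similarity between two vectors can be computed like this:

```java
public class CosineSimilarity {

    // Cosine similarity: dot(a, b) / (|a| * |b|); close to 1.0 means similar direction
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Tiny made-up 3-dimensional "embeddings" (real ones have hundreds of dimensions)
        double[] question = {0.2, 0.7, 0.1};
        double[] chunkA   = {0.2, 0.6, 0.2};  // similar direction -> higher score
        double[] chunkB   = {0.9, 0.0, 0.1};  // different direction -> lower score
        System.out.printf("chunkA: %.3f%n", cosine(question, chunkA));
        System.out.printf("chunkB: %.3f%n", cosine(question, chunkB));
    }
}
```

The chunk with the higher score would be selected as context for the LLM.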
- Java 21
- Spring Boot 3.5.3
- Spring AI 1.0.0
- Ollama (for LLM models)
- PostgreSQL with PGVector extension
- Docker Compose (for infrastructure)
- `spring-ai-starter-model-ollama` - Integration with Ollama
- `spring-ai-pdf-document-reader` - PDF reading and processing
- `spring-ai-starter-vector-store-pgvector` - Vector database
- `spring-ai-starter-model-chat-memory` - Conversation memory management
- `spring-ai-advisors-vector-store` - Advisors for vector search
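In a Maven build these map to `pom.xml` entries roughly like the following (a sketch only; artifact IDs are taken from the list above, and versions are assumed to be managed by the Spring AI BOM):

```xml
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-ollama</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-pdf-document-reader</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-vector-store-pgvector</artifactId>
</dependency>
```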
- Java 21 or higher
- Maven 3.6+
- Docker and Docker Compose
- Ollama installed locally
```shell
# macOS
brew install ollama

# Start the service
ollama serve

# Download required models
ollama pull llama3.1:8b
ollama pull nomic-embed-text
```

Clone the repository:

```shell
git clone <repository-url>
cd spring-ai-RAG-demo
```

Start the infrastructure (PostgreSQL with PGVector):

```shell
cd src/main/docker
docker-compose up -d
```

Verify that Ollama is reachable:

```shell
curl http://localhost:11434/api/tags
```

Build the project:

```shell
mvn clean compile
```
```shell
# First run: fill the vector store, then continue with console interaction
mvn spring-boot:run -Dspring-boot.run.arguments="--fillVectorStore"

# Subsequent runs: start the application with console interaction
mvn spring-boot:run
```

Once the application has started, you can interact with it via the console:

```
Ask a question: Who are the main characters of the novel?
[The AI responds based on the content of the document...]

Ask a question: exit
```
- "What is the historical context of the novel?"
- "Describe the main character"
- "What are the main themes of the work?"
- "Summarize the first chapter"
To analyze another PDF document, edit the application.yml file:
```yaml
rag:
  system-prompt: "You are an expert on the provided document and you answer questions based on the information given"
  document-path: "classpath:your-document.pdf"
```

To use a different chat model, adjust the Ollama chat options:

```yaml
spring:
  ai:
    ollama:
      chat:
        options:
          model: llama3.2:3b    # Lower-memory model
          # model: llama3.1:8b  # Default model
          temperature: 0.1
```

Ensure the database connection settings in application.yml match your PostgreSQL setup:
```yaml
spring:
  datasource:
    url: jdbc:postgresql://localhost:5432/vector_db
    username: postgres_user
    password: postgres_password
```

Architecture:

```
┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│ Document PDF  │────▶│ Text Splitter │────▶│  Embeddings   │
└───────────────┘     └───────────────┘     └───────────────┘
                                                    │
                                                    ▼
┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│ User Question │────▶│  Chat Client  │◀───▶│  PGVector DB  │
└───────────────┘     └───────────────┘     └───────────────┘
                              │
                              ▼
                      ┌───────────────┐
                      │  Ollama LLM   │
                      └───────────────┘
```
```
src/
├── main/
│   ├── java/com/zenika/demo/rag/
│   │   └── RagDemoApplication.java  # Application entry point
│   ├── resources/
│   │   ├── application.yml          # Configuration
│   │   └── *.pdf                    # Source documents
│   └── docker/
│       └── compose.yml              # Docker Compose for PostgreSQL with PGVector
```
- Document Ingestion: Reading and splitting PDFs into chunks
- Embeddings Generation: Converting text into vectors
- Vector Storage: Saving into PGVector
- Semantic Search: Retrieving relevant context
- Answer Generation: Using an LLM to respond
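Using Spring AI, the five steps above can be sketched roughly as follows. This is an outline under assumptions, not the project's actual code: it assumes the Spring AI 1.0 API with auto-configured `ChatModel` and `VectorStore` beans, and class or package names may differ slightly between releases.

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.document.Document;
import org.springframework.ai.reader.pdf.PagePdfDocumentReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.VectorStore;

import java.util.List;

public class RagPipelineSketch {

    // Steps 1-3: read the PDF, split it into chunks, embed and store the chunks.
    // Embeddings are generated by the configured embedding model (here nomic-embed-text).
    static void fillVectorStore(VectorStore vectorStore) {
        List<Document> pages = new PagePdfDocumentReader("classpath:your-document.pdf").read();
        List<Document> chunks = new TokenTextSplitter().apply(pages);
        vectorStore.add(chunks);
    }

    // Steps 4-5: the advisor retrieves relevant chunks from PGVector and
    // prepends them to the prompt before the LLM generates the answer.
    static String ask(ChatModel chatModel, VectorStore vectorStore, String question) {
        ChatClient chatClient = ChatClient.builder(chatModel)
                .defaultAdvisors(QuestionAnswerAdvisor.builder(vectorStore).build())
                .build();
        return chatClient.prompt().user(question).call().content();
    }
}
```

Running this sketch requires the Ollama and PostgreSQL services from the setup section, so it is meant as a reading aid rather than a standalone program.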
- Ollama service not running

  ```shell
  # Check if Ollama is running
  ollama serve
  ```

- Database connection issues

  ```shell
  # Restart PostgreSQL with PGVector
  docker-compose down && docker-compose up -d
  ```

- Models not found

  ```shell
  # Download required models
  ollama pull llama3.1:8b
  ollama pull nomic-embed-text
  ```

- Insufficient memory for the model
  - Use a smaller model (`llama3.2:3b`)
  - Increase the JVM memory (e.g. `-Xmx4g`)
```shell
# Run the application
mvn spring-boot:run

# Check PostgreSQL logs
docker-compose logs pgvector

# List Ollama models
ollama list
```

This project is an educational demonstration. Contributions are welcome to:
- Add new document types
- Improve system prompts
- Optimize performance
- Add tests