smalik21/doc-analyzer

AI Document Analyzer

An AI-powered multi-space document analysis platform where users can upload documents, ask questions about them, and receive context-aware answers with source citations.

The system uses Retrieval-Augmented Generation (RAG) with vector search to provide answers strictly from uploaded documents.


Features

Spaces (Chat Workspaces)

  • Create multiple independent spaces
  • Rename spaces
  • Delete spaces
  • Track last activity per space
  • Strict context isolation between spaces

Document Management

  • Upload PDF documents
  • Multiple documents per space
  • Delete documents
  • Processing status tracking:
    • processing
    • ready
    • failed
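
The three processing statuses can be modeled as a small state machine. The following is a minimal sketch, not the project's actual code: the `DocumentStatus` and `transition` names are hypothetical, and the transition rules (a document starts in `processing` and ends in either `ready` or `failed`) are an assumption inferred from the status names.

```python
from enum import Enum


class DocumentStatus(str, Enum):
    """Status values listed in the README."""
    PROCESSING = "processing"
    READY = "ready"
    FAILED = "failed"


# Assumed rules: "processing" may move to "ready" or "failed";
# "ready" and "failed" are treated as terminal states.
ALLOWED_TRANSITIONS = {
    DocumentStatus.PROCESSING: {DocumentStatus.READY, DocumentStatus.FAILED},
    DocumentStatus.READY: set(),
    DocumentStatus.FAILED: set(),
}


def transition(current: DocumentStatus, new: DocumentStatus) -> DocumentStatus:
    """Return the new status, or raise ValueError if the move is not allowed."""
    if new not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new
```

Keeping the allowed moves in one table means a retry policy (for example, `failed` back to `processing`) could be added by editing a single entry.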

AI Chat

  • Ask questions about uploaded documents
  • Context-aware answers
  • Conversation history support
  • Source citations included in responses
  • Snippet + page references for retrieved chunks
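
One way to produce answers with snippet and page references is to number the retrieved chunks in the prompt and return the same numbering as citation metadata. The sketch below illustrates that idea only; `RetrievedChunk`, `build_prompt`, `build_citations`, the prompt wording, and the citation field names are all hypothetical and not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class RetrievedChunk:
    """A chunk returned by retrieval (hypothetical shape)."""
    document_name: str
    page: int
    text: str


def make_snippet(text: str, max_chars: int = 80) -> str:
    """Collapse whitespace and truncate the chunk text for display."""
    flat = " ".join(text.split())
    return flat if len(flat) <= max_chars else flat[: max_chars - 3] + "..."


def build_prompt(question: str, chunks: list[RetrievedChunk]) -> str:
    """Number each chunk so the model can cite sources as [1], [2], ..."""
    context = "\n".join(
        f"[{i}] ({c.document_name}, p. {c.page}) {c.text}"
        for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer using only the context below and cite sources like [1].\n"
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def build_citations(chunks: list[RetrievedChunk]) -> list[dict]:
    """Return citation metadata using the same numbering as the prompt."""
    return [
        {
            "index": i,
            "document": c.document_name,
            "page": c.page,
            "snippet": make_snippet(c.text),
        }
        for i, c in enumerate(chunks, start=1)
    ]
```

Because the prompt and the citation list share one numbering, a `[2]` in the model's answer maps directly to the second citation entry.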

Vector Search

  • Automatic document chunking
  • Embedding generation
  • Semantic retrieval using Qdrant
  • Per-space retrieval isolation
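
The chunking and per-space retrieval steps can be sketched in plain Python. This is an in-memory stand-in for illustration, not the project's Qdrant integration: the chunk size, overlap, the `space_id` field name, and the function names are assumptions, and embeddings are passed in as ready-made vectors rather than generated.

```python
import math


def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start : start + size])
        if start + size >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(points: list[dict], query_vector: list[float],
           space_id: str, top_k: int = 3) -> list[dict]:
    """Filter to one space first, then rank the survivors by similarity."""
    candidates = [p for p in points if p["space_id"] == space_id]
    ranked = sorted(
        candidates,
        key=lambda p: cosine(p["vector"], query_vector),
        reverse=True,
    )
    return ranked[:top_k]
```

Filtering before ranking is what keeps one space's chunks out of another space's results. In Qdrant the same effect is typically achieved with a payload filter on the search request rather than a Python list comprehension.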

Architecture

Frontend (React)
        ↓
Backend Service (Node.js + Express + PostgreSQL)
        ↓
AI Service (FastAPI)
        ↓
Qdrant Vector Database

Tech Stack

Frontend

  • React
  • TypeScript
  • Vite
  • LESS

Backend

  • Node.js
  • Express
  • PostgreSQL
  • TypeScript

AI Service

  • FastAPI
  • Sentence Transformers
  • Google Gemini API / Ollama (configurable)
  • LangChain-style RAG flow

Infrastructure

  • Docker
  • Docker Compose
  • Qdrant

Current Capabilities

  • Multi-space document analysis
  • Context-aware AI chat
  • PDF ingestion
  • Vector search
  • Source citations
  • Space/document CRUD operations
  • Shared storage between services
  • Dockerized development environment

Project Structure

root/
├── backend/
├── ai-service/
├── frontend/
└── docker-compose.yml

Local Development

Requirements

  • Docker
  • Docker Compose

Start the Project

docker compose up --build

Services

Service      Port
Frontend     5173
Backend      3000
AI Service   8000
Qdrant       6333
PostgreSQL   5432

Environment Variables

Create a .env file for the backend and another for the AI service.

Example:

GEMINI_API_KEY=
LLM_PROVIDER=gemini
GEMINI_MODEL=gemma-4-31b-it

Do not commit real secrets.
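
A service could read these variables at startup and fail fast on a bad configuration. The sketch below is one possible approach, not the project's actual loader: `LLMSettings` and `load_llm_settings` are hypothetical names, the default of `gemini` when `LLM_PROVIDER` is unset is an assumption, and the rule that the API key is required only for the `gemini` provider is inferred from the variable names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Providers named in the Tech Stack section.
SUPPORTED_PROVIDERS = {"gemini", "ollama"}


@dataclass(frozen=True)
class LLMSettings:
    provider: str
    gemini_api_key: Optional[str]
    gemini_model: Optional[str]


def load_llm_settings(env: Mapping[str, str] = os.environ) -> LLMSettings:
    """Read LLM settings from the environment and validate them."""
    # Assumption: fall back to "gemini" when LLM_PROVIDER is not set.
    provider = env.get("LLM_PROVIDER", "gemini").strip().lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported LLM_PROVIDER: {provider!r}")

    # Treat an empty value (as in the example .env) as missing.
    api_key = env.get("GEMINI_API_KEY") or None
    if provider == "gemini" and not api_key:
        raise ValueError("GEMINI_API_KEY is required when LLM_PROVIDER=gemini")

    return LLMSettings(
        provider=provider,
        gemini_api_key=api_key,
        gemini_model=env.get("GEMINI_MODEL") or None,
    )
```

Passing the environment in as a mapping keeps the function easy to test with a plain dictionary instead of real environment variables.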


API Overview

Spaces

  • Create space
  • Get spaces
  • Update space name
  • Delete space

Documents

  • Upload document
  • Get documents for a space
  • Delete document

Chat

  • Ask questions
  • Retrieve sources
  • Persist conversation history

Notes

This project is currently under active development.

The focus is on building a scalable and production-oriented AI document analysis system with clean architecture and strong separation between application logic and AI processing.


License

MIT
