pradhanhitesh/RAGifyResearch


RAGifyResearch

RAGifyResearch is a simple, fully local Retrieval-Augmented Generation (RAG) system designed to help interns in our research lab learn about ongoing research. It lets users extract text from research papers, abstracts, and other documents and query them from the command line.

Features

✅ Supports PDFs as a knowledge source
✅ Extracts and chunks text
✅ Stores data in ChromaDB for retrieval
✅ Enables local chatbot interaction
✅ Uses LM Studio with Meta Llama 3.1 8B-Instruct
✅ Uses Jina Embeddings v2 to embed text chunks for vector retrieval
✅ Uses a Discord bot for user-friendly interaction
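
To illustrate the "extracts and chunks text" feature, here is a minimal sketch of fixed-size chunking with overlap. It is not taken from this repository: the function name `chunk_text` and the `chunk_size` and `overlap` defaults are assumptions chosen for illustration, and the project may split text differently (for example by sentence or by token count).

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into character chunks of at most chunk_size.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a chunk boundary still appears intact in one of the two chunks.
    Hypothetical helper; names and defaults are illustrative only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Overlap trades a little extra storage for better recall: a passage that straddles a boundary can still be retrieved as a whole.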

Planned Features

🚀 Support for additional models

Installation

  1. Clone the Repository

    git clone https://github.com/pradhanhitesh/RAGifyResearch.git
    cd RAGifyResearch  
  2. Create a Virtual Environment

    python3.8 -m venv venv
    source venv/bin/activate 
  3. Install Dependencies

    pip install -r requirements.txt  
  4. Run the System: First Time

    python main.py --create-db path/to/pdf/dir db_name
  5. Run the System: Any Time

    python main.py --load-db db_name
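
The two invocations above suggest that `main.py` accepts either `--create-db` with two values or `--load-db` with one. The following is a rough sketch of how such a command line could be parsed with `argparse`. It is an assumption about the interface, not the repository's actual code; `build_parser` and the metavar names are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the two commands shown in the README.

    Hypothetical sketch: the real main.py may define its flags differently.
    """
    parser = argparse.ArgumentParser(prog="main.py")
    # Exactly one of the two modes must be chosen.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument(
        "--create-db",
        nargs=2,
        metavar=("PDF_DIR", "DB_NAME"),
        help="build a new database from the PDFs in PDF_DIR",
    )
    mode.add_argument(
        "--load-db",
        metavar="DB_NAME",
        help="open an existing database by name",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    if args.create_db:
        pdf_dir, db_name = args.create_db
        print(f"Would create '{db_name}' from PDFs in {pdf_dir}")
    else:
        print(f"Would load '{args.load_db}'")
```

Making the group required and mutually exclusive means a user cannot accidentally pass both flags or neither.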

Usage

  1. Add research papers (PDFs) to the directory you will pass to --create-db.
  2. On the first run, the system extracts, chunks, and stores the text in ChromaDB.
  3. Use the CLI to query the documents and get relevant information.
  4. To exit the chat, type exit.
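
The query-and-exit flow in the steps above can be sketched as a small read-answer loop. This is only an illustration under stated assumptions: `chat_loop` and its `answer` callback are hypothetical names, and in the real system the callback would retrieve relevant chunks from ChromaDB and ask the local model, which is not shown here.

```python
def chat_loop(answer, read=input, write=print):
    """Answer questions until the user types 'exit'.

    `answer` is a hypothetical callback mapping a question string to a
    reply string. `read` and `write` are injectable so the loop can be
    exercised without a terminal. Returns the number of questions answered.
    """
    turns = 0
    while True:
        try:
            question = read("You: ").strip()
        except EOFError:
            break  # end of input (e.g. Ctrl-D) also ends the chat
        if question.lower() == "exit":
            break
        if not question:
            continue  # ignore empty lines
        write(answer(question))
        turns += 1
    return turns


if __name__ == "__main__":
    # Placeholder answer function standing in for retrieval + generation.
    chat_loop(lambda q: f"(no model attached) You asked: {q}")
```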

Contributing

Contributions are welcome! Feel free to submit a pull request or open an issue.

About

RAGifyResearch is a simple, local Retrieval-Augmented Generation (RAG) system that processes research article PDFs, extracts and stores their text in ChromaDB, and enables chatbot interaction using LM Studio with Meta Llama 3.1 8B-Instruct and Jina Embeddings v2.
