RAGifyResearch is a simple, fully local Retrieval-Augmented Generation (RAG) system designed to help interns in our research lab learn about ongoing research. It lets users extract text from research papers, abstracts, and other documents and query it from a command-line interface (CLI).
✅ Supports PDFs as a knowledge source
✅ Extracts and chunks text
✅ Stores data in ChromaDB for retrieval
✅ Enables local chatbot interaction
✅ Uses LM Studio with Meta Llama 3.1 8B-Instruct
✅ Uses Jina Embeddings v2 to embed text for vector search
✅ Provides a Discord bot for user-friendly interaction
🚀 Support for additional models
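As an illustration of the "extracts and chunks text" step, here is a minimal sketch of fixed-size chunking with overlap. This README does not say how RAGifyResearch actually splits text, so the function name `chunk_text`, the character-based windows, and the default sizes are all assumptions made for this example.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into character windows that overlap by `overlap` characters.

    Hypothetical helper: the real project may chunk by tokens, sentences,
    or pages instead.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Overlapping windows are a common choice because a sentence cut at a chunk boundary still appears whole in one of the neighbouring chunks.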
-
Clone the Repository
git clone https://github.com/yourusername/RAGifyResearch.git
cd RAGifyResearch
-
Create virtual environment
python3.8 -m venv venv
source venv/bin/activate
-
Install Dependencies
pip install -r requirements.txt
-
Run the System: First Time
python main.py --create-db path/to/pdf/dir db_name
-
Run the System: Any Time
python main.py --load-db db_name
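For readers curious how the two invocations above could be parsed, here is a minimal sketch using Python's standard `argparse` module. The contents of `main.py` are not shown in this README, so `build_parser`, `describe`, and the metavar names are hypothetical. Only the flags `--create-db` and `--load-db` and their argument order come from the commands above.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build a parser accepting the two documented invocations (assumed layout)."""
    parser = argparse.ArgumentParser(prog="main.py")
    # Exactly one of the two modes must be given.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument(
        "--create-db",
        nargs=2,
        metavar=("PDF_DIR", "DB_NAME"),
        help="build a new database from the PDFs in PDF_DIR",
    )
    mode.add_argument(
        "--load-db",
        metavar="DB_NAME",
        help="open an existing database by name",
    )
    return parser


def describe(argv: List[str]) -> str:
    """Return a short description of what the given arguments request."""
    args = build_parser().parse_args(argv)
    if args.create_db:
        pdf_dir, db_name = args.create_db
        return f"create {db_name} from {pdf_dir}"
    return f"load {args.load_db}"
```

With this layout, `describe(["--load-db", "db_name"])` reports a load request, and passing both flags at once is rejected by the mutually exclusive group.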
- Add research papers (PDFs) to the designated folder.
- The system will extract, chunk, and store the text in ChromaDB.
- Use the CLI to query the documents and get relevant information.
- To exit the chat, type exit.
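The query step in the list above amounts to ranking stored chunks by how similar their embeddings are to the embedding of the question. In RAGifyResearch that search is handled by ChromaDB. The sketch below swaps in plain-Python cosine similarity purely to show the idea, and the names `cosine_similarity` and `top_k`, as well as the toy two-dimensional vectors, are assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    chunks: Sequence[str],
    k: int = 3,
) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    scored = [
        (cosine_similarity(query_vec, vec), chunk)
        for vec, chunk in zip(chunk_vecs, chunks)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy example: real embeddings have hundreds of dimensions.
chunks = ["chunk about optics", "chunk about genetics", "chunk about lasers"]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
best = top_k([1.0, 0.0], vectors, chunks, k=2)
```

The top-ranked chunks are what a RAG system typically passes to the language model as context for answering the question.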
Contributions are welcome! Feel free to submit a pull request or open an issue.