An AI-powered system that allows users to upload documents (PDF, DOCX, TXT, MD) and ask questions using a local LLaMA 3 model via Ollama with Retrieval-Augmented Generation (RAG).
Upload PDF, DOCX, TXT, MD documents
ChatGPT-style conversational interface
100% local LLM (no API required)
Fast semantic search using FAISS
Source tracking for answers
Built a Retrieval-Augmented Generation (RAG) system using a local LLM
Integrated LLaMA 3 via Ollama for offline inference
Designed prompts to reduce hallucination
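One common way to reduce hallucination is to restrict the model to the retrieved context and give it an explicit way to decline. The sketch below illustrates that idea only. `build_prompt` and the prompt wording are hypothetical and are not taken from this project's code.

```python
def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a grounded prompt from retrieved chunks (hypothetical helper)."""
    # Number each chunk so an answer can point back to its source.
    context = "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below.\n"
        "If the context does not contain the answer, reply \"I don't know.\"\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


prompt = build_prompt(
    "What is FAISS used for?",
    ["FAISS is used for fast semantic search."],
)
print(prompt)
```

Numbering the chunks also makes it easier to map an answer back to the documents it came from.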
Semantic search using FAISS
Transformer embeddings for document understanding
Optimized chunking and retrieval
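As an illustration of chunking, here is a minimal sketch of fixed-size character chunks with overlap, so that a sentence cut at a boundary still appears whole in a neighboring chunk. The function name `chunk_text` and the default `size` and `overlap` values are assumptions, not this project's actual settings.

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (illustrative defaults)."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


chunks = chunk_text("abcdefghij", size=4, overlap=1)
print(chunks)
```

Larger chunks carry more context per retrieval hit, while smaller chunks make matches more precise. The right balance depends on the documents and the embedding model.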
FastAPI-based backend
Document ingestion pipeline
Error handling and reliability
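Part of error handling at ingestion is rejecting files the pipeline cannot parse before any work is done. The sketch below checks the extension against the four supported types. `validate_upload` and the 20 MB limit are hypothetical and only show the shape of such a check.

```python
from pathlib import Path

# Extensions listed as supported: PDF, DOCX, TXT, MD.
ALLOWED_SUFFIXES = {".pdf", ".docx", ".txt", ".md"}


def validate_upload(filename: str, size_bytes: int,
                    max_bytes: int = 20 * 1024 * 1024) -> str:
    """Return the normalized suffix, or raise ValueError (hypothetical helper)."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes <= 0:
        raise ValueError("file is empty")
    if size_bytes > max_bytes:  # the limit is an assumed value
        raise ValueError("file is too large")
    return suffix


print(validate_upload("Report.PDF", 1024))
```

In a FastAPI route, a `ValueError` like this would typically be translated into a 4xx response so the frontend can show a clear message.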
Streamlit chat interface
Chat history and file upload
Backend-safe UX
Upload → Parse → Chunk → Embed → Store → Retrieve → Generate
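The Embed, Store and Retrieve stages can be sketched with the standard library alone. This toy version swaps in a bag-of-words vector for the sentence-transformers embedding and a brute-force cosine search for the FAISS index, so it shows the data flow rather than the real components. `embed`, `cosine` and `ToyIndex` are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a transformer embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyIndex:
    """In-memory store with brute-force search; a stand-in for FAISS."""

    def __init__(self) -> None:
        self.items: list[tuple[Counter, str, str]] = []

    def add(self, chunk: str, source: str) -> None:
        # Keep the source next to the vector so answers can cite it.
        self.items.append((embed(chunk), chunk, source))

    def search(self, query: str, k: int = 2) -> list[tuple[float, str, str]]:
        q = embed(query)
        scored = [(cosine(q, vec), chunk, src) for vec, chunk, src in self.items]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:k]


index = ToyIndex()
index.add("FAISS stores embedding vectors for similarity search.", "notes.md")
index.add("Streamlit renders the chat interface.", "ui.txt")
index.add("Ollama serves the LLaMA 3 model locally.", "setup.md")

hits = index.search("Which tool serves the model locally?", k=1)
print(hits[0][2], "->", hits[0][1])
```

The retrieved chunks and their sources would then be passed to the Generate stage as context for the LLM.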
Solved race conditions
Handled local model constraints
Built fault-tolerant flow
Clean structure
GitHub-ready setup
Environment-safe practices
High RAM usage
No persistence
Single user
Multi-doc support
Chat memory
Docker
Cloud fallback
Upload document
Chunk & embed
Store in FAISS
Retrieve relevant chunks
LLM generates answer
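For the final step, Ollama exposes a local HTTP API, by default on port 11434. A minimal sketch of a non-streaming call to `/api/generate` follows, assuming the model was pulled under the tag `llama3` and the server is running on the default port. The helper names `build_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a non-streaming generate request (model tag is an assumption)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama3", timeout: float = 120.0) -> str:
    """Send the prompt to a running Ollama server and return the answer text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


request = build_request("Answer using only the provided context.")
print(request.full_url)
# With the server running: print(generate("Say hello in one sentence."))
```

A generous timeout is useful here because local generation on CPU can take a while, especially for the first request after the model loads.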
Frontend: Streamlit
Backend: FastAPI
LLM: LLaMA 3 (via Ollama)
Embeddings: sentence-transformers
Vector DB: FAISS