I wanted to understand how a Retrieval-Augmented Generation (RAG) application works, so I built this chatbot from scratch to answer questions using resources from my knowledge repo.

This is a Node.js + Python + React.js project with two versions:

  • Local version: uses Qdrant as the vector database, Nomic for text embeddings (via vLLM), and a LLaMA-based model running locally with GPT4All.
  • API-based version: uses OpenAI embeddings and queries a GPT model via the OpenAI API.
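Both versions share the same core retrieval flow: embed the question, find the nearest documents in the vector store, and stuff them into the prompt. The sketch below illustrates that flow in plain Python — the bag-of-words "embedding" is a toy stand-in for the real Nomic/OpenAI embeddings, and the in-memory list stands in for Qdrant; the documents and function names are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy embedding: bag-of-words token counts.
    # (Stand-in for the real Nomic or OpenAI embedding call.)
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical knowledge-repo snippets; the real app loads these
# from the repo and stores their embeddings in Qdrant.
docs = [
    "Qdrant is a vector database for similarity search.",
    "React renders the chat interface in the browser.",
    "GPT4All runs LLaMA-based models locally on CPU.",
]
index = [(doc, embed(doc)) for doc in docs]

def retrieve(question, k=2):
    # Rank documents by similarity to the question and keep the top k.
    q = embed(question)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(question):
    # Augment the prompt with the retrieved context before calling the LLM.
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In the real app, `embed` and the final LLM call are swapped per version (Nomic + GPT4All locally, OpenAI API otherwise), while the retrieve-then-prompt shape stays the same.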

Things I’d like to improve (when time permits):

  • Add a GitHub pipeline to automatically update the vector store whenever I update the knowledge repo.
  • Deploy the local version to the cloud (though that might cost me a bit 😂).
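The vector-store update pipeline could look roughly like the GitHub Actions sketch below: re-embed the knowledge repo and upsert into Qdrant on every push. The workflow file location, the `scripts/reindex.py` script, and the `QDRANT_URL` secret are all hypothetical placeholders, not part of the current project.

```yaml
# .github/workflows/update-vector-store.yml (hypothetical sketch)
name: Update vector store
on:
  push:
    branches: [main]
jobs:
  reindex:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      # Hypothetical script: re-embeds changed docs and upserts them into Qdrant.
      - run: python scripts/reindex.py
        env:
          QDRANT_URL: ${{ secrets.QDRANT_URL }}
```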