Ollama RAG
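An “Ollama RAG app” is a web service that combines Ollama with a vector database to provide Retrieval-Augmented Generation (RAG): relevant document chunks are retrieved from the database and added to the prompt before the model answers. Here we set up a local LLM instance with Ollama and use ChromaDB as the vector store.
Rough notes from the initial test, based mostly on material from HackerNoon.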
# Check that we have Video Card Support
lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 530 (rev 06)
01:00.0 VGA compatible controller: NVIDIA Corporation GM107GLM [Quadro M1000M] (rev a2)
# Verify that the Quadro M1000M supports CUDA compute capability 5.0 or higher at https://developer.nvidia.com/cuda-gpus
# install ollama as per https://github.com/ollama/ollama/blob/main/README.md#quickstart
curl -fsSL https://ollama.com/install.sh | sh
ollama run llama3.2
>>> what is your knowledge cutoff?
My knowledge cutoff is currently December 2023. This means that I have information up to that date, but I may not be aware of events, updates, or developments that have occurred after that time.
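The same question can also be sent from a script instead of the interactive prompt. The following is a minimal sketch, assuming the Ollama server is listening on its default port 11434 and that `llama3.2` has already been pulled; the helper names `build_generate_payload` and `ask_ollama` are made up for these notes.

```python
import json
import urllib.request

# Assumption: Ollama is reachable on its default local port.
OLLAMA_URL = "http://localhost:11434"


def build_generate_payload(prompt: str, model: str = "llama3.2") -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    # stream=False asks for a single JSON object instead of a stream of chunks.
    return {"model": model, "prompt": prompt, "stream": False}


def ask_ollama(prompt: str, model: str = "llama3.2", base_url: str = OLLAMA_URL) -> str:
    """Send a prompt to a running Ollama server and return the generated text."""
    body = json.dumps(build_generate_payload(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        # The generated text is returned in the "response" field.
        return json.loads(response.read())["response"]


# Usage (needs a running Ollama server):
#   print(ask_ollama("What is your knowledge cutoff?"))
```

Only the standard library is used here so the sketch runs before any of the Python dependencies are installed.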
Install ChromaDB and connect it to Ollama.
# install python deps
pip install --quiet chromadb
pip install --quiet unstructured langchain langchain-text-splitters
pip install --quiet "unstructured[all-docs]"
pip install --quiet flask
# Install the text embedding model
ollama pull nomic-embed-text
# Is Ollama running? The curl-based install script most likely registers it as a service
curl localhost:11434
Ollama is running
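The same check can be done from Python, for example before the app starts handling requests. This is a rough sketch, assuming the default port shown above; `is_ollama_running` is a hypothetical helper name, not part of any Ollama library.

```python
import http.client
import urllib.request


def is_ollama_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if the server root answers with the 'Ollama is running' banner."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            text = response.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, and HTTP errors all mean "not usable".
        return False
    return "Ollama is running" in text


# Usage:
#   if not is_ollama_running():
#       raise SystemExit("Start Ollama first")
```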
# Add a Markdown document about the holiday schedule via the app's /embed endpoint (the Flask app is assumed to be listening on port 8080)
curl --request POST \
--url http://localhost:8080/embed \
--header 'Content-Type: multipart/form-data' \
--form file=@/fall_schedule.md
{
"message": "File embedded successfully"
}
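These notes do not show what the `/embed` endpoint does internally. A plausible outline is: split the uploaded file into chunks, embed each chunk with `nomic-embed-text`, and store the vectors. The sketch below illustrates the first two steps under stated assumptions: the character-based chunker is a simple stand-in for the langchain text splitter installed earlier, the chunk sizes are arbitrary, and the function names are hypothetical. A real app would write the resulting pairs to a ChromaDB collection rather than return them.

```python
import json
import urllib.request


def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list:
    """Split text into fixed-size character chunks that overlap slightly."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed_text(text: str, model: str = "nomic-embed-text",
               base_url: str = "http://localhost:11434") -> list:
    """Ask a running Ollama server for the embedding vector of one text."""
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())["embedding"]


def embed_document(text: str, embed=embed_text) -> list:
    """Return (chunk, vector) pairs; the embed function can be swapped out."""
    return [(chunk, embed(chunk)) for chunk in split_into_chunks(text)]
```

Passing the embedding function as a parameter keeps the chunking logic testable without a running Ollama server.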
# Ask it a question about an event
curl --request POST \
--url http://localhost:8080/query \
--header 'Content-Type: application/json' \
--data '{ "query": "When is fall break?" }'
{
"message": "Fall break occurs from October 9-12."
}
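The `/query` endpoint presumably retrieves the stored chunks closest to the question and passes them to the model as context. The following is a minimal sketch of that retrieval-and-augmentation step, assuming the chunks and their vectors are already available as `(text, vector)` pairs. The cosine-similarity search is a plain-Python stand-in for the vector search ChromaDB would perform, and the prompt wording and function names are illustrative only.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunks(query_vector: list, store: list, k: int = 3) -> list:
    """Return the k chunk texts whose vectors are most similar to the query.

    store is a list of (chunk_text, vector) pairs.
    """
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:k]]


def build_rag_prompt(question: str, chunks: list) -> str:
    """Combine the retrieved chunks and the question into one prompt."""
    context = "\n\n".join(chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The resulting prompt would then be sent to `llama3.2`, which is how the answer can mention dates that only appear in the uploaded schedule.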