Ollama RAG

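An “Ollama RAG app” is a web service that uses Ollama and a vector database to provide Retrieval-Augmented Generation (RAG): passages relevant to a question are retrieved from the database and added to the prompt before the model answers. Here we set up a local LLM with Ollama and use ChromaDB as the vector store.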

Rough notes from the initial test, mostly based on a HackerNoon article.

# Check that we have Video Card Support
lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 530 (rev 06)
01:00.0 VGA compatible controller: NVIDIA Corporation GM107GLM [Quadro M1000M] (rev a2)

# Verify that the Quadro M1000M has CUDA compute capability 5.0 or higher at https://developer.nvidia.com/cuda-gpus

# install ollama as per https://github.com/ollama/ollama/blob/main/README.md#quickstart

curl -fsSL https://ollama.com/install.sh | sh

ollama run llama3.2
 
>>> what is your knowledge cutoff?
My knowledge cutoff is currently December 2023. This means that I have information up to that date, but I may not be aware of events, updates, or developments that have occurred after that time.
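
The same question can also be sent from code instead of the interactive prompt. A minimal sketch, assuming Ollama's REST API is listening on its default address `http://localhost:11434` and using the `/api/generate` endpoint with streaming turned off. The helper names `build_generate_request` and `ask` are made up for these notes.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: default local Ollama address


def build_generate_request(prompt, model="llama3.2"):
    """Build the JSON body for Ollama's /api/generate endpoint."""
    # stream=False asks for a single JSON object instead of a stream of chunks.
    return {"model": model, "prompt": prompt, "stream": False}


def ask(prompt, model="llama3.2", base_url=OLLAMA_URL):
    """Send a prompt to a running Ollama server and return the answer text."""
    body = json.dumps(build_generate_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

With the server up and `llama3.2` pulled, `ask("What is your knowledge cutoff?")` should return the same kind of answer as the session above.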

Install ChromaDB and connect it to Ollama.

# install python deps
pip install -q chromadb
pip install -q unstructured langchain langchain-text-splitters
pip install -q "unstructured[all-docs]"
pip install -q flask

# Install the text embedding model
ollama pull nomic-embed-text
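
The embedding model turns text into vectors that the vector DB can compare. A rough sketch of calling it directly, assuming a recent Ollama version that exposes the `/api/embed` endpoint on the default port. The `embed` and `cosine_similarity` helpers are illustrative names, not part of any library.

```python
import json
import math
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: default local Ollama address


def embed(texts, model="nomic-embed-text", base_url=OLLAMA_URL):
    """Return one embedding vector per input string via Ollama's /api/embed."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/embed",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())["embeddings"]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Comparing `embed(["fall break", "autumn holiday"])` with `cosine_similarity` gives a feel for how retrieval ranks chunks, though ChromaDB does that comparison itself.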

# Check that Ollama is running; I suspect the curl install script registers it as a service
curl localhost:11434
Ollama is running
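
An app that depends on Ollama may want to make the same check at startup. A small sketch, assuming the default port and the plain-text banner shown above. `ollama_is_running` is a hypothetical helper, and the `opener` parameter is only there so the function can be tried without a live server.

```python
import urllib.request


def ollama_is_running(base_url="http://localhost:11434", opener=urllib.request.urlopen):
    """Return True if the server answers with the 'Ollama is running' banner."""
    try:
        with opener(base_url, timeout=5) as resp:
            return b"Ollama is running" in resp.read()
    except OSError:
        # urllib's URLError is a subclass of OSError, so refused or
        # timed-out connections end up here.
        return False
```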


# Add a Markdown document about the holiday schedule (assumes the Flask RAG app is already listening on localhost:8080)

curl --request POST \
  --url http://localhost:8080/embed \
  --header 'Content-Type: multipart/form-data' \
  --form file=@/fall_schedule.md
  
{
  "message": "File embedded successfully"
}
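
The app code behind `/embed` is not captured in these notes. As a rough sketch of what such a step usually involves (split the document into chunks, embed each chunk, store everything in the vector DB), assuming a ChromaDB collection and Ollama's `/api/embed` endpoint on the default port. The plain character splitter here is a stand-in for the langchain text splitter installed earlier, and all function names are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: default local Ollama address


def split_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character chunks that overlap slightly."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop early enough that the last chunk is not just a repeat of the overlap.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


def embed(texts, model="nomic-embed-text", base_url=OLLAMA_URL):
    """Return one embedding vector per chunk via Ollama's /api/embed."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/embed",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())["embeddings"]


def embed_document(text, source, collection, embed_fn=embed):
    """Chunk a document, embed each chunk, and add everything to a collection."""
    chunks = split_text(text)
    if not chunks:
        return 0
    collection.add(
        ids=[f"{source}-{i}" for i in range(len(chunks))],
        documents=chunks,
        embeddings=embed_fn(chunks),
        metadatas=[{"source": source} for _ in chunks],
    )
    return len(chunks)
```

With ChromaDB, `collection` could come from `chromadb.PersistentClient(path="chroma").get_or_create_collection("docs")` (the path and collection name are placeholders). A Flask `/embed` route would then read the upload from `request.files["file"]` and pass its text to `embed_document`.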

# Ask it a question about an event

curl --request POST \
  --url http://localhost:8080/query \
  --header 'Content-Type: application/json' \
  --data '{ "query": "When is fall break?" }'
{
  "message": "Fall break occurs from October 9-12."
}
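
The query side is the "retrieval-augmented" part: embed the question, fetch the closest chunks, and put them in the prompt. A minimal sketch of that flow, assuming a ChromaDB collection that was filled with precomputed embeddings. `embed_fn` and `generate_fn` are hypothetical callables standing in for the Ollama embedding and generation calls, and the prompt wording is only an example.

```python
def build_prompt(question, passages):
    """Combine retrieved passages and the user's question into one prompt."""
    context = "\n\n".join(passages)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def answer(question, collection, embed_fn, generate_fn, n_results=3):
    """Retrieve the closest chunks for a question and ask the model to answer."""
    # embed_fn takes a list of strings and returns one vector per string.
    query_vector = embed_fn([question])[0]
    hits = collection.query(query_embeddings=[query_vector], n_results=n_results)
    passages = hits["documents"][0]  # one result list per query vector
    return generate_fn(build_prompt(question, passages))
```

A Flask `/query` route would read the question from `request.get_json()["query"]`, call `answer`, and return the result as the `message` field.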

Last modified February 3, 2026: WIP: snapshot before WireGuard and Elastic Stack link refactor (a75df5e)