Glossary
RAG (retrieval-augmented generation)
RAG (retrieval-augmented generation) means the AI first looks things up in a trusted source (your documents, a database, the web) and then writes its answer from what it found. It reduces hallucinations and keeps answers current.
An LLM on its own answers from memory: everything it absorbed during training, frozen at some past date. RAG adds an open-book step: before answering, the system retrieves the relevant pages from a chosen source and hands them to the model along with your question.
Picture a pharmacist. Asked about a rare interaction, a good one doesn’t recite from memory. They pull up the official database, read, then answer. Same competence, but grounded in a source that’s current and checkable.
For you this means two things. Answers can cite documents the model never saw in training (your contracts, today’s news, a product manual). And hallucinations drop sharply, though they don’t vanish: the model can still misread what it retrieved, so source links remain your friend.
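For the curious, the open-book step can be sketched in a few lines of Python. This is a toy illustration, not how any particular product works: the three documents, the file names, and the functions `retrieve` and `build_prompt` are all made up for the example, and the word-overlap scoring stands in for the more capable search that real systems typically use. The sketch stops at building the prompt; sending it to a model is left out.

```python
import re

# Hypothetical mini knowledge base: (source name, text) pairs.
DOCUMENTS = [
    ("return-policy.txt", "You can return an item within 30 days if you have the receipt."),
    ("shipping.txt", "Standard shipping takes 3 to 5 business days."),
    ("warranty.txt", "The warranty covers manufacturing defects for one year."),
]

def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Return the top_k documents sharing the most words with the question."""
    question_words = tokenize(question)
    scored = [
        (len(question_words & tokenize(text)), name, text)
        for name, text in documents
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    # Drop documents that share no words at all with the question.
    return [(name, text) for score, name, text in scored[:top_k] if score > 0]

def build_prompt(question, passages):
    """Hand the retrieved passages to the model along with the question."""
    sources = "\n".join(f"[{name}] {text}" for name, text in passages)
    return (
        "Answer using only the sources below and cite the source name.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )

question = "How long do I have to return an item?"
passages = retrieve(question, DOCUMENTS)
prompt = build_prompt(question, passages)
print(prompt)
```

The model never sees the whole library, only the passages that were retrieved, each labeled with its source name. That label is what makes a source link possible, and it is also why a poor retrieval step leads to a poor answer.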
Where you’ll meet this
Every time an assistant says “searching the web…” before answering, that’s RAG. Other examples: ChatGPT and Claude projects that answer from uploaded files, Copilot answering from your company’s SharePoint, NotebookLM working over your sources, and customer-service bots that actually quote the real return policy.