Your team needs to build a system where an LLM answers questions about your company's internal documentation, which is updated weekly. A colleague proposes fine-tuning the model on the docs. What are the tradeoffs between fine-tuning and RAG here, and what would you recommend?
tldr
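Fine-tuning changes the model's weights. It is reliable for format, style, and behavior, but unreliable for dynamic or large knowledge bases, because facts don't have dedicated "slots" in weight space: they are spread across many parameters, so a single fact cannot be cleanly overwritten when the docs change. RAG injects retrieved documents into the prompt at query time, which makes the knowledge updatable, citable, and accurate for large corpora. For weekly-updating docs, RAG wins on update cost alone: you re-index the changed pages instead of retraining every week. A real production trap with fine-tuning is that a model fine-tuned on facts can hallucinate more confidently than the base model, stating stale or half-learned details in the fluent style of your docs. Recommendation: use RAG for the knowledge, and add fine-tuning only if you need a specific format or tone. Production systems often combine both.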
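To make the update-cost point concrete, here is a minimal sketch of the RAG side, assuming a toy in-memory index with bag-of-words cosine scoring in place of a real embedding model and vector store. The names `DocIndex`, `upsert`, `search`, and `build_prompt`, along with the sample policy texts, are hypothetical and only for illustration; the resulting prompt would be sent to whatever LLM you use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class DocIndex:
    """Toy index (hypothetical); a stand-in for a real vector store."""

    def __init__(self):
        self.docs = {}  # doc_id -> (text, term-count vector)

    def upsert(self, doc_id, text):
        # Weekly update path: re-index the changed document, no retraining.
        self.docs[doc_id] = (text, Counter(tokenize(text)))

    def search(self, query, k=2):
        # Score every document against the query and return the top k.
        q = Counter(tokenize(query))
        scored = [
            (cosine(q, vec), doc_id, text)
            for doc_id, (text, vec) in self.docs.items()
        ]
        scored = [hit for hit in scored if hit[0] > 0]
        scored.sort(key=lambda hit: (-hit[0], hit[1]))
        return scored[:k]


def build_prompt(question, hits):
    """Put retrieved passages, tagged with ids for citation, before the question."""
    context = "\n".join(f"[{doc_id}] {text}" for _, doc_id, text in hits)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"{context}\n\nQuestion: {question}"
    )


index = DocIndex()
index.upsert("vacation-policy", "Employees get 20 vacation days per year.")
index.upsert("expense-policy", "Expense reports are due within 30 days.")

# The policy changes: overwrite one entry; the model itself is untouched.
index.upsert("vacation-policy", "Employees get 25 vacation days per year.")

question = "How many vacation days do employees get?"
hits = index.search(question)
prompt = build_prompt(question, hits)
print(prompt)
```

The design point is that the update is a single `upsert` call, and the document ids in the prompt give the model something to cite. A fine-tuned model would need another training run to pick up the same change, with no guarantee the old value is forgotten.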
follow-up
- What is HyDE (Hypothetical Document Embeddings) and when does it improve retrieval quality?
- How would you evaluate whether a RAG system is grounded vs hallucinating from model weights?
- How does LoRA fine-tuning change the cost calculus for updating a model on new domain data?