mlprep
ML Breadth · hard · 12 min

Your team needs to build a system where an LLM answers questions about your company's internal documentation, which is updated weekly. A colleague proposes fine-tuning the model on the docs. What are the tradeoffs between fine-tuning and RAG here, and what would you recommend?

tldr

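Fine-tuning changes weights. It is reliable for format, style, and behavior, but unreliable for dynamic or large knowledge bases, because facts don't have dedicated "slots" in weight space and can't be updated or removed individually. RAG injects retrieved documents into the prompt at query time, so the knowledge is updatable, citable, and more accurate for large corpora. For docs that change weekly, RAG wins on update cost alone: re-indexing the changed documents replaces a weekly retraining run. A model fine-tuned on facts can also hallucinate more confidently than the base model, which is a real production trap. Production systems often combine both: fine-tuning for behavior, RAG for knowledge.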
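To make the "injects documents at query time" and "updatable, citable" points concrete, here is a minimal sketch of the retrieval half of a RAG pipeline. Everything in it is an illustrative assumption: `DocIndex`, `upsert`, `search`, and `build_prompt` are hypothetical names, the two documents are made up, and bag-of-words cosine similarity stands in for the embedding model and vector store a real system would likely use. The call to the LLM itself is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class DocIndex:
    """Toy in-memory index (hypothetical); stands in for embeddings plus a vector store."""

    def __init__(self):
        self.docs = {}     # doc_id -> raw text
        self.vectors = {}  # doc_id -> term-count vector

    def upsert(self, doc_id, text):
        """Add or replace one document. No model weights are touched."""
        self.docs[doc_id] = text
        self.vectors[doc_id] = Counter(tokenize(text))

    def search(self, query, k=2):
        """Return up to k doc ids with nonzero similarity, best match first."""
        q = Counter(tokenize(query))
        scored = sorted(
            ((cosine(q, vec), doc_id) for doc_id, vec in self.vectors.items()),
            reverse=True,
        )
        return [doc_id for score, doc_id in scored[:k] if score > 0]


def build_prompt(index, question, k=2):
    """Inject retrieved passages, tagged with their ids so the answer can cite them."""
    hits = index.search(question, k)
    context = "\n".join(f"[{doc_id}] {index.docs[doc_id]}" for doc_id in hits)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"{context}\n\nQuestion: {question}"
    )


index = DocIndex()
index.upsert("vpn-guide", "Connect to the VPN with the client at version 4 before opening the wiki.")
index.upsert("expenses", "Submit expense reports within 30 days of purchase.")

question = "How many days do I have to submit expense reports?"
print(build_prompt(index, question))

# Next week's doc change: re-index only the document that changed.
index.upsert("expenses", "Submit expense reports within 45 days of purchase.")
print(build_prompt(index, question))
```

The second prompt carries the updated policy text after a single `upsert`, with no training step in between. The bracketed ids in the context are what let the answer cite its sources, which a fine-tuned model answering from its weights cannot do.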

follow-up

  • What is HyDE (Hypothetical Document Embeddings) and when does it improve retrieval quality?
  • How would you evaluate whether a RAG system is grounded vs hallucinating from model weights?
  • How does LoRA fine-tuning change the cost calculus for updating a model on new domain data?