Fine-tuning
The process of further training a pre-trained model on a smaller, task-specific dataset to improve its performance on that particular task or domain.
Why it matters
Fine-tuning lets you customize foundation models for your exact use case. The rise of parameter-efficient methods like LoRA has made it accessible to teams without massive GPU budgets.
When to fine-tune
Fine-tuning makes sense when you need to teach a model a specific style, format, or domain vocabulary that prompt engineering alone can't capture. Common use cases include custom tone-of-voice, specialized classification, and structured output formatting.
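As a concrete illustration, supervised fine-tuning data is usually prepared as JSONL, one training example per line. The sketch below uses the widely adopted "messages" chat schema; the exact field names your training stack expects are an assumption, so check its docs.

```python
import json

# Hypothetical examples teaching a concise support-agent tone.
# The "messages" schema below is common for chat-model SFT, but
# field names vary between fine-tuning APIs and trainers.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a concise support agent."},
            {"role": "user", "content": "How do I reset my password?"},
            {"role": "assistant", "content": "Settings > Security > Reset Password."},
        ]
    },
]

# JSONL: serialize one JSON object per line.
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

A few dozen to a few thousand such examples is a typical starting range for style and format tasks, though the right amount depends heavily on the task.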
Fine-tuning vs. RAG
A common misconception is that fine-tuning teaches a model new facts. In practice, RAG is better for knowledge (it can be updated without retraining) while fine-tuning is better for behavior (teaching the model how to respond, not what to say).
Techniques
- Full fine-tuning — updates all model weights. The most flexible option, but expensive in GPU memory and compute.
- LoRA / QLoRA — freezes the base model and trains small low-rank adapter matrices; QLoRA additionally quantizes the frozen weights to cut memory further. Far cheaper, and often approaches full fine-tuning quality.
- RLHF — Reinforcement Learning from Human Feedback; fine-tunes using human preference data to align model behavior.
- DPO — Direct Preference Optimization, a simpler alternative to RLHF that trains directly on preference pairs without a separate reward model.
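The LoRA idea in the list above can be sketched in a few lines of NumPy: the pre-trained weight matrix W stays frozen, and training only updates a low-rank pair B·A whose scaled product is added to W. The names A, B, r, and alpha follow the LoRA paper's conventions; this is an illustrative sketch of the forward pass, not a training loop.

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r, alpha = 64, 64, 8, 16   # r << d_in: the low-rank bottleneck

W = rng.normal(size=(d_out, d_in))      # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-init

def lora_forward(x):
    # Effective weight is W + (alpha / r) * B @ A; in training,
    # only A and B receive gradients while W stays frozen.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
# Because B starts at zero, the adapter is a no-op at initialization,
# so fine-tuning begins exactly from the pre-trained behavior.
assert np.allclose(lora_forward(x), W @ x)

# Trainable parameters drop from d_out * d_in to r * (d_in + d_out):
full_params, lora_params = W.size, A.size + B.size
print(full_params, lora_params)  # 4096 vs 1024
```

The parameter saving is the whole point: here the adapter holds a quarter of the full matrix's parameters, and in real transformers (where d is in the thousands and r stays small) the ratio is far more dramatic.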