Models & Platforms

Fine-tuning

The process of further training a pre-trained model on a smaller, task-specific dataset to improve its performance on that particular task or domain.

Why it matters

Fine-tuning lets you customize foundation models for your exact use case. The rise of parameter-efficient fine-tuning methods like LoRA (Low-Rank Adaptation) has made it accessible to teams without massive GPU budgets.

When to fine-tune

Fine-tuning makes sense when you need to teach a model a specific style, format, or domain vocabulary that prompt engineering alone can't capture. Common use cases include custom tone-of-voice, specialized classification, and structured output formatting.
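Training data for these use cases is typically a set of example conversations. A minimal sketch of the chat-style JSONL format many fine-tuning APIs accept (one JSON object per line; the exact field names vary by provider, and this sample dataset is hypothetical):

```python
import json

# A tiny hypothetical dataset for a custom tone-of-voice task.
# Each line is one complete training example: a system instruction,
# a user prompt, and the assistant response you want the model to learn.
examples = [
    {"messages": [
        {"role": "system", "content": "Answer in a formal, concise tone."},
        {"role": "user", "content": "Is the store open on Sundays?"},
        {"role": "assistant", "content": "Yes. Sunday hours are 10:00-16:00."},
    ]},
]

# Write one JSON object per line (the "JSONL" convention).
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Each line parses back independently, which makes streaming easy.
with open("train.jsonl") as f:
    rows = [json.loads(line) for line in f]
print(len(rows))
```

In practice you would want hundreds to thousands of such examples, all consistently demonstrating the target style or format.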

Fine-tuning vs. RAG

A common misconception is that fine-tuning teaches a model new facts. In practice, RAG is better for knowledge (it can be updated without retraining) while fine-tuning is better for behavior (teaching the model how to respond, not what to say).

Techniques

  • Full fine-tuning — updates all model weights. Most thorough, but also the most expensive in compute and memory.
  • LoRA / QLoRA — freezes the base weights and trains only small low-rank adapter matrices; QLoRA additionally quantizes the frozen base model (typically to 4-bit). Much cheaper, and often recovers most of full fine-tuning's quality.
  • RLHF — Reinforcement Learning from Human Feedback; fine-tunes against a reward model trained on human preference data to align model behavior.
  • DPO — Direct Preference Optimization; trains directly on preference pairs, a simpler alternative to RLHF that needs no separate reward model.
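The core of LoRA is small enough to sketch directly: the frozen weight W is augmented with a low-rank update B·A, so only A and B are trained. A toy NumPy illustration (dimensions shrunk for readability; real model layers are far larger, so the trainable fraction is far smaller):

```python
import numpy as np

# LoRA sketch for a single d x d linear layer (toy size).
d, r, alpha = 512, 8, 16

rng = np.random.default_rng(0)
W = rng.standard_normal((d, d)) * 0.02  # frozen pre-trained weight
A = rng.standard_normal((r, d)) * 0.02  # trainable low-rank factor (r x d)
B = np.zeros((d, r))                    # trainable factor, zero-initialized

def forward(x):
    # Effective weight is W + (alpha / r) * (B @ A). Because B starts at
    # zero, the adapted layer initially matches the base layer exactly.
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

full_params = d * d            # what full fine-tuning would train
lora_params = A.size + B.size  # what LoRA trains
print(full_params, lora_params, lora_params / full_params)
# At this toy size LoRA trains ~3% of the layer's parameters; at real
# hidden sizes (e.g. d = 4096) the fraction drops well below 1%.
```

Only A and B receive gradient updates during training; at inference time the product (alpha / r) · B·A can be merged into W, so the fine-tuned model runs with no extra latency.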