Large Language Model
A neural network, typically with billions of parameters, trained on massive text corpora to understand and generate human language.
Why it matters
LLMs are the core technology behind modern AI assistants, copilots, and automation. Understanding their capabilities and limitations is essential for building reliable AI products.
What defines "large"
The term originally distinguished models with billions of parameters (GPT-3 at 175B) from smaller predecessors. Today the boundary is blurry — models like Mistral 7B punch well above their weight class. What matters more than raw size is the quality and diversity of training data, architecture choices, and post-training alignment.
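A useful back-of-envelope rule for decoder-only transformers is that non-embedding parameters scale as roughly 12 · layers · hidden_size², since each layer carries about 4·d² of attention weights plus 8·d² of MLP weights (with the usual 4× expansion). A minimal sketch, using GPT-3's published shape (96 layers, hidden size 12288) as the worked example:

```python
def approx_params(n_layers: int, d_model: int) -> int:
    """Rough non-embedding parameter count for a decoder-only
    transformer: ~4*d^2 attention weights plus ~8*d^2 MLP weights
    per layer (assuming a 4x MLP expansion), i.e. ~12*d^2 each."""
    return 12 * n_layers * d_model ** 2

# GPT-3's published shape: 96 layers, hidden size 12288
print(f"{approx_params(96, 12288) / 1e9:.0f}B")  # prints "174B", close to the quoted 175B
```

The estimate ignores embeddings and layer norms, which is why it lands slightly under the headline 175B figure.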
Foundation vs. fine-tuned
A foundation model (or base model) is the raw pretrained LLM. Most production use cases rely on instruction-tuned variants that have been further trained to follow directions, and RLHF-aligned versions that are safer and more helpful. Fine-tuning on domain-specific data is a separate step that specializes the model for particular tasks.
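The practical difference shows up in how you prompt: a base model simply continues text, while an instruction-tuned model expects its conversation to be rendered through a chat template with special delimiter tokens. A minimal sketch of such a formatter; the `<|role|>` tags here are illustrative assumptions, not any specific model's format (each model defines its own in its tokenizer config):

```python
def to_chat_prompt(messages: list[dict]) -> str:
    """Render chat messages into one prompt string using a
    hypothetical template; real instruction-tuned models each
    define their own special tokens for roles and turn ends."""
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}\n<|end|>")
    parts.append("<|assistant|>\n")  # trailing cue: the model's turn
    return "\n".join(parts)

prompt = to_chat_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this document."},
])
```

Sending a raw question without the template to an instruction-tuned model (or vice versa) is a common source of degraded output.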
Key capabilities
- Text generation, summarization, and translation
- Code generation and debugging
- Reasoning and analysis (especially with chain-of-thought prompting)
- Tool use and function calling
- Multimodal understanding (vision, audio) in newer models
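Tool use in the list above typically works by having the model emit a structured call, often JSON, which application code validates and dispatches before returning the result to the model. A minimal sketch, where the tool registry and the `{"name": ..., "arguments": {...}}` call shape are assumptions for illustration:

```python
import json

# Hypothetical tool registry: tool name -> callable
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
}

def dispatch(model_output: str) -> str:
    """Parse a JSON tool call emitted by the model and execute it.
    Assumed call shape: {"name": ..., "arguments": {...}}."""
    call = json.loads(model_output)
    fn = TOOLS.get(call["name"])
    if fn is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return fn(**call["arguments"])

result = dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}')
# result == "Sunny in Paris"
```

Keeping dispatch in application code, rather than letting the model execute anything directly, is what makes tool use auditable: every call can be validated against a schema before it runs.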