NVIDIA Nemotron 3

NVIDIA's entry into the open model race with a purpose-built agentic AI family is worth watching: early adopters include Cursor, Perplexity, and Figure AI.

LLM·Open-source·Agentic

build.nvidia.com

Our Take

What It Is

NVIDIA Nemotron 3 is an open-weight model family purpose-built for agentic AI. The architecture is a mixture-of-experts (MoE) design with 120 billion total parameters, of which only 12 billion are active for any given token, so per-token compute is closer to that of a 12B dense model even though all 120B weights must still be stored. The family targets three domains: robotics (with Agility Robotics, Figure AI, and Skild AI as early adopters), enterprise workflows (ServiceNow, Uber), and developer tooling (Cursor, Perplexity). The models achieved 85.6% on PinchBench and are expected to reach general availability in the first half of 2026.
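To make the 120B / 12B split concrete, the sketch below works through the arithmetic. The parameter counts come from the text above; the bytes-per-weight figures are generic assumptions about common storage formats, not published Nemotron 3 configurations, and the estimate covers weights only (no KV cache or activations).

```python
# Parameter counts quoted in this entry.
TOTAL_PARAMS = 120e9
ACTIVE_PARAMS = 12e9

# Assumed bytes per weight for common storage formats (illustrative only;
# not a statement about which formats Nemotron 3 ships in).
BYTES_PER_PARAM = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}


def active_fraction(total: float, active: float) -> float:
    """Share of the weights that take part in computing one token."""
    return active / total


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Weight storage in GB (1 GB = 1e9 bytes); excludes KV cache and activations."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"active per token: {frac:.0%} of all weights")
    for name, nbytes in BYTES_PER_PARAM.items():
        full = weight_memory_gb(TOTAL_PARAMS, nbytes)
        print(f"{name}: about {full:.0f} GB to hold all experts")
```

The point of the exercise: compute per token scales with the active 12B, but memory scales with the full 120B, so the efficiency gain is mainly in throughput and cost per token rather than in the hardware needed to load the model.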

Why It Matters

Nemotron 3 enters at Watch because it's pre-GA, but NVIDIA's entry into the open model space deserves attention. The thesis is different from Meta's Llama or Google's Gemma: Nemotron is designed from the ground up for agentic workloads, meaning tool use, planning, and multi-step execution. The mixture-of-experts architecture (12B active out of 120B total) is an efficiency play aimed at deployment at the edge and in enterprise environments.
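For readers unfamiliar with why a mixture-of-experts model only uses part of its weights, here is a minimal sketch of generic top-k expert routing. It is a toy under stated assumptions: the expert count, the value of k, the scalar "experts", and the router scores are all made up for illustration, and nothing here is confirmed as Nemotron 3's actual routing scheme.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts; renormalise their weights to sum to 1."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return [(i, probs[i] / total) for i in top]


def moe_layer(x, experts, router_logits, k):
    """Weighted sum over the selected experts only; unselected experts never run."""
    chosen = route_top_k(router_logits, k)
    output = sum(weight * experts[i](x) for i, weight in chosen)
    return output, [i for i, _ in chosen]


if __name__ == "__main__":
    # Hypothetical setup: 8 tiny "experts", each just scaling its input.
    experts = [lambda x, scale=scale: scale * x for scale in range(1, 9)]
    # Hypothetical router scores for one token.
    logits = [0.1, 2.3, -0.5, 1.7, 0.0, -1.2, 0.4, 0.9]
    output, used = moe_layer(1.0, experts, logits, k=2)
    print(f"experts used: {used} of {len(experts)}; output: {output:.3f}")
```

In a real model each expert is a feed-forward block and the router is learned, but the cost structure is the same: every expert's weights must be available, while only the selected few are evaluated per token.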

The early adopter list is the interesting signal. Cursor and Perplexity adding Nemotron alongside their existing model options suggests it offers something useful for agentic coding and research workflows. NVIDIA's dominant position in the GPU market also means it can optimise the models for its own hardware in ways other model providers can't match.

Key Developments

Mar 2026: NVIDIA announces Nemotron 3 family at GTC 2026. Open weights, 120B params / 12B active (MoE).
Feb 2026: Early adopter announcements: Cursor, Perplexity, ServiceNow, Uber, Agility Robotics, Figure AI.
Feb 2026: 85.6% on PinchBench. Expected GA first half 2026.

What to Watch

GA release timing and benchmark performance against established open models (Llama 4, DeepSeek) will determine whether this is a meaningful player or a GPU-company side project. Watch for NVIDIA-specific optimisations — if Nemotron runs significantly faster on NVIDIA hardware than competing open models, it creates a hardware-model bundle that's compelling for enterprises already committed to NVIDIA GPUs. The robotics angle (Agility, Figure AI) is a differentiated use case worth tracking separately.

Strengths

Agentic design: Purpose-built for tool use, planning, and multi-step execution — not a general model adapted for agents.
Efficiency architecture: Only 12B of the 120B total parameters (10%) are active per token, which targets frontier-adjacent capability at practical inference costs.
Hardware optimisation: NVIDIA can optimise for their own GPUs in ways third-party model providers cannot match.
Open weights: Full open-weight release enables fine-tuning, self-hosting, and inspection of model behaviour.

Considerations

Pre-GA status: Not yet generally available. Early benchmarks are promising but production readiness is unproven.
NVIDIA ecosystem affinity: Likely optimised for NVIDIA hardware. Performance on AMD or other accelerators may not match.
Crowded space: Competing with established open models (Llama 4, DeepSeek R1) that have larger communities and more production evidence.
Enterprise focus: The robotics and enterprise targeting may limit community contribution compared to more general-purpose open models.

Resources

Articles

NVIDIA AI Blog (blogs.nvidia.com)

NVIDIA's announcements and technical details on the Nemotron family.

Documentation

Nemotron 3 on NVIDIA Build (build.nvidia.com)

Model access, API documentation, and deployment guides.

Repositories

Nemotron on Hugging Face (huggingface.co)

Open-weight model downloads and community contributions.
