Dek: NVIDIA says Nemotron 3 Super is a 120B MoE hybrid with 12B active parameters, a 1‑million‑token context window, and open weights aimed at agentic workflows.
NVIDIA announced and released Nemotron 3 Super on March 11, 2026, positioning it as an open‑weight model designed for agentic AI and long‑context reasoning. The company describes the model as a 120‑billion‑parameter Mixture‑of‑Experts (MoE) system with 12 billion parameters active at inference and a native 1‑million‑token context window. NVIDIA says the weights, datasets, and training recipes are open, with access via NVIDIA Build and Hugging Face.
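If the weights land on Hugging Face as described, loading them should look like any other transformers checkpoint. The sketch below is an illustration under stated assumptions: the repo ID nvidia/Nemotron-3-Super is hypothetical (check NVIDIA's Hugging Face organization for the published name), and hybrid architectures often require trust_remote_code until native library support ships.

```python
# Minimal sketch: pulling an open-weight checkpoint from Hugging Face with
# transformers. The repo ID is a hypothetical placeholder, not confirmed.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "nvidia/Nemotron-3-Super"  # assumption for illustration only

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",      # let the checkpoint pick its native precision
    device_map="auto",       # shard across available GPUs via accelerate
    trust_remote_code=True,  # hybrid Mamba-Transformer stacks often ship custom code
)

inputs = tokenizer("Summarize the key specs of this model:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```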
A hybrid architecture built for long‑context agents
NVIDIA’s release emphasizes architecture choices aimed at throughput and efficiency:
- 120B total parameters with 12B active at inference, using an MoE design; a toy routing sketch follows this list. (NVIDIA Blog; NVIDIA Developer Blog)
- Hybrid Mamba‑Transformer architecture, plus MoE/LatentMoE techniques to boost throughput. (NVIDIA Developer Blog)
- 1‑million‑token context window for long‑horizon reasoning and multi‑agent workflows. (NVIDIA Blog)
- Open weights, datasets, and recipes, positioning the release as an open model rather than a closed API. (NVIDIA Developer Blog)
- Availability via NVIDIA Build and Hugging Face, giving developers multiple access points. (NVIDIA Blog; VentureBeat)
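To make the 12B‑active‑of‑120B‑total figure concrete, here is a generic top‑k sparse‑MoE routing sketch in NumPy. The expert count and top_k below are made‑up illustration values, not NVIDIA's routing code: the point is only that a router selects a small subset of experts per token, so most of the total parameter count sits idle on any given forward pass.

```python
import numpy as np

rng = np.random.default_rng(0)

n_experts, top_k = 10, 1          # 1 of 10 active ~= 12B of 120B (toy numbers)
d_model = 16
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router_w = rng.standard_normal((d_model, n_experts))

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token vector to its top-k experts and mix their outputs."""
    logits = x @ router_w
    top = np.argsort(logits)[-top_k:]            # indices of the chosen experts
    exp = np.exp(logits[top] - logits[top].max())
    gates = exp / exp.sum()                      # softmax over the top-k only
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

token = rng.standard_normal(d_model)
y = moe_layer(token)
print(f"output shape: {y.shape}; experts used per token: {top_k}/{n_experts}")
```

This sparsity is why MoE models can advertise dense‑model quality at a fraction of the per‑token compute, though every expert's weights still have to live somewhere in memory.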
These choices reflect NVIDIA’s broader push beyond GPUs into open‑model ecosystems, especially for enterprise teams building agentic systems that need long memory and high throughput. For context on agentic products moving into real workflows, see Tencent Launches WorkBuddy, a Full‑Scenario AI Agent Aimed at Execution. Public‑sector deployments are also emerging, such as Shenzhen’s Futian District Puts “Government Lobster” AI Agents Into Live Public‑Service Workflows.
Value line: By shipping a long‑context open‑weight model, NVIDIA is signaling that the battle for enterprise AI stacks will be fought on open‑model availability and tooling, not just hardware.
What’s still unclear
The release leaves open several questions that will shape real‑world adoption:
- Independent benchmarking and comparative performance data remain limited outside NVIDIA’s own reporting.
- Licensing details under the Nemotron Open Model License could influence commercial usage.
- Inference cost and hardware requirements for a 120B MoE hybrid are not yet clearly framed for typical enterprise budgets (see the rough estimate after this list).
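One reason the cost question matters: even with only 12B parameters active per token, an MoE generally needs all 120B parameters resident to serve arbitrary requests. The back‑of‑envelope figures below are our own weights‑only arithmetic (KV cache, activations, and runtime overhead excluded), not NVIDIA's published requirements.

```python
# Weights-only memory estimate for a 120B-parameter checkpoint at common
# serving precisions. Rough arithmetic for illustration, not vendor numbers.
TOTAL_PARAMS = 120e9

for label, bytes_per_param in [("BF16", 2.0), ("FP8", 1.0), ("INT4", 0.5)]:
    gb = TOTAL_PARAMS * bytes_per_param / 1e9
    print(f"{label:>4}: ~{gb:,.0f} GB just to hold the weights")
# -> BF16 ~240 GB, FP8 ~120 GB, INT4 ~60 GB, before any cache or overhead
```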
Uncertainty prompt: Will third‑party benchmarks and licensing terms make Nemotron 3 Super a default choice for enterprise agents, or will adoption stay narrow to NVIDIA‑optimized stacks?
Bottom line
Nemotron 3 Super is a clear attempt to make NVIDIA a first‑class player in open‑weight models, not just the hardware backbone. The headline specs—120B total parameters, 12B active, and a 1‑million‑token context window—are designed to attract teams building agentic systems that need scale and long memory. The next test is whether real‑world benchmarks, licensing, and deployment costs match the promise. For more ecosystem context, see MagicLab Raises RMB 500M, Plans Embodied‑AI Fund.
More in AI Signals coverage.
Sources
- NVIDIA Blog: https://blogs.nvidia.com/blog/nemotron-3-super-agentic-ai/
- NVIDIA Developer Blog: https://developer.nvidia.com/blog/introducing-nemotron-3-super-an-open-hybrid-mamba-transformer-moe-for-agentic-reasoning/
- VentureBeat: https://venturebeat.com/technology/nvidias-new-open-weights-nemotron-3-super-combines-three-different