NVIDIA NeMo Framework

NVIDIA NeMo Framework is an end-to-end training framework for large language models (LLMs), multimodal models, and speech models, designed to run on NVIDIA-accelerated infrastructure. It enables seamless scaling of both pre-training and post-training workloads, from a single GPU to thousand-node clusters, for Hugging Face, Megatron, and PyTorch models.

This site hosts developer updates, tutorials, and insights about NeMo's latest core components and innovations.

Latest Blog Posts

Guide to Fine-tune Nvidia NeMo models with Granary Data

August 13, 2025

NeMo-RL: Journey of Optimizing Weight Transfer in Large MoE Models by 10x

August 12, 2025

🚀 NeMo Framework Now Supports Google Gemma 3n: Efficient Multimodal Fine-tuning Made Simple

August 11, 2025


View all blog posts →

NeMo Framework Components

  • 🚀 NeMo-RL


    A scalable toolkit for efficient reinforcement learning and post-training of models. It includes algorithms such as DPO and GRPO and supports everything from single-GPU prototypes to thousand-GPU deployments (a conceptual DPO sketch follows the component list below).

    🚀 GitHub Repository

    📖 Documentation

  • 🚀 NeMo-Automodel


    Day-0 support for any Hugging Face model, leveraging native PyTorch functionality while providing performance- and memory-optimized training and inference recipes.

    🚀 GitHub Repository

    📖 Documentation
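
To make the post-training algorithms mentioned above concrete, here is a minimal sketch of the Direct Preference Optimization (DPO) objective in plain PyTorch. It illustrates the loss that a DPO recipe optimizes in general; it is not NeMo-RL's implementation, and the function name and arguments are illustrative only.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Illustrative helper, not a NeMo-RL API. Each argument is a 1-D tensor of
    # per-example sequence log-probabilities: policy_* from the model being
    # trained, ref_* from a frozen reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between preferred and rejected completions.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```

In practice, the per-sequence log-probabilities come from a forward pass of the policy and reference models over paired (chosen, rejected) completions; NeMo-RL packages this loop, along with GRPO and other recipes, behind configuration-driven training scripts documented in its repository.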

License

Apache 2.0 licensed with third-party attributions documented in each repository.