NVIDIA NeMo Framework

NVIDIA NeMo Framework is an end-to-end training framework for large language models (LLMs), multimodal models, and speech models, designed to run on NVIDIA-accelerated infrastructure. It enables seamless scaling of both pre-training and post-training workloads, from a single GPU to thousand-node clusters, for Hugging Face, Megatron, and PyTorch models.

This site hosts developer updates, tutorials, and insights about NeMo's latest core components and innovations.

Latest Blog Posts

Guide to Fine-tune Nvidia NeMo models with Granary Data

August 13, 2025

NeMo-RL: Journey of Optimizing Weight Transfer in Large MoE Models by 10x

August 12, 2025

🚀 NeMo Framework Now Supports Google Gemma 3n: Efficient Multimodal Fine-tuning Made Simple

August 11, 2025


View all blog posts →

NeMo Framework Components

  • 🚀 NeMo-RL


    A scalable toolkit for efficient reinforcement learning and post-training of models. It includes algorithms such as DPO and GRPO and supports everything from single-GPU prototypes to thousand-GPU deployments (a conceptual DPO sketch follows the component list below).

    🚀 GitHub Repository

    📖 Documentation

  • 🚀 NeMo-Automodel


    Day-0 support for any Hugging Face model, leveraging native PyTorch functionality while providing performance- and memory-optimized training and inference recipes.

    🚀 GitHub Repository

    📖 Documentation
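
To make the post-training algorithms mentioned above concrete, here is a minimal sketch of the Direct Preference Optimization (DPO) objective in plain PyTorch. It illustrates the loss that a DPO recipe optimizes in general; it is not NeMo-RL's implementation, and the function name and arguments are illustrative only.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Illustrative helper, not a NeMo-RL API. Each argument is a 1-D tensor of
    # per-example sequence log-probabilities: policy_* from the model being
    # trained, ref_* from a frozen reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between preferred and rejected completions.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```

In practice, the per-sequence log-probabilities come from a forward pass of the policy and reference models over paired (chosen, rejected) completions; NeMo-RL packages this loop, along with GRPO and other recipes, behind configuration-driven training scripts documented in its repository.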

License

Apache 2.0 licensed with third-party attributions documented in each repository.