NeMo Safe Synthesizer

NeMo Safe Synthesizer creates private, safe versions of sensitive tabular datasets -- entirely synthetic data with no one-to-one mapping to your original records. It fine-tunes an LLM, optionally with differential privacy, to produce high-quality datasets that preserve the statistical properties and utility of your data for downstream AI tasks while protecting sensitive information and supporting privacy compliance.

Key Features

  • Privacy-first synthetic data -- PII detection and replacement, plus optional differential privacy during fine-tuning via Opacus
  • LLM fine-tuning -- LoRA fine-tuning optimized for tabular data, including numeric, categorical, and text columns
  • Fast inference -- vLLM-powered generation with optional structured output enforcement
  • Comprehensive evaluation -- Privacy and quality metrics in an in-depth HTML report
  • Flexible interfaces -- CLI for scripting, Python SDK for programmatic workflows, YAML configuration

System Requirements

NeMo Safe Synthesizer requires a Linux machine with an NVIDIA GPU (A100 80GB+ recommended) and CUDA 12.8+ to run the training and generation pipeline. macOS, Windows, and Apple Silicon are not supported for pipeline execution. A CPU-only install is available for development and configuration validation -- see Getting Started.
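
As a rough pre-flight check against the requirements above, the following sketch tests whether a host runs Linux and reports a CUDA version of at least 12.8. It is not part of NeMo Safe Synthesizer: the function names (`parse_cuda_version`, `check_host`, `detect_cuda_version`) are hypothetical, and it assumes `nvidia-smi` is on the PATH and prints a header containing `CUDA Version: X.Y`. That header reports the highest CUDA version the driver supports, not the installed toolkit, so treat a pass as a necessary condition only. The sketch does not check GPU model or memory.

```python
import platform
import re
import shutil
import subprocess

MIN_CUDA = (12, 8)  # taken from the stated requirement "CUDA 12.8+"


def parse_cuda_version(smi_output):
    """Extract (major, minor) from text such as 'CUDA Version: 12.8', else None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def check_host(system, cuda_version):
    """Return a list of human-readable problems; an empty list means no problem found."""
    problems = []
    if system != "Linux":
        problems.append(f"{system} is not supported for pipeline execution (Linux required)")
    if cuda_version is None:
        problems.append("no NVIDIA driver or CUDA version detected")
    elif cuda_version < MIN_CUDA:
        found = ".".join(map(str, cuda_version))
        problems.append(f"CUDA {found} found, but 12.8 or newer is required")
    return problems


def detect_cuda_version():
    """Run nvidia-smi (assumed to be on PATH) and parse its header, else None."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True, check=False)
    return parse_cuda_version(result.stdout)


if __name__ == "__main__":
    issues = check_host(platform.system(), detect_cuda_version())
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Host passes the basic OS and CUDA checks.")
```

Keeping `check_host` free of side effects means the same logic can be exercised on a CPU-only development machine by passing in the values directly.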

Next Steps

  • Getting Started -- Install the package, set up your environment, and run your first synthetic data pipeline in minutes.
  • Product Overview -- Learn about the pipeline steps: replace PII, synthesize data, evaluate.
  • Tutorials -- Follow hands-on tutorials to generate synthetic data.
  • User Guide -- Configure and run the pipeline via YAML, CLI, SDK, or environment variables.
  • Developer Guide -- Browse the auto-generated API reference and dive into the architecture details.
  • Developer Notes -- Read developer blog posts and check release notes.

License

NeMo Safe Synthesizer is licensed under the Apache License 2.0.