Environment Variables¶

All environment variables that affect Safe Synthesizer behavior. For runtime errors and OOM issues, see Program Runtime. For output quality and evaluation metrics, see Synthetic Data Quality.

Synthesis parameters (training.learning_rate, generation.num_records, etc.) are set via YAML, CLI flags, or the Python SDK -- not environment variables. Environment variables control infrastructure: where artifacts go, how models are cached, and which network endpoints are used.

NSS Variables¶

Variable	CLI flag	Purpose
`NSS_CONFIG`	`--config`	Path to YAML config file
`NSS_ARTIFACTS_PATH`	`--artifact-path`	Default artifact path
`NSS_LOG_FORMAT`	`--log-format`	Log format (`json` or `plain`)
`NSS_LOG_FILE`	`--log-file`	Log file path
`NSS_LOG_COLOR`	`--log-color` / `--no-log-color`	Colorize console output (auto-detected from TTY)
`NSS_LOG_LEVEL`	--	Log level: `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`, or `DEBUG_DEPENDENCIES`
`NSS_DATASET_REGISTRY`	`--dataset-registry`	Dataset registry YAML path/URL
`NSS_WANDB_MODE`	`--wandb-mode`	WandB mode (alias for `WANDB_MODE`)
`NSS_WANDB_PROJECT`	`--wandb-project`	WandB project name (alias for `WANDB_PROJECT`)
`NSS_INFERENCE_ENDPOINT`	--	LLM endpoint for PII column classification (default: `https://integrate.api.nvidia.com/v1`)
`NSS_INFERENCE_KEY`	--	API key for the `NSS_INFERENCE_ENDPOINT` is required for column classification in both CLI and SDK.
`NIM_MODEL_ID`	--	Column classification model ID
`LOCAL_FILES_ONLY`	--	Set to `true` for offline mode (Unsloth, GLiNER)
`SAFE_SYNTHESIZER_CPU_COUNT`	--	NER CPU processes

Third-Party Variables¶

Variable	Read by	Purpose
`HF_HOME`	Hugging Face Hub	Cache directory for model downloads
`HF_HUB_OFFLINE`	Hugging Face Hub	Set to `1` to error instead of downloading
`VLLM_ATTENTION_BACKEND`	vLLM	Override attention backend
`VLLM_CACHE_ROOT`	vLLM	vLLM internal cache directory (defaults to `~/.cache/vllm`)
`WANDB_MODE`	WandB	Mode (`online`, `offline`, `disabled`)
`WANDB_PROJECT`	WandB	Project name
`WANDB_API_KEY`	WandB	API key for authentication

Precedence¶

Infrastructure settings (artifact path, logging, WandB):

CLI flags (--artifact-path, --log-format, etc.)
Environment variables (NSS_ARTIFACTS_PATH, NSS_LOG_FORMAT, etc.)
Built-in defaults

Hugging Face Cache¶

All model and tokenizer downloads go through Hugging Face Hub. The following variables control where downloads are stored and whether the network is used. For a step-by-step offline setup guide, see Running in Offline Environments.

`HF_HOME`¶

Sets the root cache directory for all Hugging Face downloads -- model weights, tokenizers, compiled attention kernels, and the SentenceTransformer used for evaluation.

export HF_HOME=/shared/cache/huggingface

Pre-Caching Models¶

To avoid runtime downloads, run the pipeline once in an environment with internet access, then copy or mount the populated cache in your target environment:

export HF_HOME=/shared/cache/huggingface
safe-synthesizer run --config config.yaml --data-source data.csv

What gets downloaded on first use:

Model weights, config, and tokenizer (all backends, via HF Hub)
Compiled attention kernels when training.attn_implementation starts with kernels-community/
GLiNER NER model (PII replacement)
distiluse-base-multilingual-cased-v2 (evaluation semantic similarity)
vLLM base model (generation)

Silent downloads on first use

All downloads happen silently on first use. If the first run is in an environment without internet access, connection errors will appear at whichever pipeline stage tries to download first.

`HF_HUB_OFFLINE`¶

When set, prevents all Hugging Face Hub network requests. Any attempt to access a model that is not already cached raises an error immediately.

export HF_HUB_OFFLINE=1

Prefer this over LOCAL_FILES_ONLY for the most reliable offline experience -- see the warning under LOCAL_FILES_ONLY below.

`LOCAL_FILES_ONLY`¶

Skips network downloads for the Unsloth backend and GLiNER. Not respected by the HuggingFace training backend or vLLM.

export LOCAL_FILES_ONLY=true

Partial offline support

LOCAL_FILES_ONLY is not consistently supported across all backends. Set HF_HUB_OFFLINE=1 combined with a pre-populated HF_HOME cache for the most reliable offline experience.

`VLLM_CACHE_ROOT`¶

Sets the vLLM model cache directory.

export VLLM_CACHE_ROOT=/shared/cache/vllm

Attention and Compute¶

GPU attention backend selection for the vLLM generation engine.

`VLLM_ATTENTION_BACKEND`¶

Controls the attention implementation used by the vLLM generation engine. Safe Synthesizer sets this automatically when generation.attention_backend is configured. Leave it unset unless you have a specific reason to override vLLM's auto-detection.

export VLLM_ATTENTION_BACKEND=FLASH_ATTN

Common values: FLASHINFER, FLASH_ATTN, TORCH_SDPA, TRITON_ATTN, FLEX_ATTENTION.

PII and NER¶

NIM endpoint, API keys, and CPU parallelism for PII detection.

`NSS_INFERENCE_ENDPOINT`¶

The NIM/OpenAI-compatible endpoint used for PII column classification. Defaults to https://integrate.api.nvidia.com/v1 when unset. Override for a custom endpoint:

export NSS_INFERENCE_ENDPOINT="https://your-llm-inference-endpoint"
export NSS_INFERENCE_KEY="your-api-key"  # pragma: allowlist secret

When using the CLI or SDK: for column classification to work, set NSS_INFERENCE_KEY (and NSS_INFERENCE_ENDPOINT only if you are not using the default URL).

To disable column classification entirely instead of pointing it at a local endpoint, use the replace_pii.globals.classify.enable_classify config option. PII classify config is deeply nested -- use YAML or SDK:

Config referenceSDK

replace_pii:
  globals:
    classify:
      enable_classify: false

from nemo_safe_synthesizer.config.replace_pii import PiiReplacerConfig

pii_config = PiiReplacerConfig.get_default_config()
pii_config.globals.classify.enable_classify = False

synthesizer = (
    SafeSynthesizer(config)
    .with_data_source("data.csv")
    .with_replace_pii(config=pii_config)
)

`NSS_INFERENCE_KEY`¶

API key for the NSS inference endpoint. Required for PII column classification when using the CLI and SDK (with the default or custom NSS_INFERENCE_ENDPOINT).

`NIM_MODEL_ID`¶

Model ID sent to the NIM endpoint for PII column classification. Defaults to qwen/qwen2.5-coder-32b-instruct.

`SAFE_SYNTHESIZER_CPU_COUNT`¶

Controls the number of CPU worker processes used for NER (PII replacement).

export SAFE_SYNTHESIZER_CPU_COUNT=4

Defaults to max(1, cpu_count - 1) (one CPU left free), further capped so there are at least 1,000 records per worker.

Container Usage¶

When running Safe Synthesizer in a Docker container, these variables are particularly important:

Variable	Recommended Value	Why
`HF_HOME`	`/workspace/.hf_cache`	Point at a bind-mounted host directory so model downloads persist across container runs
`HF_HUB_OFFLINE`	`1`	Set in air-gapped environments after pre-caching models
`VLLM_CACHE_ROOT`	`/workspace/.vllm_cache`	Persist vLLM's internal cache if needed
`NSS_ARTIFACTS_PATH`	`/workspace/artifacts`	Write artifacts to a mounted volume
`NSS_LOG_FORMAT`	`json`	Structured logs for log aggregators; auto-detected in non-TTY containers
`NVIDIA_VISIBLE_DEVICES`	`0` or `all`	Select GPUs (set by `--gpus` flag, but can be overridden)

Example:

docker run --gpus all --shm-size=1g \
  -v $(pwd):/workspace \
  -v ~/.cache/huggingface:/workspace/.hf_cache \
  -e HF_HOME=/workspace/.hf_cache \
  -e NSS_ARTIFACTS_PATH=/workspace/artifacts \
  nss-gpu:latest run --config /workspace/config.yaml --data-source /workspace/data.csv

See Docker for full container setup and Makefile shortcuts.

Running Safe Synthesizer -- pipeline execution, CLI commands, and artifacts
Configuration Reference -- parameter tables
Program Runtime -- runtime errors and OOM fixes

Environment Variables¶

NSS Variables¶

Third-Party Variables¶

Precedence¶

Hugging Face Cache¶

HF_HOME¶

Pre-Caching Models¶

HF_HUB_OFFLINE¶

LOCAL_FILES_ONLY¶

VLLM_CACHE_ROOT¶