Environment Variables¶
All environment variables that affect Safe Synthesizer behavior. For runtime errors and OOM issues, see Program Runtime. For output quality and evaluation metrics, see Synthetic Data Quality.
Synthesis parameters (training.learning_rate, generation.num_records, etc.)
are set via YAML, CLI flags, or the Python SDK -- not environment variables.
Environment variables control infrastructure: where artifacts go, how models
are cached, and which network endpoints are used.
NSS Variables¶
| Variable | CLI flag | Purpose |
|---|---|---|
| `NSS_CONFIG` | `--config` | Path to YAML config file |
| `NSS_ARTIFACTS_PATH` | `--artifact-path` | Default artifact path |
| `NSS_LOG_FORMAT` | `--log-format` | Log format (`json` or `plain`) |
| `NSS_LOG_FILE` | `--log-file` | Log file path |
| `NSS_LOG_COLOR` | `--log-color` / `--no-log-color` | Colorize console output (auto-detected from TTY) |
| `NSS_LOG_LEVEL` | -- | Log level: `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`, or `DEBUG_DEPENDENCIES` |
| `NSS_DATASET_REGISTRY` | `--dataset-registry` | Dataset registry YAML path/URL |
| `NSS_WANDB_MODE` | `--wandb-mode` | WandB mode (alias for `WANDB_MODE`) |
| `NSS_WANDB_PROJECT` | `--wandb-project` | WandB project name (alias for `WANDB_PROJECT`) |
| `NSS_INFERENCE_ENDPOINT` | -- | LLM endpoint for PII column classification (default: `https://integrate.api.nvidia.com/v1`) |
| `NSS_INFERENCE_KEY` | -- | API key for `NSS_INFERENCE_ENDPOINT`; required for column classification in both CLI and SDK |
| `NIM_MODEL_ID` | -- | Column classification model ID |
| `LOCAL_FILES_ONLY` | -- | Set to `true` for offline mode (Unsloth, GLiNER) |
| `SAFE_SYNTHESIZER_CPU_COUNT` | -- | NER CPU processes |
Third-Party Variables¶
| Variable | Read by | Purpose |
|---|---|---|
| `HF_HOME` | Hugging Face Hub | Cache directory for model downloads |
| `HF_HUB_OFFLINE` | Hugging Face Hub | Set to `1` to error instead of downloading |
| `VLLM_ATTENTION_BACKEND` | vLLM | Override attention backend |
| `VLLM_CACHE_ROOT` | vLLM | vLLM internal cache directory (defaults to `~/.cache/vllm`) |
| `WANDB_MODE` | WandB | Mode (`online`, `offline`, `disabled`) |
| `WANDB_PROJECT` | WandB | Project name |
| `WANDB_API_KEY` | WandB | API key for authentication |
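For a quick local run without experiment tracking, the WandB variables can be set directly. A minimal sketch (disabling WandB is one common choice, not a requirement; the project name is hypothetical):

```shell
# Either track runs under a named project...
export WANDB_PROJECT=safe-synth-experiments   # hypothetical project name
# ...or disable tracking entirely for local runs
export WANDB_MODE=disabled
```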
Precedence¶
Infrastructure settings (artifact path, logging, WandB) resolve in this order:

1. CLI flags (`--artifact-path`, `--log-format`, etc.)
2. Environment variables (`NSS_ARTIFACTS_PATH`, `NSS_LOG_FORMAT`, etc.)
3. Built-in defaults
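The precedence can be sketched in plain POSIX shell using `${var:-fallback}` expansions; this is an illustration of the documented order, not Safe Synthesizer's actual implementation:

```shell
# Illustrative only: ${var:-fallback} picks the first value that is set.
export NSS_LOG_FORMAT=json        # environment variable
LOG_FORMAT_FLAG=""                # would be populated from --log-format
EFFECTIVE_FORMAT=${LOG_FORMAT_FLAG:-${NSS_LOG_FORMAT:-plain}}
echo "$EFFECTIVE_FORMAT"          # json: env var beats the built-in default
```

If `--log-format plain` were passed, `LOG_FORMAT_FLAG` would be non-empty and would win over the environment variable.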
Hugging Face Cache¶
All model and tokenizer downloads go through Hugging Face Hub. The following variables control where downloads are stored and whether the network is used. For a step-by-step offline setup guide, see Running in Offline Environments.
HF_HOME¶
Sets the root cache directory for all Hugging Face downloads -- model weights, tokenizers, compiled attention kernels, and the SentenceTransformer used for evaluation.
Pre-Caching Models¶
To avoid runtime downloads, run the pipeline once in an environment with internet access, then copy or mount the populated cache in your target environment:
```shell
export HF_HOME=/shared/cache/huggingface
safe-synthesizer run --config config.yaml --data-source data.csv
```
What gets downloaded on first use:

- Model weights, config, and tokenizer (all backends, via HF Hub)
- Compiled attention kernels when `training.attn_implementation` starts with `kernels-community/`
- GLiNER NER model (PII replacement)
- `distiluse-base-multilingual-cased-v2` (evaluation semantic similarity)
- vLLM base model (generation)
Silent downloads on first use
All downloads happen silently on first use. If the first run is in an environment without internet access, connection errors will appear at whichever pipeline stage tries to download first.
HF_HUB_OFFLINE¶
When set, prevents all Hugging Face Hub network requests. Any attempt to access a model that is not already cached raises an error immediately.
Prefer this over LOCAL_FILES_ONLY for the most reliable offline experience --
see the warning under LOCAL_FILES_ONLY below.
LOCAL_FILES_ONLY¶
Skips network downloads for the Unsloth backend and GLiNER. Not respected by the HuggingFace training backend or vLLM.
Partial offline support
LOCAL_FILES_ONLY is not consistently supported across all backends.
Set HF_HUB_OFFLINE=1 combined with a pre-populated HF_HOME cache
for the most reliable offline experience.
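A minimal offline setup combining the two variables above (the cache path is an example; it must point at a cache populated during an earlier online run):

```shell
# Cache populated during an earlier online run (example path)
export HF_HOME=/shared/cache/huggingface
# Fail fast with a clear error instead of attempting any Hub download
export HF_HUB_OFFLINE=1
```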
VLLM_CACHE_ROOT¶
Sets the vLLM model cache directory.
Attention and Compute¶
GPU attention backend selection for the vLLM generation engine.
VLLM_ATTENTION_BACKEND¶
Controls the attention implementation used by the vLLM generation engine.
Safe Synthesizer sets this automatically when generation.attention_backend
is configured. Leave it unset unless you have a specific reason to override
vLLM's auto-detection.
Common values: FLASHINFER, FLASH_ATTN, TORCH_SDPA, TRITON_ATTN,
FLEX_ATTENTION.
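If you do need to force a backend, a plain export before launching is enough. The backend names are vLLM's own, and availability depends on your GPU and installed kernels:

```shell
# Override vLLM's auto-detected attention backend (only when needed)
export VLLM_ATTENTION_BACKEND=FLASH_ATTN
```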
PII and NER¶
NIM endpoint, API keys, and CPU parallelism for PII detection.
NSS_INFERENCE_ENDPOINT¶
The NIM/OpenAI-compatible endpoint used for PII column classification. Defaults
to https://integrate.api.nvidia.com/v1 when unset. Override for a custom endpoint:
```shell
export NSS_INFERENCE_ENDPOINT="https://your-llm-inference-endpoint"
export NSS_INFERENCE_KEY="your-api-key"  # pragma: allowlist secret
```
Whether you use the CLI or the SDK, column classification requires `NSS_INFERENCE_KEY`; set `NSS_INFERENCE_ENDPOINT` only if you are not using the default URL.
To disable column classification entirely instead of pointing it at a local endpoint, use the `replace_pii.globals.classify.enable_classify` config option. This option is deeply nested, so set it via YAML or the SDK rather than an environment variable.
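For example, disabling classification in YAML nests like this (a sketch derived from the option path above):

```yaml
replace_pii:
  globals:
    classify:
      enable_classify: false
```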
NSS_INFERENCE_KEY¶
API key for the NSS inference endpoint. Required for PII column classification from both the CLI and the SDK, whether you use the default or a custom `NSS_INFERENCE_ENDPOINT`.
NIM_MODEL_ID¶
Model ID sent to the NIM endpoint for PII column classification. Defaults to
qwen/qwen2.5-coder-32b-instruct.
SAFE_SYNTHESIZER_CPU_COUNT¶
Controls the number of CPU worker processes used for NER (PII replacement).
Defaults to max(1, cpu_count - 1) (one CPU left free), further capped so
there are at least 1,000 records per worker.
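The default can be sketched in shell arithmetic. This is an illustration of the documented formula, not the actual implementation; `RECORDS` stands in for the dataset size:

```shell
RECORDS=5000                      # hypothetical dataset size
CPUS=$(nproc)
WORKERS=$(( CPUS - 1 ))           # leave one CPU free
if [ "$WORKERS" -lt 1 ]; then WORKERS=1; fi
CAP=$(( RECORDS / 1000 ))         # keep at least 1,000 records per worker
if [ "$CAP" -lt 1 ]; then CAP=1; fi
if [ "$WORKERS" -gt "$CAP" ]; then WORKERS=$CAP; fi
export SAFE_SYNTHESIZER_CPU_COUNT=$WORKERS
```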
Container Usage¶
When running Safe Synthesizer in a Docker container, these variables are particularly important:
| Variable | Recommended Value | Why |
|---|---|---|
| `HF_HOME` | `/workspace/.hf_cache` | Point at a bind-mounted host directory so model downloads persist across container runs |
| `HF_HUB_OFFLINE` | `1` | Set in air-gapped environments after pre-caching models |
| `VLLM_CACHE_ROOT` | `/workspace/.vllm_cache` | Persist vLLM's internal cache if needed |
| `NSS_ARTIFACTS_PATH` | `/workspace/artifacts` | Write artifacts to a mounted volume |
| `NSS_LOG_FORMAT` | `json` | Structured logs for log aggregators; auto-detected in non-TTY containers |
| `NVIDIA_VISIBLE_DEVICES` | `0` or `all` | Select GPUs (set by `--gpus` flag, but can be overridden) |
Example:
```shell
docker run --gpus all --shm-size=1g \
  -v $(pwd):/workspace \
  -v ~/.cache/huggingface:/workspace/.hf_cache \
  -e HF_HOME=/workspace/.hf_cache \
  -e NSS_ARTIFACTS_PATH=/workspace/artifacts \
  nss-gpu:latest run --config /workspace/config.yaml --data-source /workspace/data.csv
```
See Docker for full container setup and Makefile shortcuts.
- Running Safe Synthesizer -- pipeline execution, CLI commands, and artifacts
- Configuration Reference -- parameter tables
- Program Runtime -- runtime errors and OOM fixes