utils
CLI utility functions for Safe Synthesizer.
This module provides utility functions for CLI commands including:
- Logging initialization
- Dataset loading
- Configuration merging
- Result saving
Functions:
| Name | Description |
|---|---|
| common_setup | Common setup for all run commands using unified CLISettings. |
| merge_overrides | Merge overrides into a SafeSynthesizerParameters object. |
Attributes:
| Name | Type | Description |
|---|---|---|
| CLI_NESTED_FIELD_SEPARATOR | | Separator used to denote nested fields in CLI options, e.g., --data__holdout=0.1 |
| VERBOSITY_TO_LOG_LEVEL | dict[int, Literal['INFO', 'DEBUG', 'DEBUG_DEPENDENCIES']] | Mapping from CLI verbosity level to log level. |
CLI_NESTED_FIELD_SEPARATOR = '__'
module-attribute
Separator used to denote nested fields in CLI options. e.g., --data__holdout=0.1
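As an illustration of the nesting convention, the sketch below expands flat option names such as `data__holdout` into nested dictionaries. The helper name `nest_overrides` is hypothetical and is not part of this module; only the separator value and the `--data__holdout=0.1` example come from the documentation above.

```python
from typing import Any

CLI_NESTED_FIELD_SEPARATOR = "__"


def nest_overrides(flat: dict[str, Any]) -> dict[str, Any]:
    """Expand keys such as 'data__holdout' into nested dictionaries.

    Hypothetical helper for illustration; the module's real parsing may differ.
    """
    nested: dict[str, Any] = {}
    for key, value in flat.items():
        # Everything before the last separator names a parent section.
        *parents, leaf = key.split(CLI_NESTED_FIELD_SEPARATOR)
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


print(nest_overrides({"data__holdout": 0.1}))  # {'data': {'holdout': 0.1}}
```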
VERBOSITY_TO_LOG_LEVEL = {0: 'INFO', 1: 'DEBUG', 2: 'DEBUG_DEPENDENCIES'}
module-attribute
Mapping from CLI verbosity level to log level.
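A minimal sketch of how a verbosity count (for example, the number of `-v` flags) could be turned into a log level using this mapping. The function `log_level_for` is hypothetical, and clamping out-of-range values is an assumption; the module may handle them differently.

```python
VERBOSITY_TO_LOG_LEVEL = {0: "INFO", 1: "DEBUG", 2: "DEBUG_DEPENDENCIES"}


def log_level_for(verbosity: int) -> str:
    """Look up the log level for a verbosity count.

    Assumption: values outside the mapping are clamped to the nearest key.
    """
    highest = max(VERBOSITY_TO_LOG_LEVEL)
    clamped = max(0, min(verbosity, highest))
    return VERBOSITY_TO_LOG_LEVEL[clamped]


print(log_level_for(1))  # DEBUG
```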
common_setup(settings, resume=False, phase=None, auto_discover_adapter=False, wandb_resume_job_id=None)
Common setup for all run commands using unified CLISettings.
The setup order is:
1. Create Workdir (establishes artifact paths)
2. Initialize logging (using workdir.log_file)
3. Create DatasetRegistry from settings.dataset_registry if present, otherwise create an empty registry
4. Load dataset from registry if settings.data_source is a known name, otherwise from data_source
5. Load config with overrides from dataset overrides and command line overrides
6. Initialize wandb
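The registry fallback in steps 3 and 4 can be sketched as a simple lookup. This is only an illustration: a plain dict stands in for DatasetRegistry, and the function name, the dataset name `adult`, and both paths are hypothetical.

```python
def resolve_data_source(data_source: str, registry: dict[str, str]) -> str:
    """Return the registered location for a known name, else the value itself.

    Hypothetical stand-in for the registry lookup; not the module's API.
    """
    return registry.get(data_source, data_source)


# An empty dict plays the role of the empty registry created in step 3.
registry = {"adult": "datasets/adult.csv"}  # hypothetical entry

print(resolve_data_source("adult", registry))        # datasets/adult.csv
print(resolve_data_source("local/my.csv", registry)) # local/my.csv
```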
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| settings | 'CLISettings' | Unified CLI settings (includes all config from env vars and CLI args) | required |
| resume | bool | If True, attempt to resume from an existing workdir | False |
| phase | str \| None | The current phase (train, generate, end_to_end) | None |
| auto_discover_adapter | bool | If True and resume=True, auto-discover the latest trained adapter | False |
| wandb_resume_job_id | str \| None | Optional wandb run ID, or path to a file containing the ID, to resume | None |
Returns:
| Type | Description |
|---|---|
| 'CategoryLogger', SafeSynthesizerParameters | Tuple of (logger, config, dataframe, workdir). For generate-only runs with cached datasets, dataframe may be None (loaded from cached files by SafeSynthesizer). |
Source code in src/nemo_safe_synthesizer/cli/utils.py
merge_overrides(config_path, overrides)
Merge overrides into a SafeSynthesizerParameters object.
If config_path is None, use the overrides to create a new SafeSynthesizerParameters object. Otherwise, merge the overrides into the config file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config_path | str \| Path \| None | Path to config file (YAML) | required |
| overrides | dict | Dictionary of override values | required |
Returns:
| Type | Description |
|---|---|
| SafeSynthesizerParameters | Merged SafeSynthesizerParameters |