utils
CLI utility functions for Safe Synthesizer.
This module provides utility functions for CLI commands including:
- Logging initialization
- Dataset loading
- Configuration merging
- Result saving
Functions:
| Name | Description |
|---|---|
| common_setup | Common setup for all run commands using unified CLISettings. |
| merge_overrides | Merge overrides into a SafeSynthesizerParameters object. |
Attributes:
| Name | Type | Description |
|---|---|---|
| CLI_NESTED_FIELD_SEPARATOR | | Separator used to denote nested fields in CLI options. |
| VERBOSITY_TO_LOG_LEVEL | dict[int, Literal['INFO', 'DEBUG', 'DEBUG_DEPENDENCIES']] | Mapping from CLI verbosity level to log level. |
CLI_NESTED_FIELD_SEPARATOR = '__'
module-attribute
Separator used to denote nested fields in CLI options.
This must match the field_separator passed to pydantic_options and the
field_sep used by parse_overrides; otherwise a Click option such as
--data__holdout=0.1 will not become {"data": {"holdout": 0.1}}.
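The effect of the separator can be illustrated with a minimal sketch. The helper `nest_overrides` below is hypothetical and is not the module's `parse_overrides`; it only shows how splitting option names on `__` yields the nested dictionary described above, and what happens when the separator does not match.

```python
from typing import Any

SEPARATOR = "__"  # mirrors the documented value of CLI_NESTED_FIELD_SEPARATOR


def nest_overrides(flat: dict[str, Any], sep: str = SEPARATOR) -> dict[str, Any]:
    """Hypothetical helper: expand separator-joined keys into nested dicts."""
    nested: dict[str, Any] = {}
    for key, value in flat.items():
        # "data__holdout" -> parents ["data"], leaf "holdout"
        *parents, leaf = key.split(sep)
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


matching = nest_overrides({"data__holdout": 0.1})
# With a different separator the key is never split, so no nesting happens.
mismatched = nest_overrides({"data__holdout": 0.1}, sep=".")
```

With the matching separator the result is `{"data": {"holdout": 0.1}}`; with the mismatched one the key stays flat as `"data__holdout"`, which is why all three places must agree on the separator.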
VERBOSITY_TO_LOG_LEVEL = {0: 'INFO', 1: 'DEBUG', 2: 'DEBUG_DEPENDENCIES'}
module-attribute
Mapping from CLI verbosity level to log level.
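As a rough illustration of how such a mapping might be consumed, the sketch below resolves a verbosity count (for example, the number of `-v` flags) to a level name. The helper `resolve_log_level` is hypothetical, and the clamping of out-of-range values is an assumption; the source does not state how the CLI handles a verbosity above 2.

```python
VERBOSITY_TO_LOG_LEVEL = {0: "INFO", 1: "DEBUG", 2: "DEBUG_DEPENDENCIES"}


def resolve_log_level(verbosity: int) -> str:
    """Hypothetical helper: look up a log level, clamping to the known range."""
    lowest = min(VERBOSITY_TO_LOG_LEVEL)
    highest = max(VERBOSITY_TO_LOG_LEVEL)
    # Assumption: values outside the mapping fall back to the nearest known level.
    clamped = max(lowest, min(verbosity, highest))
    return VERBOSITY_TO_LOG_LEVEL[clamped]
```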
common_setup(settings, resume=False, phase=None, auto_discover_adapter=False, wandb_resume_job_id=None, skip_wandb=False, quiet=False, run_name=None)
Common setup for all run commands using unified CLISettings.
The setup order is:
1. Create Workdir (establishes artifact paths)
2. Initialize logging (using workdir.log_file)
3. Create DatasetRegistry from settings.dataset_registry if present, otherwise create an empty registry
4. Load dataset from registry if settings.data_source is a known name, otherwise from data_source
5. Load config with overrides from dataset overrides and command line overrides
6. Initialize wandb
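Steps 3 and 4 amount to a lookup with a fallback, which can be sketched as follows. Everything here is hypothetical: `RegistryEntry`, `resolve_dataset`, and the example paths are stand-ins for illustration and do not reflect the real DatasetRegistry API.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class RegistryEntry:
    """Hypothetical registry record: a dataset location plus its config overrides."""
    location: str
    overrides: dict[str, Any] = field(default_factory=dict)


def resolve_dataset(
    data_source: str, registry: dict[str, RegistryEntry]
) -> tuple[str, dict[str, Any]]:
    """Return (location, dataset overrides) for a registered name or a raw source."""
    entry = registry.get(data_source)
    if entry is not None:
        # Known name: use the registered location and its dataset-level overrides.
        return entry.location, dict(entry.overrides)
    # Unknown name: treat data_source itself as the location, with no overrides.
    return data_source, {}


registry = {"adult": RegistryEntry("datasets/adult.csv", {"data": {"holdout": 0.1}})}
known = resolve_dataset("adult", registry)
unknown = resolve_dataset("local.csv", registry)
```

An empty registry (the fallback in step 3) makes every `data_source` take the second branch, so the same code path serves both cases.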
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| settings | 'CLISettings' | Unified CLI settings (includes all config from env vars and CLI args) | required |
| resume | bool | If True, attempt to resume from an existing workdir | False |
| phase | str \| None | The current phase (train, generate, end_to_end) | None |
| auto_discover_adapter | bool | If True and resume=True, auto-discover the latest trained adapter | False |
| wandb_resume_job_id | str \| None | Optional wandb run ID or path to file containing the ID to resume | None |
| skip_wandb | bool | If | False |
| quiet | bool | If | False |
| run_name | str \| None | Explicit run name for the artifact directory (e.g. | None |
Returns:
| Type | Description |
|---|---|
| 'CategoryLogger' | Tuple of (logger, config, dataframe, workdir). For generate-only runs with cached datasets, dataframe may be None (loaded from cached files by SafeSynthesizer). |
| SafeSynthesizerParameters | |
Source code in src/nemo_safe_synthesizer/cli/utils.py
merge_overrides(config_path, overrides)
Merge overrides into a SafeSynthesizerParameters object.
If config_path is None, use the overrides to create a new SafeSynthesizerParameters object. Otherwise, merge the overrides into the parameters loaded from the config file.
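The two branches can be sketched with plain dictionaries. This is a minimal illustration under stated assumptions, not the library's implementation: `deep_merge` and `merge_overrides_sketch` are hypothetical names, the recursive merge of nested keys is assumed rather than documented, and the real function returns a SafeSynthesizerParameters object rather than a dict.

```python
import copy
from typing import Any, Optional


def deep_merge(base: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict where override values win; nested dicts merge recursively."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def merge_overrides_sketch(
    file_config: Optional[dict[str, Any]], overrides: dict[str, Any]
) -> dict[str, Any]:
    """Hypothetical stand-in: file_config plays the role of the loaded YAML file."""
    if file_config is None:
        # No config file: the overrides alone define the parameters.
        return copy.deepcopy(overrides)
    return deep_merge(file_config, overrides)


from_file = {"data": {"holdout": 0.05, "seed": 1}}
merged = merge_overrides_sketch(from_file, {"data": {"holdout": 0.1}})
fresh = merge_overrides_sketch(None, {"data": {"holdout": 0.1}})
```

In this sketch the inputs are copied rather than mutated, so the file-derived values remain available for comparison after merging.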
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config_path | str \| Path \| None | Path to config file (YAML) | required |
| overrides | dict | Dictionary of override values | required |
Returns:
| Type | Description |
|---|---|
| SafeSynthesizerParameters | Merged SafeSynthesizerParameters |