utils
Shared utilities for the data actions framework.
Provides ActionCtx (execution context with state and dependency injection),
TransformsUtil (wrapper around the transforms_v2 engine), helper types
(MetadataColumns, TransformsUpdate), and subclass-discovery functions.
Classes:
| Name | Description |
|---|---|
| MetadataColumns | Internal column names injected during validation phases. |
| TransformsUpdate | Typed wrapper for a single transforms_v2 update step. |
| TransformsUtil | Wrapper around a transforms_v2 Environment for executing column updates and drop conditions. |
| DataSource | Abstract base for pluggable data sources used by GenDataSource actions. |
| ActionCtx | Execution context shared across all action invocations. |
Functions:
| Name | Description |
|---|---|
| type_alias_fn | Pydantic alias generator that maps type_ to type for YAML compatibility. |
| remove_metadata_columns_from_df | Drop all MetadataColumns from the DataFrame in-place. |
| remove_metadata_columns_from_records | Return a copy of each record dict with MetadataColumns keys removed. |
| is_abstract | Return True if the class has abstract methods or directly inherits ABC. |
| all_subclasses | Recursively collect all subclasses of klass. |
| concrete_subclasses | Return all non-abstract recursive subclasses of klass. |
| guess_datetime_format | Infer a strftime-compatible format string from a date string, or None. |
MetadataColumns
Bases: StrEnum
Internal column names injected during validation phases.
Attributes:
| Name | Type | Description |
|---|---|---|
| INDEX | | Temporary index for mapping back to pre-transformed records. |
| REJECT_REASON | | Reason a row was rejected during batch validation. |
TransformsUpdate
pydantic-model
Bases: BaseModel
Typed wrapper for a single transforms_v2 update step.
Fields:
TransformsUtil(seed=None)
Wrapper around a transforms_v2 Environment for executing column updates and drop conditions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| seed | Optional[int] | Random seed passed to the underlying transforms_v2 Environment. | None |
Source code in src/nemo_safe_synthesizer/data_processing/actions/utils.py
DataSource
pydantic-model
Bases: BaseModel, ABC
Abstract base for pluggable data sources used by GenDataSource actions.
Subclasses implement generate_data to populate a column in an existing
DataFrame. generate_records is a convenience wrapper that creates an
empty DataFrame first.
Config:
alias_generator: type_alias_fn
ActionCtx(**data)
pydantic-model
Bases: BaseModel
Execution context shared across all action invocations.
Provides a random seed, a state dictionary for cross-phase communication,
and a lazily-initialized TransformsUtil for expression evaluation.
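The lazy-initialization pattern described above might look roughly like the following. This is a minimal sketch, not the actual model: it uses a plain dataclass instead of a Pydantic BaseModel, the attribute names `seed`, `state`, and `transforms_util` are assumptions (the field list is not shown in this reference), and `FakeTransformsUtil` is a hypothetical stand-in for TransformsUtil.

```python
import random
from dataclasses import dataclass, field
from functools import cached_property
from typing import Any, Optional


class FakeTransformsUtil:
    """Hypothetical stand-in; only the seed argument comes from the reference."""

    def __init__(self, seed: Optional[int] = None):
        self.rng = random.Random(seed)


@dataclass
class ActionCtxSketch:
    # Assumed attribute names, chosen for illustration only.
    seed: Optional[int] = None
    state: dict[str, Any] = field(default_factory=dict)

    @cached_property
    def transforms_util(self) -> FakeTransformsUtil:
        # Built on first access, then cached on the instance.
        return FakeTransformsUtil(seed=self.seed)


ctx = ActionCtxSketch(seed=7)
ctx.state["rejected_rows"] = 3   # one phase writes, a later phase can read
first = ctx.transforms_util      # constructed here, not in __init__
```

Deferring construction this way means contexts that never evaluate an expression never pay for building the helper, while every caller that does use it shares one seeded instance.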
Fields:
type_alias_fn(field_name)
Pydantic alias generator that maps type_ to type for YAML compatibility.
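A function with this behavior could be as small as the sketch below. It assumes that every field name other than `type_` passes through unchanged, which the reference does not state.

```python
def type_alias_fn(field_name: str) -> str:
    """Map the Python-safe field name ``type_`` to the YAML key ``type``."""
    # Assumption: all other field names are returned as-is.
    return "type" if field_name == "type_" else field_name
```

Pydantic calls an alias generator once per field, so a model that declares `type_` (avoiding the shadowed builtin) can still be populated from a YAML document that uses the key `type`.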
remove_metadata_columns_from_df(df)
Drop all MetadataColumns from the DataFrame in-place.
remove_metadata_columns_from_records(records)
Return a copy of each record dict with MetadataColumns keys removed.
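As an illustration of the copy semantics, here is a minimal sketch. The enum values are hypothetical (the real column names are not listed in this reference), and `(str, Enum)` is used in place of StrEnum so the example also runs on Python versions before 3.11.

```python
from enum import Enum


class MetadataColumns(str, Enum):
    # Hypothetical values, for illustration only.
    INDEX = "__index"
    REJECT_REASON = "__reject_reason"


def remove_metadata_columns_from_records(records):
    """Return new dicts without MetadataColumns keys; inputs are left untouched."""
    metadata_keys = {col.value for col in MetadataColumns}
    return [
        {key: value for key, value in record.items() if key not in metadata_keys}
        for record in records
    ]


records = [{"name": "a", "__index": 0, "__reject_reason": "too long"}]
cleaned = remove_metadata_columns_from_records(records)
```

Unlike the DataFrame variant, which is documented as operating in-place, this version leaves the caller's records unchanged.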
is_abstract(c)
Return True if the class has abstract methods or directly inherits ABC.
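The two conditions in that description can be expressed with the standard library alone. The sketch below is one plausible reading, not the module's actual source; the example classes are hypothetical.

```python
import inspect
from abc import ABC, abstractmethod


def is_abstract(c: type) -> bool:
    # inspect.isabstract covers classes with unimplemented abstract methods;
    # the __bases__ check covers classes that list ABC as a direct parent.
    return inspect.isabstract(c) or ABC in c.__bases__


class Base(ABC):  # hypothetical example hierarchy
    @abstractmethod
    def run(self) -> None: ...


class Marker(Base, ABC):  # implements run() but still lists ABC directly
    def run(self) -> None:
        pass


class Concrete(Marker):
    pass
```

Here `Marker` has no abstract methods left, so only the direct-ABC check flags it; `Concrete` inherits ABC indirectly and is treated as instantiable.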
all_subclasses(klass)
Recursively collect all subclasses of klass.
concrete_subclasses(klass)
Return all non-abstract recursive subclasses of klass.
Used by pydantic discriminated unions (e.g., ActionT) to
auto-discover instantiable action types for validation and schema
generation.
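A rough sketch of how recursive discovery and the abstractness filter could combine is shown below. It returns lists (the real return types are not shown in this reference), and the `Action` hierarchy is hypothetical.

```python
import inspect
from abc import ABC, abstractmethod


def all_subclasses(klass):
    # Walk __subclasses__() depth-first so grandchildren are included.
    found = []
    for sub in klass.__subclasses__():
        found.append(sub)
        found.extend(all_subclasses(sub))
    return found


def concrete_subclasses(klass):
    # Keep only classes that have no abstract methods and do not list ABC directly.
    return [
        sub
        for sub in all_subclasses(klass)
        if not (inspect.isabstract(sub) or ABC in sub.__bases__)
    ]


class Action(ABC):  # hypothetical hierarchy
    @abstractmethod
    def apply(self): ...


class ColumnAction(Action):  # still abstract: apply() is not implemented
    pass


class DropColumn(ColumnAction):
    def apply(self):
        return "drop"


class RenameColumn(ColumnAction):
    def apply(self):
        return "rename"


names = sorted(cls.__name__ for cls in concrete_subclasses(Action))
```

With this hierarchy `names` contains only `DropColumn` and `RenameColumn`. Because discovery relies on `__subclasses__()`, a subclass is only found once the module that defines it has been imported.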
guess_datetime_format(datetime_str)
Infer a strftime-compatible format string from a date string, or None.
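One simple way to get this contract is to try a fixed list of candidate formats and return the first that parses. The sketch below uses that approach as an assumption; the actual inference strategy and the set of recognized formats are not documented here, and `CANDIDATE_FORMATS` is hypothetical.

```python
from datetime import datetime
from typing import Optional

# Hypothetical candidate list, ordered from most to least specific.
CANDIDATE_FORMATS = (
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d",
    "%m/%d/%Y",
)


def guess_datetime_format(datetime_str: str) -> Optional[str]:
    for fmt in CANDIDATE_FORMATS:
        try:
            # strptime raises ValueError when the string does not match fmt.
            datetime.strptime(datetime_str, fmt)
        except ValueError:
            continue
        return fmt
    return None


iso_date = guess_datetime_format("2024-03-15")
us_date = guess_datetime_format("03/15/2024")
unknown = guess_datetime_format("not a date")
```

With a candidate list, order matters: an ambiguous string such as `01/02/2024` resolves to whichever matching format appears first.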