Data Designer Configuration
DataDesignerConfig is the main configuration object for building datasets with Data Designer. It is a declarative configuration that defines the dataset you want to generate column by column, including options for dataset post-processing, validation, and profiling.
Generally, you should use the DataDesignerConfigBuilder to build your configuration, but you can also build it manually by instantiating the DataDesignerConfig class directly.
Classes:
| Name | Description |
|---|---|
| DataDesignerConfig | Configuration for NeMo Data Designer. |
DataDesignerConfig
Bases: ExportableConfigBase
Configuration for NeMo Data Designer.
This class defines the main configuration structure for NeMo Data Designer, which orchestrates the generation of synthetic data.
Attributes:
| Name | Type | Description |
|---|---|---|
| columns | list[Annotated[ColumnConfigT, Field(discriminator='column_type')]] | Required list of column configurations defining how each column should be generated. Must contain at least one column. |
| model_configs | Optional[list[ModelConfig]] | Optional list of model configurations for LLM-based generation. Each model config defines the model, provider, and inference parameters. |
| seed_config | Optional[SeedConfig] | Optional seed dataset settings to use for generation. |
| constraints | Optional[list[ColumnConstraintT]] | Optional list of column constraints. |
| profilers | Optional[list[ColumnProfilerConfigT]] | Optional list of column profilers for analyzing generated data characteristics. |
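
The attribute table describes two structural rules: columns is required and must hold at least one entry, and each entry is resolved to a concrete column class through its column_type discriminator. The following is a minimal sketch of those two rules using only the Python standard library. It does not use the Data Designer package: SketchConfig, SamplerColumn, LLMTextColumn, parse_column, and the "sampler" and "llm-text" discriminator values are all hypothetical stand-ins chosen for illustration, and the real classes may behave differently.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SamplerColumn:
    """Hypothetical column type (not a Data Designer class)."""
    name: str
    column_type: str = "sampler"


@dataclass
class LLMTextColumn:
    """Hypothetical column type (not a Data Designer class)."""
    name: str
    prompt: str = ""
    column_type: str = "llm-text"


# Maps each discriminator value to the class that should be built for it.
COLUMN_TYPES = {"sampler": SamplerColumn, "llm-text": LLMTextColumn}


def parse_column(raw: dict[str, Any]):
    """Select the column class from the 'column_type' discriminator field."""
    kind = raw.get("column_type")
    if kind not in COLUMN_TYPES:
        raise ValueError(f"unknown column_type: {kind!r}")
    return COLUMN_TYPES[kind](**raw)


@dataclass
class SketchConfig:
    """Stand-in mirroring the attribute table: one required list, four optional fields."""
    columns: list
    model_configs: Optional[list] = None
    seed_config: Optional[dict] = None
    constraints: Optional[list] = None
    profilers: Optional[list] = None

    def __post_init__(self):
        # Mirrors the documented rule that at least one column is required.
        if not self.columns:
            raise ValueError("columns must contain at least one column")


config = SketchConfig(
    columns=[
        parse_column({"column_type": "sampler", "name": "topic"}),
        parse_column({"column_type": "llm-text", "name": "question",
                      "prompt": "Write a question about the topic."}),
    ]
)
```

In this sketch, passing an empty columns list or an unrecognized column_type raises a ValueError, while every optional attribute defaults to None when omitted.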