# evaluation_dataset
Classes:

| Name | Description |
|---|---|
| `EvaluationDataset` | Paired reference and output dataframes prepared for evaluation. |
## EvaluationDataset *pydantic-model*

Bases: `BaseModel`
Paired reference and output dataframes prepared for evaluation.
On construction, the validator computes per-column `EvaluationField` instances, counts memorized lines, and records dataset dimensions. Use `from_dataframes` to build an instance with optional column/row subsampling.
Config:

- `arbitrary_types_allowed`: `True`
Fields:

- `reference` (`DataFrame`)
- `output` (`DataFrame`)
- `test` (`DataFrame | None`)
- `reference_rows` (`int`)
- `reference_cols` (`int`)
- `output_rows` (`int`)
- `output_cols` (`int`)
- `memorized_lines` (`int`)
- `column_statistics` (`dict[str, ColumnStatistics] | None`)
- `evaluation_fields` (`list[EvaluationField]`)
Validators:

- `validate`
### `reference = pd.DataFrame()` *pydantic-field*

Training (reference) dataframe.
### `output = pd.DataFrame()` *pydantic-field*

Synthetic (output) dataframe.
### `test = None` *pydantic-field*

Optional holdout dataframe for text-similarity and privacy metrics.
### `reference_rows = 0` *pydantic-field*

Row count of the reference dataframe.
### `reference_cols = 0` *pydantic-field*

Column count of the reference dataframe.
### `output_rows = 0` *pydantic-field*

Row count of the output dataframe.
### `output_cols = 0` *pydantic-field*

Column count of the output dataframe.
### `memorized_lines = 0` *pydantic-field*

Number of exact row matches between reference and output.
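One plausible way to compute such a count with pandas, shown as an illustrative sketch only (the library's actual implementation may differ):

```python
import pandas as pd

# Illustrative sketch: count rows of `reference` that also appear
# verbatim in `output` (not necessarily the library's exact logic).
reference = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})
output = pd.DataFrame({"a": [1, 4, 3], "b": ["x", "q", "z"]})

# Inner-merge on all shared columns; surviving rows appear in both frames.
memorized = reference.merge(output.drop_duplicates(), how="inner")
memorized_lines = len(memorized)
print(memorized_lines)  # 2
```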
### `column_statistics = None` *pydantic-field*

Per-column PII entity counts and transform metadata.
### `evaluation_fields = list()` *pydantic-field*

Per-column evaluation metadata and distribution scores.
### `check_dataframe(df, df_name)` *staticmethod*

Raise `ValueError` if `df` is None or empty.

Source code in `src/nemo_safe_synthesizer/evaluation/data_model/evaluation_dataset.py`
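A minimal sketch of the documented behavior (assumed logic; the actual implementation lives in the source file above):

```python
import pandas as pd

# Minimal sketch of the documented check (assumed behavior):
# raise ValueError when the dataframe is None or has no rows.
def check_dataframe(df, df_name):
    if df is None or df.empty:
        raise ValueError(f"{df_name} dataframe is None or empty")

check_dataframe(pd.DataFrame({"a": [1]}), "reference")  # passes silently
try:
    check_dataframe(pd.DataFrame(), "output")
except ValueError as exc:
    print(exc)
```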
### `get_columns_of_type(types, mode='reference')`

Return column names whose `FieldType` is in `types`.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `types` | `set[FieldType]` | Set of `FieldType` values to match. | *required* |
| `mode` | `str` | Which dataframe's field features to inspect: `'reference'` or `'output'`. | `'reference'` |

Returns:

| Type | Description |
|---|---|
| `list[str]` | List of matching column names. |

Source code in `src/nemo_safe_synthesizer/evaluation/data_model/evaluation_dataset.py`
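`FieldType` is library-specific, but the idea of filtering columns by inferred type can be illustrated with plain pandas dtypes (an analogy only, not the library's API):

```python
import pandas as pd

# Analogy: pandas dtype kinds stand in for the library's FieldType.
df = pd.DataFrame({"age": [30, 41], "name": ["ann", "bo"], "score": [0.5, 0.9]})

# Select column names whose dtype falls in the requested "type set".
numeric_cols = list(df.select_dtypes(include="number").columns)
print(numeric_cols)  # ['age', 'score']
```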
### `get_tabular_columns(mode='reference')`

Return columns classified as binary, categorical, or numeric.

Source code in `src/nemo_safe_synthesizer/evaluation/data_model/evaluation_dataset.py`
### `get_nominal_columns(mode='reference')`

Return columns classified as binary or categorical.

### `get_text_columns(mode='reference')`

Return columns classified as text.
### `subsample_columns(reference, output, test=None, target_column_count=DEFAULT_SQS_REPORT_COLUMNS, mandatory_columns=None)` *staticmethod*

Reduce dataframes to their shared columns, optionally subsampling columns.

Mandatory columns are always included. A fixed random seed ensures reproducible column selection across evaluation components.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `reference` | `DataFrame` | Training dataframe. | *required* |
| `output` | `DataFrame` | Synthetic dataframe. | *required* |
| `test` | `DataFrame \| None` | Optional holdout dataframe. | `None` |
| `target_column_count` | `int` | Maximum number of columns to keep. | `DEFAULT_SQS_REPORT_COLUMNS` |
| `mandatory_columns` | `list[str] \| None` | Columns that must be included regardless. | `None` |

Returns:

| Type | Description |
|---|---|
| `tuple` | Tuple of (reference, output, test) dataframes restricted to the selected column set. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If reference and output share no columns. |

Source code in `src/nemo_safe_synthesizer/evaluation/data_model/evaluation_dataset.py`
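The fixed-seed selection can be sketched as follows; the seed value, ordering, and column-sampling strategy here are assumptions for illustration, not the library's actual implementation:

```python
import random
import pandas as pd

# Sketch of reproducible column subsampling (seed 42 is an assumption).
reference = pd.DataFrame({c: [0] for c in ["a", "b", "c", "d", "e"]})
output = pd.DataFrame({c: [0] for c in ["b", "c", "d", "e", "f"]})

shared = [c for c in reference.columns if c in output.columns]
if not shared:
    raise ValueError("reference and output share no columns")

mandatory = ["e"]            # always kept
target_column_count = 3
optional = [c for c in shared if c not in mandatory]

rng = random.Random(42)      # fixed seed -> identical selection every run
picked = mandatory + rng.sample(optional, target_column_count - len(mandatory))

# Restrict both frames to the selected column set.
reference_sub = reference[picked]
output_sub = output[picked]
print(sorted(picked))
```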
### `subsample_rows(reference, output, target_record_count=DEFAULT_RECORD_COUNT)` *staticmethod*

Downsample both dataframes to at most `target_record_count` rows.

Source code in `src/nemo_safe_synthesizer/evaluation/data_model/evaluation_dataset.py`
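Conceptually (an illustrative sketch; the library's sampling strategy and seed are not shown here):

```python
import pandas as pd

# Illustrative downsampling of both frames to at most the target count.
# A frame already at or below the target is left at its own size.
reference = pd.DataFrame({"a": range(10)})
output = pd.DataFrame({"a": range(4)})
target_record_count = 5

reference_small = reference.sample(
    n=min(target_record_count, len(reference)), random_state=0
)
output_small = output.sample(
    n=min(target_record_count, len(output)), random_state=0
)
print(len(reference_small), len(output_small))  # 5 4
```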
### `from_dataframes(reference, output, test=None, column_statistics=None, rows=DEFAULT_RECORD_COUNT, cols=DEFAULT_SQS_REPORT_COLUMNS, mandatory_columns=None, enable_sampling=True)` *staticmethod*

Build an `EvaluationDataset` with optional column/row subsampling.

This is the primary constructor for evaluation. It validates inputs, optionally subsamples columns and rows, then delegates to the Pydantic model validator, which computes per-column evaluation fields.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `reference` | `DataFrame` | Training dataframe. | *required* |
| `output` | `DataFrame` | Synthetic dataframe. | *required* |
| `test` | `DataFrame \| None` | Optional holdout dataframe for text-similarity and privacy metrics. | `None` |
| `column_statistics` | `dict[str, ColumnStatistics] \| None` | Per-column PII entity metadata. | `None` |
| `rows` | `int` | Target row count for subsampling. | `DEFAULT_RECORD_COUNT` |
| `cols` | `int` | Target column count for subsampling. | `DEFAULT_SQS_REPORT_COLUMNS` |
| `mandatory_columns` | `list[str] \| None` | Columns to always include in subsampling. | `None` |
| `enable_sampling` | `bool` | When `True`, apply column and row subsampling. | `True` |

Returns:

| Type | Description |
|---|---|
| `EvaluationDataset` | A fully initialized `EvaluationDataset`. |