evaluation_field
Classes:
| Name | Description |
|---|---|
| EvaluationField | Per-column evaluation metadata and distribution scores. |
EvaluationField
pydantic-model
Bases: BaseModel
Per-column evaluation metadata and distribution scores.
Fields:
- name (str)
- reference_field_features (FieldFeatures)
- output_field_features (FieldFeatures)
- reference_distribution (dict | None)
- output_distribution (dict | None)
- distribution_distance (float | None)
- distribution_stability (EvaluationScore | None)
- column_statistics (ColumnStatistics | None)
name
pydantic-field
Column name from the original dataframe.
reference_field_features
pydantic-field
Field type and descriptive statistics for the reference column.
output_field_features
pydantic-field
Field type and descriptive statistics for the output column.
reference_distribution
pydantic-field
Binned distribution dict for the reference column.
output_distribution
pydantic-field
Binned distribution dict for the output column.
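To make the idea of a distribution dict concrete, here is a minimal sketch for categorical values. The function name `categorical_distribution` is hypothetical, and the choice to skip `None` and to normalize counts to relative frequencies is an assumption; the library's actual binning (especially for numeric columns) may differ.

```python
from collections import Counter


def categorical_distribution(values):
    """Return {category: relative frequency}, skipping None values.

    Hypothetical helper for illustration; not the library's implementation.
    """
    present = [v for v in values if v is not None]
    if not present:
        return {}
    counts = Counter(present)
    total = len(present)
    return {category: count / total for category, count in counts.items()}


# Two small columns standing in for reference and output data.
reference_dist = categorical_distribution(["a", "a", "b", None])
output_dist = categorical_distribution(["a", "b", "b", "c"])
```

Both dicts use categories as keys, so they can be compared key by key even when one column contains a category the other lacks.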
distribution_distance
pydantic-field
Jensen-Shannon distance between the two distributions.
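For readers unfamiliar with the metric, the following sketch computes a Jensen-Shannon distance between two distribution dicts in plain Python. The helper name `js_distance` and the base-2 logarithm are assumptions made for illustration; the library may use a different base or a numerical package. The distance is the square root of the JS divergence, and with base 2 it falls between 0 (identical) and 1 (no overlap).

```python
import math


def js_distance(p, q, base=2.0):
    """Jensen-Shannon distance between two {bin: probability} dicts.

    Hypothetical helper; bins missing from one dict count as probability 0.
    """
    keys = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # KL(a || m); terms with zero probability contribute nothing.
        return sum(
            a[k] * math.log(a[k] / m[k], base)
            for k in keys
            if a.get(k, 0.0) > 0.0
        )

    divergence = 0.5 * kl(p) + 0.5 * kl(q)
    # Guard against tiny negative values from floating-point rounding.
    return math.sqrt(max(divergence, 0.0))


same = js_distance({"a": 0.5, "b": 0.5}, {"a": 0.5, "b": 0.5})
disjoint = js_distance({"a": 1.0}, {"b": 1.0})
```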
distribution_stability
pydantic-field
Graded score derived from the distribution distance.
column_statistics
pydantic-field
PII entity counts and transform metadata, if available.
from_series(name, reference, output, column_statistics=None)
staticmethod
Build an EvaluationField from a paired reference and output column.
This is normally called internally by EvaluationDataset; direct use is
rarely needed.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Column name. | required |
| reference | Series | Reference column data. | required |
| output | Series | Output (synthetic) column data. | required |
| column_statistics | ColumnStatistics \| None | PII entity metadata to attach, if available. | None |
Returns:
| Type | Description |
|---|---|
| EvaluationField | A fully populated EvaluationField and stability score. |
Source code in src/nemo_safe_synthesizer/evaluation/data_model/evaluation_field.py
get_average_divergence(fields)
staticmethod
Compute the mean Jensen-Shannon divergence across a list of fields.
Source code in src/nemo_safe_synthesizer/evaluation/data_model/evaluation_field.py
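The averaging step can be sketched as follows. `FieldStub` is a hypothetical stand-in for EvaluationField that carries only the two attributes used here, and the handling of missing distances (skipping `None`, returning `None` for an empty list) is an assumption rather than confirmed library behavior.

```python
from dataclasses import dataclass
from statistics import fmean
from typing import Optional


@dataclass
class FieldStub:
    """Hypothetical stand-in for EvaluationField (illustration only)."""

    name: str
    distribution_distance: Optional[float]


def average_divergence(fields):
    """Mean of the available distances; None if no field has one (assumed)."""
    distances = [
        f.distribution_distance
        for f in fields
        if f.distribution_distance is not None
    ]
    return fmean(distances) if distances else None


fields = [
    FieldStub("age", 0.1),
    FieldStub("notes", None),  # no distance could be computed for this column
    FieldStub("city", 0.3),
]
mean_value = average_divergence(fields)
```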
text_js_scaling_func(average_divergence)
staticmethod
Scale average JS divergence for text data using a linear equation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| average_divergence | float | Mean JS divergence across text fields. | required |
Returns:
| Type | Description |
|---|---|
| float | A score in the range |
Source code in src/nemo_safe_synthesizer/evaluation/data_model/evaluation_field.py
tabular_js_scaling_func(average_divergence)
staticmethod
Scale average JS divergence for tabular data using a quadratic equation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| average_divergence | float | Mean JS divergence across tabular fields. | required |
Returns:
| Type | Description |
|---|---|
| float | A score in the range |
Source code in src/nemo_safe_synthesizer/evaluation/data_model/evaluation_field.py
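To illustrate the difference between a linear and a quadratic scaling, the sketch below maps a divergence in [0, 1] onto a 0 to 10 score in both ways. The function names and every coefficient are hypothetical, chosen only to show the shapes; the library's actual equations are not reproduced here.

```python
def _clamp(score, low=0.0, high=10.0):
    """Keep a score inside [low, high]."""
    return max(low, min(high, score))


def hypothetical_linear_scaling(average_divergence):
    # Illustrative coefficients: the score falls at a constant rate.
    return _clamp(10.0 * (1.0 - average_divergence))


def hypothetical_quadratic_scaling(average_divergence):
    # Illustrative coefficients: the score falls faster at low divergence.
    return _clamp(10.0 * (1.0 - average_divergence) ** 2)


linear_mid = hypothetical_linear_scaling(0.5)
quadratic_mid = hypothetical_quadratic_scaling(0.5)
```

With these made-up coefficients, the quadratic curve penalizes a mid-range divergence more heavily than the linear one while both agree at the endpoints.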
get_field_distribution_stability(average_divergence, js_scaling_func=None)
staticmethod
Convert an average JS divergence into a graded EvaluationScore.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| average_divergence | float | Mean JS divergence across fields. | required |
| js_scaling_func | Callable[[float], float] \| None | Scaling function mapping divergence to a score from 0 to 10. Defaults to | None |
Returns:
| Type | Description |
|---|---|
| EvaluationScore | A finalized |
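The pattern of an optional scaling callable can be sketched like this. Everything here is an assumption for illustration: `ScoreStub` stands in for EvaluationScore, whose real fields are not shown on this page, and the default scaling and the grade thresholds are invented.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ScoreStub:
    """Hypothetical stand-in for EvaluationScore (illustration only)."""

    raw_score: float
    grade: str


def _default_scaling(divergence: float) -> float:
    # Invented default: linear map from divergence to a 0 to 10 score.
    return max(0.0, min(10.0, 10.0 * (1.0 - divergence)))


def grade_stability(
    average_divergence: float,
    js_scaling_func: Optional[Callable[[float], float]] = None,
) -> ScoreStub:
    """Scale a divergence, then bucket it using made-up thresholds."""
    scale = js_scaling_func if js_scaling_func is not None else _default_scaling
    score = scale(average_divergence)
    if score >= 8.0:
        grade = "high"
    elif score >= 5.0:
        grade = "medium"
    else:
        grade = "low"
    return ScoreStub(raw_score=score, grade=grade)


default_result = grade_stability(0.1)
custom_result = grade_stability(0.1, js_scaling_func=lambda d: 0.0)
```

Passing the scaling function as an argument lets the same grading logic serve text and tabular fields, each with its own curve.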