evaluator

Evaluator entry point for synthetic data evaluation.

Orchestrates metric computation and report assembly by delegating to MultimodalReport and collecting timing information.

Classes:

Evaluator: Orchestrates evaluation of synthetic data against reference data.

Evaluator(config, generate_results, pii_replacer_time=None, column_statistics=None, train_df=None, test_df=None, workdir=None)

Orchestrates evaluation of synthetic data against reference data.

Computes quality and privacy metrics by delegating to MultimodalReport, which assembles individual evaluation components (distribution stability, correlation, PCA, text similarity, privacy scores, etc.).

Parameters:

config (SafeSynthesizerParameters, required): Pipeline configuration controlling which metrics are enabled.
generate_results (GenerateJobResults | DataFrame, required): Synthetic output, either a GenerateJobResults or a raw pd.DataFrame.
pii_replacer_time (float | None, default None): Wall-clock seconds spent on PII replacement, if any.
column_statistics (dict[str, ColumnStatistics] | None, default None): Per-column PII entity counts and transform metadata.
train_df (DataFrame | None, default None): Reference (training) dataframe.
test_df (DataFrame | None, default None): Holdout (test) dataframe used by the text-similarity and privacy metrics.
workdir (Workdir | None, default None): Working directory for persisting artifacts.

Methods:

evaluate: Run all configured evaluation components and store results.

Source code in src/nemo_safe_synthesizer/evaluation/evaluator.py
def __init__(
    self,
    config: SafeSynthesizerParameters,
    generate_results: GenerateJobResults | pd.DataFrame,
    pii_replacer_time: float | None = None,
    column_statistics: dict[str, ColumnStatistics] | None = None,
    train_df: pd.DataFrame | None = None,
    test_df: pd.DataFrame | None = None,
    workdir: Workdir | None = None,
):
    self.config = config
    self.generate_results = generate_results
    self.pii_replacer_time = pii_replacer_time
    self.column_statistics = column_statistics
    self.train_df = train_df
    self.test_df = test_df
    self.workdir = workdir
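
As a usage sketch (not part of the generated reference), the constructor can be exercised directly. The import paths below and the assumption that SafeSynthesizerParameters builds with default arguments are guesses rather than documented API; only the Evaluator module path is suggested by the "Source code in ..." note above.

import pandas as pd

# Assumed import locations; adjust to your installation.
from nemo_safe_synthesizer.evaluation.evaluator import Evaluator
from nemo_safe_synthesizer.parameters import SafeSynthesizerParameters  # hypothetical module path

# Small reference and synthetic frames for illustration only.
train_df = pd.DataFrame({"age": [34, 29, 51], "city": ["Austin", "Reno", "Kyoto"]})
synthetic_df = pd.DataFrame({"age": [33, 30, 49], "city": ["Dallas", "Reno", "Osaka"]})

# A raw DataFrame is accepted for generate_results; config is assumed to
# accept default construction here.
evaluator = Evaluator(
    config=SafeSynthesizerParameters(),
    generate_results=synthetic_df,
    train_df=train_df,
)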

evaluate()

Run all configured evaluation components and store results.

Populates self.report with the completed MultimodalReport and self.evaluation_time with the elapsed wall-clock seconds.

Source code in src/nemo_safe_synthesizer/evaluation/evaluator.py
def evaluate(self):
    """Run all configured evaluation components and store results.

    Populates ``self.report`` with the completed ``MultimodalReport``
    and ``self.evaluation_time`` with the elapsed wall-clock seconds.
    """
    logger.info("Performing Evaluation.")
    evaluation_start = time.monotonic()
    output = self.generate_results if isinstance(self.generate_results, pd.DataFrame) else self.generate_results.df
    report = MultimodalReport.from_dataframes(
        reference=self.train_df,  # ty: ignore[invalid-argument-type]
        output=output,
        test=self.test_df,
        config=self.config,
        column_statistics=self.column_statistics,
    )
    evaluation_time_sec = time.monotonic() - evaluation_start
    logger.info("Evaluation complete.")
    self.evaluation_time = evaluation_time_sec
    self.report = report
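
Continuing the sketch above, running the evaluation and reading the two attributes the docstring says are populated, self.report and self.evaluation_time:

# Runs all configured components; blocks until the MultimodalReport is assembled.
evaluator.evaluate()

report = evaluator.report            # completed MultimodalReport
elapsed = evaluator.evaluation_time  # wall-clock seconds
print(f"Evaluation finished in {elapsed:.2f}s")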