metadata

Classes:

Name                   Description
EntityReport
TypeReport             Information about scalar types of values within a given field.
FieldMetadataReport
EntitySummaryReport    Aggregate entity metrics by label.
DatasetMetadataReport  Represents a report with model_metadata about the dataset.

Functions:

Name               Description
convert_to_report  Converts the internal model_metadata object to a report.

EntityReport pydantic-model

Bases: ReportBaseModel

Config:

  • extra: forbid

Fields:

  • label (str)
  • count (int)
  • approx_distinct_count (int)
  • f_ratio (float)
  • sources (list[str])

sources pydantic-field

A list of unique sources that contributed predictions to the entity summary.

TypeReport pydantic-model

Bases: ReportBaseModel

Information about scalar types of values within a given field. See common.records.base.get_type_as_string.

Fields:

type pydantic-field

Name of the type.

count pydantic-field

Number of times a value of that type appeared.

FieldMetadataReport pydantic-model

Bases: ReportBaseModel

Fields:

name pydantic-field

Name of the field. For data formats that support nesting, it is a dot-delimited path through the parent fields, e.g. "user.firstName".
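
To illustrate the dot-delimited naming, here is a minimal sketch that flattens a nested record into field names of this form. The helper flatten_field_names is hypothetical and not part of this module; how the library itself walks nested data (for example, how it treats lists) is not shown on this page.

```python
from typing import Any


def flatten_field_names(record: dict[str, Any], parent: str = "") -> dict[str, Any]:
    """Map each leaf value to its dot-delimited field name (hypothetical helper)."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{parent}.{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested objects, extending the parent path.
            flat.update(flatten_field_names(value, name))
        else:
            flat[name] = value
    return flat


record = {"user": {"firstName": "Ada", "lastName": "Lovelace"}, "id": 7}
print(flatten_field_names(record))
```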

count pydantic-field

Number of records that had a value for this field.

approx_distinct_count pydantic-field

Approximate number of distinct values for this field.

missing_count pydantic-field

Number of records that did not have a value for this field.

labels pydantic-field

Field-level labels for this field.

attributes pydantic-field

Attributes for this field.

entities pydantic-field

Granular information about entities detected in the field's values.

types pydantic-field

Scalar types of values in this field.
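
As a rough illustration of how count, missing_count, and types relate, the sketch below computes them for one field over a small list of flat records. It assumes that a value counts as missing when the key is absent or the value is None, and it uses Python type names in place of whatever get_type_as_string returns. Both are assumptions, and summarize_field is a hypothetical helper rather than part of this module.

```python
from collections import Counter
from typing import Any


def summarize_field(records: list[dict[str, Any]], name: str) -> dict[str, Any]:
    """Compute count, missing_count, and type counts for one field (hypothetical helper)."""
    # Assumption: an absent key and an explicit None both count as "missing".
    values = [r[name] for r in records if r.get(name) is not None]
    return {
        "name": name,
        "count": len(values),
        "missing_count": len(records) - len(values),
        # Assumption: Python type names stand in for the library's type strings.
        "types": dict(Counter(type(v).__name__ for v in values)),
    }


records = [{"age": 31}, {"age": "unknown"}, {"age": None}, {}]
print(summarize_field(records, "age"))
```

Under these assumptions, count plus missing_count equals the number of records examined.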

EntitySummaryReport pydantic-model

Bases: ReportBaseModel

Aggregate entity metrics by label.

Fields:

label pydantic-field

Name of the entity or label.

fields pydantic-field

List of fields the entity was seen in.

count pydantic-field

Total number of entity occurrences in the dataset by score.

approx_distinct_count pydantic-field

Approximate number of unique entity values in the dataset.

sources pydantic-field

A list of unique sources that contributed predictions to the entity summary.
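
One way to picture the aggregation by label is the sketch below, which rolls per-field entity counts up into one summary per label. The input shape (tuples of field name, label, count, sources), the EntitySummary stand-in class, and the helper summarize_entities are all hypothetical; the library's actual aggregation is not shown on this page. approx_distinct_count is left out because distinct counts from separate fields cannot simply be added.

```python
from dataclasses import dataclass, field


@dataclass
class EntitySummary:
    """Stand-in for EntitySummaryReport, without approx_distinct_count."""
    label: str
    fields: list[str] = field(default_factory=list)
    count: int = 0
    sources: list[str] = field(default_factory=list)


def summarize_entities(rows: list[tuple[str, str, int, list[str]]]) -> list[EntitySummary]:
    """Aggregate (field, label, count, sources) rows into one summary per label."""
    by_label: dict[str, EntitySummary] = {}
    for field_name, label, count, sources in rows:
        summary = by_label.setdefault(label, EntitySummary(label=label))
        if field_name not in summary.fields:
            summary.fields.append(field_name)
        summary.count += count
        # Keep sources unique while preserving first-seen order.
        for source in sources:
            if source not in summary.sources:
                summary.sources.append(source)
    return list(by_label.values())


rows = [
    ("user.email", "email", 40, ["regex"]),
    ("notes", "email", 3, ["regex", "ner_model"]),
    ("notes", "person", 12, ["ner_model"]),
]
for summary in summarize_entities(rows):
    print(summary)
```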

DatasetMetadataReport pydantic-model

Bases: ReportBaseModel

Represents a report with model_metadata about the dataset.

Fields:

record_count pydantic-field

Number of records that were used to calculate model_metadata.

fields pydantic-field

Report about each field in the dataset.

entities pydantic-field

Aggregate entity metrics by score by label.

convert_to_report(metadata)

Converts the internal model_metadata object to a report.

Source code in src/nemo_safe_synthesizer/pii_replacer/ner/report/metadata.py
def convert_to_report(metadata: DatasetMetadata) -> DatasetMetadataReport:
    """Converts internal model_metadata object to the report."""
    fields_report = [_convert_field(field) for field in metadata.data.fields]
    entity_summary_report = [EntitySummaryReport(**e.dict()) for e in metadata.data.entities]

    return DatasetMetadataReport(
        record_count=metadata.project_record_count,
        fields=fields_report,
        entities=entity_summary_report,
    )
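
The entity conversion above builds each report by unpacking the internal object's fields as keyword arguments. The sketch below shows the same pattern with standard-library dataclasses standing in for the pydantic models; InternalEntity and EntityReportSketch are hypothetical names, and asdict plays the role that .dict() plays in the real code.

```python
from dataclasses import asdict, dataclass


@dataclass
class InternalEntity:
    """Stand-in for the internal entity summary object."""
    label: str
    count: int


@dataclass
class EntityReportSketch:
    """Stand-in for a report model with the same field names."""
    label: str
    count: int


internal = [InternalEntity("email", 43), InternalEntity("person", 12)]
# Field names become keyword arguments, so both classes must agree on them.
reports = [EntityReportSketch(**asdict(e)) for e in internal]
print(reports)
```

Because the fields are passed by name, a field that exists on the internal object but not on the report class raises an error in this sketch instead of being dropped silently.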