nemo_pii
Classes:
| Name | Description |
|---|---|
| ColumnClassification | Classification and detected-entity info for a column prior to transform. |
| NemoPII | PII replacement over DataFrames via classification, NER, and configurable transforms. |
Functions:
| Name | Description |
|---|---|
| classify_config_from_params | Build classification and NER config from PII replacer config. |
| build_entity_extractor | Build a composite entity extractor from classification config. |
| get_column_classifier | Return a column classifier backed by the NSS inference endpoint (NSS_INFERENCE_ENDPOINT, NSS_INFERENCE_KEY). |
Attributes:
| Name | Type | Description |
|---|---|---|
| ACCOUNTING_FUNCTIONS | | Transform function names tracked for report accounting (which functions were used per column). |
ACCOUNTING_FUNCTIONS = ['re', 'fake', 'random', 'hash', 'normalize', 'partial_mask', 'tld', 'date_shift', 'date_time_shift', 'date_format', 'date_time_format', 'detect_entities', 'redact_entities', 'label_entities', 'hash_entities', 'fake_entities', 'drop']
module-attribute
Transform function names tracked for report accounting (which functions were used per column).
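As a rough illustration of what "report accounting" might look like, the sketch below records, per column, which of the tracked function names were used. Only the `ACCOUNTING_FUNCTIONS` list comes from this page; the `account_functions` helper and the shape of its `applied` input are hypothetical and are not part of the module.

```python
# Copied from the module attribute documented above.
ACCOUNTING_FUNCTIONS = [
    "re", "fake", "random", "hash", "normalize", "partial_mask", "tld",
    "date_shift", "date_time_shift", "date_format", "date_time_format",
    "detect_entities", "redact_entities", "label_entities",
    "hash_entities", "fake_entities", "drop",
]


def account_functions(applied: dict[str, list[str]]) -> dict[str, list[str]]:
    """Hypothetical helper: keep only tracked function names, per column.

    `applied` maps a column name to the function names used on it.
    This input shape is an assumption, not the library's report format.
    """
    tracked = set(ACCOUNTING_FUNCTIONS)
    report: dict[str, list[str]] = {}
    for column, names in applied.items():
        # Preserve first-seen order; drop duplicates and untracked names.
        seen: list[str] = []
        for name in names:
            if name in tracked and name not in seen:
                seen.append(name)
        report[column] = seen
    return report


if __name__ == "__main__":
    # "lower" is not a tracked name, so it is left out of the report.
    print(account_functions({"email": ["fake", "lower", "fake"], "ssn": ["hash"]}))
```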
ColumnClassification
pydantic-model
Bases: BaseModel
Classification and detected-entity info for a column prior to transform.
When entity is None (e.g. unclassified), entity_count
is None and entity_values is an empty list.
Fields:
- field_name (str)
- column_type (str | None)
- entity (str | None)
- entity_count (int | None)
- entity_values (list[Any])
field_name
pydantic-field
Name of the field/column.
column_type
pydantic-field
Detected column type (e.g. text, numeric).
entity
pydantic-field
Detected entity type (e.g. email, phone), or None if none.
entity_count = None
pydantic-field
Number of non-empty values in this field. None if no entity detected.
entity_values
pydantic-field
Unique values for this field. Empty if no entity detected.
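The rule that an unclassified column carries no count and no values can be checked mechanically. The sketch below uses a standard-library dataclass as a stand-in that mirrors the documented field names; it is not the real pydantic model, and both `ColumnClassificationSketch` and `follows_unclassified_rule` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ColumnClassificationSketch:
    """Stdlib stand-in mirroring the documented fields (not the real model)."""

    field_name: str
    column_type: Optional[str]
    entity: Optional[str]
    entity_count: Optional[int] = None  # documented default
    # Assumption: an empty list by default; the page does not show this default.
    entity_values: list[Any] = field(default_factory=list)


def follows_unclassified_rule(c: ColumnClassificationSketch) -> bool:
    """Check the documented rule: no entity means no count and no values."""
    if c.entity is None:
        return c.entity_count is None and c.entity_values == []
    return True


if __name__ == "__main__":
    unclassified = ColumnClassificationSketch("notes", "text", None)
    print(follows_unclassified_rule(unclassified))
```

A check like this can be useful when classifications are built or edited by hand before being passed on, since nothing on this page says the model itself enforces the rule.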
NemoPII(config=None)
Bases: object
PII replacement over DataFrames via classification, NER, and configurable transforms.
Call classify_df to get column classifications, then transform_df to replace
PII. After transform_df, the transformed DataFrame and per-column statistics are available on result.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | PiiReplacerConfig \| None | PII replacer config. If | None |
Attributes:
| Name | Type | Description |
|---|---|---|
| result | TransformResult | Result of the last |
Example
    nemo_pii = NemoPII()
    nemo_pii.transform_df(df)
    result = nemo_pii.result
    print(result.transformed_df)
    print(result.column_statistics)
Methods:
| Name | Description |
|---|---|
| classify_df | Classify each column (type and entity) using config and optional LLM classifier. |
| transform_df | Replace PII in the DataFrame and set self.result. |
Source code in src/nemo_safe_synthesizer/pii_replacer/nemo_pii.py
classify_df(df)
Classify each column (type and entity) using config and optional LLM classifier.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame to classify. | required |
Returns:
| Type | Description |
|---|---|
| list[ColumnClassification] | List of |
| list[ColumnClassification] | entity, entity count, and unique entity values. |
transform_df(df, classifications=None)
Replace PII in the DataFrame and set self.result.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame to transform. | required |
| classifications | list[ColumnClassification] \| None | Optional precomputed classifications. If | None |
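The two-step flow described for the class (classify first, then transform with the precomputed classifications) can be sketched as a small wrapper. `classify_then_transform` is a hypothetical helper, not part of the module; it relies only on the `classify_df`, `transform_df`, and `result` members documented on this page. It assumes that `transform_df` uses the classifications it is given, which this page implies but does not spell out.

```python
def classify_then_transform(pii, df):
    """Run the documented two-step flow; return (classifications, result).

    `pii` is expected to be a NemoPII instance (or any object exposing the
    same `classify_df`, `transform_df`, and `result` members) and `df` a
    DataFrame. This helper itself is hypothetical.
    """
    # Step 1: classify columns once so the output can be inspected or logged.
    classifications = pii.classify_df(df)

    # Step 2: hand the precomputed classifications to the transform.
    pii.transform_df(df, classifications=classifications)

    # The transform stores its outcome on the instance as `result`.
    return classifications, pii.result
```

With the library installed, this would be called as `classify_then_transform(NemoPII(), df)`. Keeping the classifications around makes it possible to review which entities were detected before relying on the transformed output.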
classify_config_from_params(config)
Build classification and NER config from PII replacer config.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | PiiReplacerConfig | PII replacer config containing globals for classify and NER. | required |
Returns:
| Type | Description |
|---|---|
| ClassifyConfig | |
build_entity_extractor(clsfy_cfg)
Build a composite entity extractor from classification config.
get_column_classifier()
Return a column classifier backed by the NSS inference endpoint (NSS_INFERENCE_ENDPOINT, NSS_INFERENCE_KEY).
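Because the classifier depends on two environment variables, it can help to check for them before requesting it. The sketch below is a minimal pre-flight check using only the standard library. `missing_inference_env` is a hypothetical helper. Whether both variables are strictly required, and how the library behaves when one is absent, are assumptions not confirmed by this page.

```python
import os
from collections.abc import Mapping

# Variable names as given in the description above.
INFERENCE_ENV_VARS = ("NSS_INFERENCE_ENDPOINT", "NSS_INFERENCE_KEY")


def missing_inference_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of the inference variables that are unset or empty."""
    return [name for name in INFERENCE_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_inference_env()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("Inference environment looks configured.")
```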