anonymizer_config
Classes:
| Name | Description |
|---|---|
| AnonymizerInput | Input source definition for the anonymizer pipeline. |
| Detect | Configuration for the entity detection stage. |
| Rewrite | Configuration for rewrite-mode execution. |
| AnonymizerConfig | Primary user-facing config for anonymization behavior. |
Functions:
| Name | Description |
|---|---|
| is_remote_input_source | Return True when the input source is an HTTP(S) URL. |
| infer_input_source_suffix | Infer the lowercase file suffix from a local path or remote URL path. |
AnonymizerInput
pydantic-model
Bases: BaseModel
Input source definition for the anonymizer pipeline.
Format is inferred from the file extension of a local path or HTTP(S) URL.
Fields:
- source (str)
- text_column (str)
- id_column (str | None)
- data_summary (str | None)
Validators:
- validate_source_path → source
source
pydantic-field
Local path or HTTP(S) URL for a .csv or .parquet input file.
text_column = 'text'
pydantic-field
Column containing the text to anonymize.
id_column = None
pydantic-field
Optional column to use as record identifier.
data_summary = None
pydantic-field
Short description of the data. Improves LLM detection accuracy.
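As a rough illustration of how the fields above fit together, the following stdlib-only sketch mirrors the documented defaults with a dataclass and rejects sources whose suffix is not .csv or .parquet. The class name `InputSketch` and the helper `_suffix` are hypothetical, and the suffix check is only an assumption about what validate_source_path enforces; the real model is a pydantic BaseModel whose validator body is not shown on this page.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse

# Formats named in the `source` field description.
SUPPORTED_SUFFIXES = {".csv", ".parquet"}


def _suffix(value: str) -> str:
    """Hypothetical helper: lowercase suffix of a local path or HTTP(S) URL path."""
    parsed = urlparse(value)
    path = parsed.path if parsed.scheme in {"http", "https"} else value
    return Path(path).suffix.lower()


@dataclass
class InputSketch:
    """Stand-in for AnonymizerInput; not the library's class."""

    source: str
    text_column: str = "text"
    id_column: Optional[str] = None
    data_summary: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumed behavior: reject anything that is not a .csv or .parquet file.
        if _suffix(self.source) not in SUPPORTED_SUFFIXES:
            raise ValueError(f"unsupported input format: {self.source!r}")


local = InputSketch(source="data/notes.csv", data_summary="Clinical visit notes")
remote = InputSketch(
    source="https://example.com/exports/notes.parquet",
    id_column="note_id",
)
```

Only `source` is required in this sketch; the other three fields fall back to the defaults listed above.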
Detect
pydantic-model
Bases: BaseModel
Configuration for the entity detection stage.
Fields:
- entity_labels (list[str] | None)
- gliner_threshold (float)
Validators:
- validate_entity_labels → entity_labels
Rewrite
pydantic-model
Bases: BaseModel
Configuration for rewrite-mode execution.
Fields:
- privacy_goal (PrivacyGoal | None)
- instructions (str | None)
- risk_tolerance (RiskTolerance)
- max_repair_iterations (int)
Validators:
- populate_default_privacy_goal
privacy_goal = None
pydantic-field
Structured privacy goal. Auto-populated with defaults if not provided.
instructions = None
pydantic-field
Additional instructions for the rewrite LLM.
risk_tolerance = RiskTolerance.low
pydantic-field
Preset controlling repair thresholds and review flagging.
max_repair_iterations = 2
pydantic-field
Maximum repair rounds. Set to 0 to disable repair.
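To illustrate what a bounded repair budget means, here is a minimal sketch of a loop that stops after max_repair_iterations rounds and skips repair entirely when the budget is 0. The function `run_repair` and its `needs_repair` and `repair` callables are hypothetical; the engine's actual repair logic is not documented on this page.

```python
from typing import Callable, Tuple


def run_repair(
    text: str,
    needs_repair: Callable[[str], bool],
    repair: Callable[[str], str],
    max_repair_iterations: int = 2,
) -> Tuple[str, int]:
    """Apply `repair` until the text passes or the budget is spent.

    Returns the final text and the number of repair rounds used.
    """
    rounds = 0
    while rounds < max_repair_iterations and needs_repair(text):
        text = repair(text)
        rounds += 1
    return text, rounds


# Toy stand-ins: a "leak" is any remaining occurrence of the name "Alice".
def leaks(t: str) -> bool:
    return "Alice" in t


def redact_one(t: str) -> str:
    # Redacts a single occurrence per round, so several rounds may be needed.
    return t.replace("Alice", "[NAME]", 1)


draft = "Alice met Alice and Alice."
with_budget = run_repair(draft, leaks, redact_one, max_repair_iterations=2)
no_repair = run_repair(draft, leaks, redact_one, max_repair_iterations=0)
```

With a budget of 2, the toy text still contains one leak after the loop ends. In this sketch it would be the caller's job to decide what to do with such a record.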
evaluation
property
Internal: construct EvaluationCriteria for the engine.
AnonymizerConfig
pydantic-model
Bases: BaseModel
Primary user-facing config for anonymization behavior.
Fields:
Validators:
- validate_exactly_one_mode
is_remote_input_source(value)
Return True when the input source is an HTTP(S) URL.
Source code in src/anonymizer/config/anonymizer_config.py
from urllib.parse import urlparse

def is_remote_input_source(value: str) -> bool:
    """Return True when the input source is an HTTP(S) URL."""
    parsed = urlparse(value)
    return parsed.scheme in {"http", "https"}
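A short usage sketch may help show which inputs count as remote. The function body is copied from the source above so that the block runs on its own; the example URLs and paths are made up for illustration.

```python
from urllib.parse import urlparse


def is_remote_input_source(value: str) -> bool:
    """Return True when the input source is an HTTP(S) URL."""
    parsed = urlparse(value)
    return parsed.scheme in {"http", "https"}


examples = [
    "https://example.com/data.csv",
    "HTTP://EXAMPLE.COM/data.csv",  # urlparse lowercases the scheme
    "data/input.parquet",           # local relative path, no scheme
    "s3://bucket/data.csv",         # a URL, but not HTTP(S)
]
results = {value: is_remote_input_source(value) for value in examples}
```

Only the first two entries map to True. Other URL schemes such as `s3://` are treated the same as local paths by this check.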
infer_input_source_suffix(value)
Infer the lowercase file suffix from a local path or remote URL path.
Source code in src/anonymizer/config/anonymizer_config.py
from pathlib import Path
from urllib.parse import urlparse

def infer_input_source_suffix(value: str) -> str:
    """Infer the lowercase file suffix from a local path or remote URL path."""
    if is_remote_input_source(value):
        return Path(urlparse(value).path).suffix.lower()
    return Path(value).suffix.lower()
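The following sketch exercises the suffix inference on a few made-up inputs. Both function bodies are copied from the source shown on this page so that the block is self-contained; the example paths and URL are assumptions for illustration only.

```python
from pathlib import Path
from urllib.parse import urlparse


def is_remote_input_source(value: str) -> bool:
    """Return True when the input source is an HTTP(S) URL."""
    parsed = urlparse(value)
    return parsed.scheme in {"http", "https"}


def infer_input_source_suffix(value: str) -> str:
    """Infer the lowercase file suffix from a local path or remote URL path."""
    if is_remote_input_source(value):
        return Path(urlparse(value).path).suffix.lower()
    return Path(value).suffix.lower()


local_suffix = infer_input_source_suffix("data/Input.CSV")
# The query string is not part of the URL path, so it does not affect the suffix.
remote_suffix = infer_input_source_suffix(
    "https://example.com/exports/records.PARQUET?version=2"
)
# Only the final suffix is returned for multi-part extensions.
archive_suffix = infer_input_source_suffix("backups/archive.tar.gz")
missing_suffix = infer_input_source_suffix("data/no_extension")
```

For remote sources only the URL path is inspected, while local values are passed to `Path` unchanged. A source without an extension yields an empty string.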