rewrite
Classes:
| Name | Description |
|---|---|
| RiskTolerance | Risk tolerance presets for leakage mass thresholds. |
| PrivacyGoal | Structured privacy and utility goal for rewrite mode. |
| EvaluationCriteria | Criteria for privacy leakage evaluation and repair. |
RiskTolerance
Bases: str, Enum
Risk tolerance presets for leakage mass thresholds.
Each preset bundles a coherent set of repair and review thresholds:
- minimal — Tight leakage threshold (0.6), flags for review aggressively. Good for medical, legal, and financial data.
- low — Default. Moderate leakage threshold (1.0). Good for most privacy-sensitive data.
- moderate — Relaxed leakage threshold (1.5), lower review bar.
- high — High leakage threshold (2.0), does not auto-repair individual high-sensitivity leaks.
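As a rough illustration of how the presets relate to the leakage thresholds listed above, the sketch below re-creates the enum and pairs each member with its threshold. The member string values and the `LEAKAGE_THRESHOLDS` lookup table are assumptions made for this example; the library resolves thresholds internally and may organize them differently.

```python
from enum import Enum


class RiskTolerance(str, Enum):
    """Illustrative re-creation; the member string values are assumed."""

    minimal = "minimal"
    low = "low"
    moderate = "moderate"
    high = "high"


# Leakage thresholds as listed in the preset descriptions.
# The lookup table itself is hypothetical, not part of the library.
LEAKAGE_THRESHOLDS = {
    RiskTolerance.minimal: 0.6,
    RiskTolerance.low: 1.0,
    RiskTolerance.moderate: 1.5,
    RiskTolerance.high: 2.0,
}

# Because the enum subclasses str, a plain string parses into a member.
preset = RiskTolerance("minimal")
threshold = LEAKAGE_THRESHOLDS[preset]
```

Subclassing both `str` and `Enum` is what lets a preset be supplied as a plain string in configuration files and compared directly against string values.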
PrivacyGoal
pydantic-model
Bases: BaseModel
Structured privacy and utility goal for rewrite mode.
Fields:
Validators:
protect
pydantic-field
What to protect (e.g. direct identifiers, quasi-identifiers).
preserve
pydantic-field
What to preserve (e.g. utility, semantic meaning).
to_prompt_string()
Serialize goal into prompt-ready text.
Source code in src/anonymizer/config/rewrite.py
def to_prompt_string(self) -> str:
    """Serialize goal into prompt-ready text."""
    return f"PROTECT: {self.protect}\nPRESERVE: {self.preserve}"
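To show what the serialized goal looks like, here is a minimal sketch that uses a dataclass stand-in, so it runs without the package installed. `PrivacyGoalSketch` is a hypothetical name, and typing both fields as `str` is an assumption; the real `PrivacyGoal` is a pydantic model and may add validation not shown here.

```python
from dataclasses import dataclass


@dataclass
class PrivacyGoalSketch:
    """Hypothetical stand-in for PrivacyGoal (the real class is a pydantic model)."""

    protect: str   # assumed to be free text
    preserve: str  # assumed to be free text

    def to_prompt_string(self) -> str:
        # Same format string as the documented method.
        return f"PROTECT: {self.protect}\nPRESERVE: {self.preserve}"


goal = PrivacyGoalSketch(
    protect="direct identifiers such as names and addresses",
    preserve="clinical meaning of the note",
)
prompt_text = goal.to_prompt_string()
```

The result is a two-line string, with the `PROTECT:` line first and the `PRESERVE:` line second.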
EvaluationCriteria
pydantic-model
Bases: BaseModel
Criteria for privacy leakage evaluation and repair.
risk_tolerance controls the leakage threshold that triggers repair,
whether individual high-sensitivity leaks trigger repair, and the
thresholds for flagging records for human review. See RiskTolerance
for preset descriptions.
max_repair_iterations caps the number of repair rounds attempted;
each round makes one LLM call per failing row. Set it to 0 to disable
repair while still producing evaluation metrics.
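The interaction between the repair threshold and the iteration cap can be sketched as a bounded loop. Everything here is illustrative: `run_repair_loop`, `evaluate`, and `repair` are hypothetical names, and the toy evaluator simply counts `@` characters. The library's actual control flow is not shown in this reference and may differ.

```python
from typing import Callable


def run_repair_loop(
    text: str,
    evaluate: Callable[[str], float],  # hypothetical: returns leakage mass
    repair: Callable[[str], str],      # hypothetical: one LLM repair call
    repair_threshold: float,
    max_repair_iterations: int,
) -> tuple[str, float, int]:
    """Return (final text, final leakage mass, repair rounds used)."""
    leakage = evaluate(text)  # metrics are produced even when repair is off
    rounds = 0
    while leakage > repair_threshold and rounds < max_repair_iterations:
        text = repair(text)
        leakage = evaluate(text)
        rounds += 1
    return text, leakage, rounds


# Toy stand-ins: each "@" counts as 1.0 leakage; repair removes one "@".
def toy_evaluate(s: str) -> float:
    return float(s.count("@"))


def toy_repair(s: str) -> str:
    return s.replace("@", "", 1)


repaired = run_repair_loop("a@b@c@d", toy_evaluate, toy_repair, 1.0, 3)
disabled = run_repair_loop("a@b@c@d", toy_evaluate, toy_repair, 1.0, 0)
```

With a cap of 0 the loop body never runs, so the row is still evaluated but left unchanged, which matches the documented way to disable repair.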
Fields:
risk_tolerance = RiskTolerance.low
pydantic-field
Preset controlling repair and review thresholds.
max_repair_iterations = 3
pydantic-field
Maximum repair rounds. Set to 0 to disable repair.
repair_threshold
property
Leakage mass above which a row is sent for repair.
repair_any_high_leak
property
Whether any single high-sensitivity leak triggers repair.
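One plausible way the two repair triggers combine is a simple OR, sketched below. `needs_repair` is a hypothetical helper, and combining the triggers this way is an assumption; the reference only states that each property exists.

```python
def needs_repair(
    leakage_mass: float,
    has_high_sensitivity_leak: bool,
    repair_threshold: float,
    repair_any_high_leak: bool,
) -> bool:
    """Hypothetical helper combining the two documented repair triggers."""
    if leakage_mass > repair_threshold:
        return True
    # A single high-sensitivity leak only counts when the preset enables it.
    return repair_any_high_leak and has_high_sensitivity_leak


over_threshold = needs_repair(1.2, False, repair_threshold=1.0, repair_any_high_leak=True)
single_high_leak = needs_repair(0.4, True, repair_threshold=1.0, repair_any_high_leak=True)
relaxed_preset = needs_repair(0.4, True, repair_threshold=2.0, repair_any_high_leak=False)
```

The last call mirrors the `high` preset, which is described as not auto-repairing individual high-sensitivity leaks.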
flag_utility_below
property
Flag for human review if utility score is below this.
flag_leakage_above
property
Flag for human review if leakage mass exceeds this.
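The two review thresholds can be read as independent checks on a row, as in this small sketch. `needs_human_review` is a hypothetical helper, and the numeric thresholds in the calls are placeholders, since the per-preset values are not listed here.

```python
def needs_human_review(
    utility_score: float,
    leakage_mass: float,
    flag_utility_below: float,
    flag_leakage_above: float,
) -> bool:
    """Hypothetical helper: flag a row when either review check fails."""
    return utility_score < flag_utility_below or leakage_mass > flag_leakage_above


# Placeholder thresholds (0.5 and 2.0), chosen only for illustration.
low_utility = needs_human_review(0.3, 0.5, flag_utility_below=0.5, flag_leakage_above=2.0)
high_leakage = needs_human_review(0.9, 2.5, flag_utility_below=0.5, flag_leakage_above=2.0)
clean_row = needs_human_review(0.9, 0.5, flag_utility_below=0.5, flag_leakage_above=2.0)
```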
sensitivity_weights
property
Weights for high/medium/low sensitivity levels in leakage mass computation.
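A weighted sum is one natural reading of how sensitivity weights feed into leakage mass. The sketch below assumes that reading. The weight values, the dictionary shape, and the `leakage_mass` function are all placeholders for illustration; the library's actual weights and formula are not given in this reference.

```python
from collections import Counter

# Placeholder weights; real values come from EvaluationCriteria.sensitivity_weights.
PLACEHOLDER_WEIGHTS = {"high": 1.0, "medium": 0.5, "low": 0.25}


def leakage_mass(leaked_levels: list[str], weights: dict[str, float]) -> float:
    """Sum the weight of every leaked item, grouped by sensitivity level."""
    counts = Counter(leaked_levels)
    return sum((weights[level] * n for level, n in counts.items()), 0.0)


# One high-sensitivity leak plus two low-sensitivity leaks.
mass = leakage_mass(["high", "low", "low"], PLACEHOLDER_WEIGHTS)
no_leaks = leakage_mass([], PLACEHOLDER_WEIGHTS)
```

Under these placeholder weights, a single high-sensitivity leak contributes as much as four low-sensitivity leaks, which is the kind of trade-off the weights are meant to encode.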