replace_strategies
Classes:
| Name | Description |
|---|---|
| ReplaceMethodBase | Shared configuration for all replacement methods. |
| Substitute | Replace entities with LLM-generated synthetic values. |
| Redact | Replace each entity with a configurable redaction template. |
| Annotate | Tag each entity with a readable label token. |
| Hash | Replace each entity with a deterministic hash token. |
ReplaceMethodBase
pydantic-model
Bases: BaseModel
Shared configuration for all replacement methods.
Substitute
pydantic-model
Bases: ReplaceMethodBase
Replace entities with LLM-generated synthetic values.
Fields:
- instructions (str | None)
instructions = None
pydantic-field
Additional instructions for the LLM replacement generator.
Redact
pydantic-model
Bases: ReplaceMethodBase
Replace each entity with a configurable redaction template.
Fields:
- format_template (str)
- normalize_label (bool)
Validators:
- validate_format_template → format_template
format_template = '[REDACTED_{label}]'
pydantic-field
Template with optional {label} placeholder.
normalize_label = True
pydantic-field
Uppercase and clean label before substitution.
replace(text, label)
Apply the redaction template to a single entity occurrence.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Original entity text (e.g. `"Alice"`). | required |
| label | str | Entity label (e.g. `"first_name"`). | required |
Source code in src/anonymizer/config/replace_strategies.py

    def replace(self, text: str, label: str) -> str:
        """Apply the redaction template to a single entity occurrence.

        Args:
            text: Original entity text (e.g. ``"Alice"``).
            label: Entity label (e.g. ``"first_name"``).
        """
        normalized_label = _format_label_for_redaction(label) if self.normalize_label else label
        return self._render_template(
            template=self.format_template,
            text=text,
            label=normalized_label,
        )
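The redaction flow can be approximated without the library. The sketch below is standalone and rests on two assumptions that this page does not confirm: that template rendering behaves like `str.format`, and that label cleaning means uppercasing plus collapsing non-alphanumeric runs into underscores. The names `normalize_label_sketch` and `redact_sketch` are hypothetical stand-ins, not part of the package.

```python
import re


def normalize_label_sketch(label: str) -> str:
    # Assumed cleanup rule (not the library's actual helper): uppercase the
    # label, then collapse every run of non-alphanumeric characters to "_".
    return re.sub(r"[^A-Z0-9]+", "_", label.upper()).strip("_")


def redact_sketch(
    text: str,
    label: str,
    format_template: str = "[REDACTED_{label}]",
    normalize_label: bool = True,
) -> str:
    # Mirror the documented branch: normalize only when the flag is set.
    rendered_label = normalize_label_sketch(label) if normalize_label else label
    # Assumption: rendering is plain str.format; unused keywords are ignored.
    return format_template.format(text=text, label=rendered_label)


print(redact_sketch("Alice", "first_name"))                         # [REDACTED_FIRST_NAME]
print(redact_sketch("Alice", "first_name", normalize_label=False))  # [REDACTED_first_name]
```

With the default template the original text never appears in the output, which is what distinguishes redaction from annotation.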
Annotate
pydantic-model
Bases: ReplaceMethodBase
Tag each entity with a readable label token.
Fields:
- format_template (str)
Validators:
- validate_format_template → format_template
format_template = '<{text}, {label}>'
pydantic-field
Template with {text} and {label} placeholders.
replace(text, label)
Apply the annotation template to a single entity occurrence.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Original entity text (e.g. `"Alice"`). | required |
| label | str | Entity label (e.g. `"first_name"`). | required |
Source code in src/anonymizer/config/replace_strategies.py

    def replace(self, text: str, label: str) -> str:
        """Apply the annotation template to a single entity occurrence.

        Args:
            text: Original entity text (e.g. ``"Alice"``).
            label: Entity label (e.g. ``"first_name"``).
        """
        return self._render_template(
            template=self.format_template,
            text=text,
            label=label,
        )
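As a rough illustration of what the annotation template produces, the standalone sketch below assumes that rendering behaves like `str.format`; the function name `annotate_sketch` is hypothetical. Unlike redaction, the label is passed through unchanged and the original text stays visible.

```python
def annotate_sketch(
    text: str,
    label: str,
    format_template: str = "<{text}, {label}>",
) -> str:
    # Assumption: rendering is plain str.format with the two documented
    # placeholders; the label is used as given, with no normalization.
    return format_template.format(text=text, label=label)


print(annotate_sketch("Alice", "first_name"))
# <Alice, first_name>
print(annotate_sketch("Alice", "first_name", format_template="[{label}: {text}]"))
# [first_name: Alice]
```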
Hash
pydantic-model
Bases: ReplaceMethodBase
Replace each entity with a deterministic hash token.
Fields:
- algorithm (Literal['sha256', 'sha1', 'md5'])
- digest_length (int)
- format_template (str)
Validators:
- validate_format_template → format_template
algorithm = 'sha256'
pydantic-field
Hash algorithm.
digest_length = 12
pydantic-field
Number of hex characters to keep from the hash digest.
format_template = '<HASH_{label}_{digest}>'
pydantic-field
Template with {digest} required and optional {label}.
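The rules that `validate_format_template` enforces are not shown on this page. As a sketch only, a check for the one requirement stated here (a `{digest}` placeholder must be present) could be written with the standard library's `string.Formatter`; the helper names `template_fields` and `check_hash_template` are hypothetical.

```python
from string import Formatter


def template_fields(template: str) -> set[str]:
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for trailing literals and "" for a bare "{}".
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def check_hash_template(template: str) -> str:
    # Hypothetical check: only the documented "{digest} required" rule.
    if "digest" not in template_fields(template):
        raise ValueError("format_template must contain a {digest} placeholder")
    return template


check_hash_template("<HASH_{label}_{digest}>")  # passes, {label} is optional
check_hash_template("<{digest}>")               # passes without {label}
```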
replace(text, label)
Apply the hash template to a single entity occurrence.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Original entity text (e.g. `"Alice"`). | required |
| label | str | Entity label (e.g. `"first_name"`). | required |
Source code in src/anonymizer/config/replace_strategies.py

    def replace(self, text: str, label: str) -> str:
        """Apply the hash template to a single entity occurrence.

        Args:
            text: Original entity text (e.g. ``"Alice"``).
            label: Entity label (e.g. ``"first_name"``).
        """
        digest = hashlib.new(self.algorithm, text.encode("utf-8")).hexdigest()[: self.digest_length]
        return self._render_template(
            template=self.format_template,
            text=text,
            label=label.upper(),
            digest=digest,
        )
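The hashing step can be reproduced with `hashlib` alone. The sketch below follows the same digest computation as the source listing (hash the UTF-8 bytes, keep the first `digest_length` hex characters) and assumes that template rendering behaves like `str.format`; `hash_token_sketch` is a hypothetical name.

```python
import hashlib


def hash_token_sketch(
    text: str,
    label: str,
    algorithm: str = "sha256",
    digest_length: int = 12,
    format_template: str = "<HASH_{label}_{digest}>",
) -> str:
    # Hash the UTF-8 bytes of the entity text and truncate the hex digest.
    digest = hashlib.new(algorithm, text.encode("utf-8")).hexdigest()[:digest_length]
    # Assumption: rendering is plain str.format; the label is uppercased
    # as in the source listing.
    return format_template.format(text=text, label=label.upper(), digest=digest)


first = hash_token_sketch("Alice", "first_name")
second = hash_token_sketch("Alice", "first_name")
print(first == second)  # True: identical text always yields the same token
```

Because the token depends only on the entity text and the settings, repeated mentions of the same entity map to the same token. Shorter `digest_length` values make tokens more compact but raise the chance that two different entities share a token.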