Models

Anonymizer uses LLMs for entity detection, replacement, and rewriting. Models are configured via YAML and mapped to workflow roles.

Defaults

By default, Anonymizer uses models hosted on build.nvidia.com. Set your API key so it can reach them:

export NVIDIA_API_KEY="your-nvidia-api-key"
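
Before constructing Anonymizer, it can help to confirm that the key is visible to the Python process. The following is a minimal sketch using only the standard library; `missing_keys` is a hypothetical helper for illustration, not part of Anonymizer.

```python
import os


def missing_keys(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Hypothetical preflight check: the defaults only need NVIDIA_API_KEY.
missing = missing_keys(["NVIDIA_API_KEY"])
if missing:
    print("Set these environment variables first:", ", ".join(missing))
```

The same helper can be given a longer list of variable names if you configure additional providers.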

Provider data handling

Anonymizer sends prompts and text snippets to your configured model provider. If data must stay in your trusted environment, use a trusted/local provider endpoint.
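
If data must stay in a trusted environment, one option is to check each provider endpoint against an allowlist before handing it to Anonymizer. This is only a sketch of that idea: `is_trusted_endpoint`, `TRUSTED_HOSTS`, and the host names in it are hypothetical and not part of Anonymizer.

```python
from urllib.parse import urlparse


def is_trusted_endpoint(endpoint, trusted_hosts):
    """True if the endpoint's hostname is in the allowlist of trusted hosts."""
    host = urlparse(endpoint).hostname
    return host is not None and host in trusted_hosts


# Hypothetical allowlist: a local inference server and an internal gateway.
TRUSTED_HOSTS = {"localhost", "127.0.0.1", "llm.internal.example"}

print(is_trusted_endpoint("http://localhost:8000/v1", TRUSTED_HOSTS))   # True
print(is_trusted_endpoint("https://api.openai.com/v1", TRUSTED_HOSTS))  # False
```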

Alias	Model	Used by
`gliner-pii-detector`	`nvidia/gliner-pii`	Entity detection (NER)
`gpt-oss-120b`	`openai/gpt-oss-120b`	Detection validation & augmentation, replacement, replace evaluation, rewriting
`nemotron-30b-thinking`	`nvidia/nemotron-3-nano-30b-a3b`	Latent detection, rewrite evaluation, final judge

Each pipeline stage has a role mapped to one of these aliases. See the full role list in the default configs: detection.yaml, replace.yaml, rewrite.yaml.

Custom providers

Use model_providers to define named API endpoints for model hosts such as OpenAI or OpenRouter.

Set your API keys first:

export NVIDIA_API_KEY="your-nvidia-api-key"  # Used by the nvidia provider (build.nvidia.com)
export OPENAI_API_KEY="your-openai-api-key"
export OPENROUTER_API_KEY="your-openrouter-api-key"


Define providers in a YAML file and pass the path to Anonymizer.

providers:
  - name: nvidia
    endpoint: https://integrate.api.nvidia.com/v1
  - name: openai
    endpoint: https://api.openai.com/v1
  - name: openrouter
    endpoint: https://openrouter.ai/api/v1

from anonymizer import Anonymizer

anonymizer = Anonymizer(model_providers="my_providers.yaml")

Construct ModelProvider objects directly in code.

import os
from anonymizer import Anonymizer, ModelProvider

providers = [
    ModelProvider(
        name="openai",
        endpoint="https://api.openai.com/v1",
        api_key=os.environ["OPENAI_API_KEY"],
    ),
    ModelProvider(
        name="openrouter",
        endpoint="https://openrouter.ai/api/v1",
        api_key=os.environ["OPENROUTER_API_KEY"],
    ),
]

anonymizer = Anonymizer(model_providers=providers)

After defining providers, reference them from your model configs as described below.

Custom models

Override specific roles by passing the path to a unified YAML file as Anonymizer(model_configs=...). Each provider field references, by name, one of the custom providers defined above.

# my_models.yaml
selected_models:
  detection:
    entity_detector: gliner-pii-detector
    entity_validator: gpt5
    entity_augmenter: gpt5
    latent_detector: claude-sonnet
  replace:
    replacement_generator: gpt5

model_configs:
  - alias: gliner-pii-detector
    model: nvidia/gliner-pii
    provider: nvidia
    inference_parameters:
      max_parallel_requests: 16
      timeout: 120
  - alias: gpt5
    model: gpt-5
    provider: openai
    inference_parameters:
      max_tokens: 4096
      temperature: 0.3
  - alias: claude-sonnet
    model: anthropic/claude-sonnet-4
    provider: openrouter
    inference_parameters:
      max_tokens: 8192
      temperature: 0.3

from anonymizer import Anonymizer

anonymizer = Anonymizer(
    model_configs="my_models.yaml",
    # A providers YAML path, or a list of ModelProvider objects built in code.
    model_providers="my_providers.yaml",
)
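
Because each provider field is resolved by name, a quick cross-check can catch a model config that points at a provider you did not define. The sketch below works on plain dictionaries that mirror the YAML in this section; `unknown_providers` is a hypothetical helper. Whether Anonymizer adds its built-in nvidia provider automatically is not stated here, so the second call only shows what a strict check would report.

```python
def unknown_providers(model_configs, provider_names):
    """Return (alias, provider) pairs whose provider is not defined."""
    known = set(provider_names)
    return [
        (cfg["alias"], cfg["provider"])
        for cfg in model_configs
        if cfg["provider"] not in known
    ]


# Mirrors the alias/provider pairs from my_models.yaml above.
model_configs = [
    {"alias": "gliner-pii-detector", "provider": "nvidia"},
    {"alias": "gpt5", "provider": "openai"},
    {"alias": "claude-sonnet", "provider": "openrouter"},
]

print(unknown_providers(model_configs, ["nvidia", "openai", "openrouter"]))  # []
print(unknown_providers(model_configs, ["openai", "openrouter"]))
# [('gliner-pii-detector', 'nvidia')]
```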

You can pass model_configs as either a YAML file path or a YAML string.

Roles you don't override keep their default alias selections, but those aliases must still exist in your model_configs pool.
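
The rule above can be pictured as overlaying your overrides on the default selections and then checking every resulting alias against the pool. This is a sketch under stated assumptions: the default role-to-alias mapping below is inferred from the alias table in the Defaults section, and both helper functions are hypothetical.

```python
def effective_selection(defaults, overrides):
    """Overlay per-role overrides on the default role-to-alias mapping."""
    merged = dict(defaults)
    merged.update(overrides)
    return merged


def missing_aliases(selection, pool_aliases):
    """Return {role: alias} for roles whose alias is absent from the pool."""
    pool = set(pool_aliases)
    return {role: alias for role, alias in selection.items() if alias not in pool}


# Assumed defaults; the authoritative role list lives in detection.yaml,
# replace.yaml, and rewrite.yaml.
defaults = {
    "entity_detector": "gliner-pii-detector",
    "entity_validator": "gpt-oss-120b",
    "entity_augmenter": "gpt-oss-120b",
    "latent_detector": "nemotron-30b-thinking",
}
overrides = {"entity_validator": "gpt5", "entity_augmenter": "gpt5"}
pool = ["gliner-pii-detector", "gpt5"]

selection = effective_selection(defaults, overrides)
print(missing_aliases(selection, pool))
# {'latent_detector': 'nemotron-30b-thinking'}
```

In this example latent_detector was not overridden, so its default alias would still need an entry in model_configs.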

Validate your config

Use anonymizer.validate_config(config) (or anonymizer validate from the CLI) after changing model configs to catch alias mismatches before processing data.

Validator pools

entity_validator accepts either a single alias (shown above) or a list of aliases. A list forms a validator pool with two jobs:

Load spreading. Chunked validation dispatches each chunk to the next alias in round-robin order. When a single alias would hit its provider's tokens-per-minute or requests-per-minute quota, this aggregates quota across equivalent endpoints.
Failover. If a chunk's assigned alias can't complete the call (an unrecoverable rate limit, a 5xx that didn't clear on retry, or a malformed response), the chunk is automatically retried against the other aliases in the pool. A row is dropped only when every alias in the pool has failed for the same chunk. A single-alias pool has nothing to fall back to, so it behaves the same as a scalar alias.
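
The two behaviors can be sketched in a few lines of Python. This illustrates the idea only and is not Anonymizer's implementation: `call` and `flaky_call` are hypothetical stand-ins for a validator request, and the failover order (remaining aliases in pool order) is an assumption.

```python
def assign_round_robin(num_chunks, pool):
    """Assign chunk i to pool[i % len(pool)]."""
    return [pool[i % len(pool)] for i in range(num_chunks)]


def validate_with_failover(chunk, assigned, pool, call):
    """Try the assigned alias first, then the remaining aliases in pool order.

    `call(alias, chunk)` returns a result or raises on an unrecoverable
    failure. Returns (alias, result), or None when every alias failed and
    the row would be dropped.
    """
    order = [assigned] + [alias for alias in pool if alias != assigned]
    for alias in order:
        try:
            return alias, call(alias, chunk)
        except Exception:
            continue  # fall through to the next alias in the pool
    return None


def flaky_call(alias, chunk):
    # Stand-in for a validator request: the primary alias always fails here.
    if alias == "gpt5-primary":
        raise RuntimeError("rate limited")
    return f"validated {chunk}"


pool = ["gpt5-primary", "gpt5-secondary"]
chunks = ["chunk-0", "chunk-1", "chunk-2"]

for chunk, alias in zip(chunks, assign_round_robin(len(chunks), pool)):
    print(chunk, "->", validate_with_failover(chunk, alias, pool, flaky_call))
```

Here every chunk ends up validated by gpt5-secondary, including the two chunks that were first assigned to gpt5-primary.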

selected_models:
  detection:
    entity_detector: gliner-pii-detector
    entity_validator:
      - gpt5-primary
      - gpt5-secondary
    entity_augmenter: gpt5-primary
    latent_detector: claude-sonnet

Every alias in the pool must also appear in model_configs; anonymizer validate flags unknown aliases by index. A scalar value remains valid and is equivalent to a one-element list.
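
As an illustration of the scalar-versus-list equivalence and of index-based reporting, here is a small sketch. `as_pool` and `unknown_pool_aliases` are hypothetical helpers, and the output format of anonymizer validate itself is not shown in this section.

```python
def as_pool(value):
    """Normalize a scalar alias or a list of aliases to a list."""
    return [value] if isinstance(value, str) else list(value)


def unknown_pool_aliases(value, known_aliases):
    """Return (index, alias) pairs for pool entries missing from model_configs."""
    known = set(known_aliases)
    return [(i, a) for i, a in enumerate(as_pool(value)) if a not in known]


known = ["gliner-pii-detector", "gpt5-primary", "claude-sonnet"]

print(as_pool("gpt5-primary"))  # ['gpt5-primary']
print(unknown_pool_aliases(["gpt5-primary", "gpt5-secondary"], known))
# [(1, 'gpt5-secondary')]
```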

max_parallel_requests is enforced per alias

A pool with N aliases allows up to sum(max_parallel_requests for alias in pool) concurrent validator calls per row when the row is split into chunks. Budget your provider rate limits accordingly: pooling exists to multiply in-flight requests, and that multiplication counts against your quotas.
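
To make the budget concrete, the sum can be computed from the same fields that appear in model_configs. This is a sketch with hypothetical limits; `pool_concurrency` is not an Anonymizer function.

```python
def pool_concurrency(pool, model_configs):
    """Sum max_parallel_requests over the aliases in a validator pool."""
    limits = {
        cfg["alias"]: cfg["inference_parameters"]["max_parallel_requests"]
        for cfg in model_configs
    }
    return sum(limits[alias] for alias in pool)


# Hypothetical limits for the two pooled aliases.
model_configs = [
    {"alias": "gpt5-primary", "inference_parameters": {"max_parallel_requests": 8}},
    {"alias": "gpt5-secondary", "inference_parameters": {"max_parallel_requests": 8}},
]

print(pool_concurrency(["gpt5-primary", "gpt5-secondary"], model_configs))  # 16
```

If both aliases draw on the same provider account, those 16 concurrent calls count against a single quota.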

Pool aliases should target equivalent models (same model family, similar quality). Mixing heterogeneous models produces inconsistent validation across chunks in the same row and is almost always a misconfiguration.

Choosing custom models

For Anonymizer, the best overall leaderboard model is not always the best default for every role. Some roles are simple classification or constrained JSON generation tasks, while others require deeper reasoning about privacy risk, long-context rewriting, and leakage repair (see Risk tolerance).

Use benchmarks as signals for role fit, not as a single global ranking.

Most useful benchmark signals

Benchmark	What it predicts well in Anonymizer
`IFBench`	Following detailed instructions, producing constrained outputs, and obeying prompt rules.
`AA-Omniscience Accuracy`	Recovering the right facts without dropping important information.
`AA-Omniscience Non-Hallucination`	Avoiding invented entities, facts, or unsupported claims.
`AA-LCR`	Handling long prompts with tagged text, domain guidance, replacement maps, and evaluation context.
`Humanity's Last Exam` / `GPQA Diamond`	General reasoning depth for privacy-sensitive planning and rewriting.

Also consider operational constraints like latency, output speed, and verbosity, since they drive cost and practical throughput.

Practical guidance

Use your strongest models for latent_detector, disposition_analyzer, rewriter, and repairer.
Use mid-tier models for entity_augmenter, meaning_extractor, and replacement_generator.
Use smaller or faster models for entity_validator, domain_classifier, qa_generator, and evaluator.
Do not optimize every role for peak leaderboard rank. Optimize the hard-to-recover privacy and rewrite steps for quality. Optimize bounded steps for reliability per token.
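
One way to apply the guidance above is to pick one alias per tier and expand it into per-role selections. The sketch below is illustrative only: the tier names, `select_by_tier`, and the gpt5-mini alias are hypothetical, and the resulting flat mapping would still need to be placed under the matching workflow sections of selected_models.

```python
# Role groupings taken from the guidance above; tier names are made up here.
ROLE_TIERS = {
    "strong": ["latent_detector", "disposition_analyzer", "rewriter", "repairer"],
    "mid": ["entity_augmenter", "meaning_extractor", "replacement_generator"],
    "fast": ["entity_validator", "domain_classifier", "qa_generator", "evaluator"],
}


def select_by_tier(tier_aliases):
    """Expand {tier: alias} into a {role: alias} mapping."""
    return {
        role: tier_aliases[tier]
        for tier, roles in ROLE_TIERS.items()
        for role in roles
    }


# Hypothetical aliases; each must also be defined in model_configs.
selection = select_by_tier(
    {"strong": "claude-sonnet", "mid": "gpt5", "fast": "gpt5-mini"}
)
print(selection["rewriter"], selection["entity_validator"])  # claude-sonnet gpt5-mini
```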