PII Detection

Personally Identifiable Information (PII) detection helps protect user privacy by detecting and masking sensitive data in user inputs, LLM outputs, and retrieved content.

GLiNER-based PII Detection

The NeMo Guardrails library supports various PII detection models.

To activate PII detection, set the server endpoint of the PII detection model in your config.yml and specify the entities that you want to detect and mask. For example, the following configuration uses the GLiNER PII detection model, with the GLiNER server endpoint at http://localhost:1235/v1/extract:

PII detection config

The detection flow blocks the input, output, and retrieval text if it detects PII.

rails:
  config:
    gliner:
      server_endpoint: http://localhost:1235/v1/extract
      threshold: 0.5  # Confidence threshold (0.0 to 1.0)
      input:
        entities:  # If no entity is specified, all default PII categories are detected
          - email
          - phone_number
          - ssn
          - first_name
          - last_name
      output:
        entities:
          - email
          - phone_number
          - credit_debit_card
  input:
    flows:
      - gliner detect pii on input
  output:
    flows:
      - gliner detect pii on output
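
The interplay between the `entities` list and `threshold` in the configuration above can be illustrated with a minimal Python sketch. This is not the library's implementation: the `Detection` class, the `should_block` function, the sample scores, and the choice of `>=` for the threshold comparison are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical shape of one detected entity (not the GLiNER server's schema)."""
    label: str    # entity label, e.g. "email"
    text: str     # matched substring
    score: float  # model confidence in [0.0, 1.0]


def should_block(detections, entities, threshold=0.5):
    """Return True if any configured entity is detected at or above the threshold.

    An empty `entities` list is treated as "accept every label", mirroring the
    config comment that all default PII categories apply when none is listed.
    """
    allowed = set(entities)
    return any(
        d.score >= threshold and (not allowed or d.label in allowed)
        for d in detections
    )


# Made-up detections for illustration only.
detections = [
    Detection("email", "john@example.com", 0.93),
    Detection("first_name", "John", 0.41),
    Detection("city", "Paris", 0.88),
]
input_entities = ["email", "phone_number", "ssn", "first_name", "last_name"]

print(should_block(detections, input_entities))      # True: email clears the threshold
print(should_block(detections[1:], input_entities))  # False: low score, unlisted label
```

In this sketch, a detection triggers blocking only when it is both confident enough and listed under the relevant `entities` key, which is why the input and output rails can carry different entity lists.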

PII masking config

The masking flow replaces detected PII with labels. For example, Hi John, my email is john@example.com becomes Hi [FIRST_NAME], my email is [EMAIL].

rails:
  config:
    gliner:
      server_endpoint: http://localhost:1235/v1/extract
      input:
        entities:
          - email
          - first_name
          - last_name
      output:
        entities:
          - email
          - first_name
          - last_name
  input:
    flows:
      - gliner mask pii on input
  output:
    flows:
      - gliner mask pii on output
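
The label replacement described above can be sketched in a few lines of Python. This is an illustrative assumption about how span-based masking works in general, not the library's code: `mask_pii`, `find_span`, and the `(start, end, label)` span format are hypothetical names chosen for this example.

```python
def find_span(text, needle, label):
    """Locate the first occurrence of `needle` and return (start, end, label)."""
    start = text.index(needle)
    return (start, start + len(needle), label)


def mask_pii(text, spans):
    """Replace each (start, end, label) span with an upper-case [LABEL] tag."""
    # Work from right to left so that earlier offsets remain valid
    # after each replacement changes the string length.
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"[{label.upper()}]" + text[end:]
    return text


text = "Hi John, my email is john@example.com"
spans = [
    find_span(text, "John", "first_name"),
    find_span(text, "john@example.com", "email"),
]

print(mask_pii(text, spans))  # Hi [FIRST_NAME], my email is [EMAIL]
```

Replacing spans from the end of the string backwards avoids recomputing offsets, because each substitution only shifts text that has already been processed.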

For a detailed example, please refer to the GLiNER Integration page.

Presidio-based Sensitive Data Detection

The NeMo Guardrails library supports detecting sensitive data out of the box using Presidio, which provides fast identification and anonymization modules for private entities in text, such as credit card numbers, names, locations, social security numbers, bitcoin wallets, US phone numbers, and financial data. You can detect sensitive data in user input, bot output, or the relevant chunks retrieved from the knowledge base.

To activate a sensitive data detection input rail, you have to configure the entities that you want to detect:

rails:
  config:
    sensitive_data_detection:
      input:
        entities:
          - PERSON
          - EMAIL_ADDRESS
          - ...

Example usage

rails:
  input:
    flows:
      - mask sensitive data on input
  output:
    flows:
      - mask sensitive data on output
  retrieval:
    flows:
      - mask sensitive data on retrieval

For more details, check out the Presidio Integration page.

Private AI PII Detection

The NeMo Guardrails library supports using the Private AI API for PII detection and masking in input, output, and retrieval flows.

To activate PII detection or masking, you need to specify server_endpoint and the entities that you want to detect or mask. You also need to set the PAI_API_KEY environment variable if you are using the Private AI cloud API.

rails:
  config:
    privateai:
      server_endpoint: http://your-privateai-api-endpoint/process/text  # Replace this with your Private AI process text endpoint
      input:
        entities:  # If no entity is specified here, all supported entities will be detected by default.
          - NAME_FAMILY
          - EMAIL_ADDRESS
          - ...
      output:
        entities:
          - NAME_FAMILY
          - EMAIL_ADDRESS
          - ...
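
Before starting an application, it can help to confirm that the two prerequisites named above are in place. The following is a minimal sketch, not part of the library: `missing_privateai_settings` and its `uses_cloud_api` flag are hypothetical, and the configuration is assumed to be already loaded into a plain dictionary that mirrors the YAML structure.

```python
import os


def missing_privateai_settings(rails_config, uses_cloud_api, environ=None):
    """Return a list of missing prerequisites; an empty list means none were found."""
    environ = os.environ if environ is None else environ
    privateai = rails_config.get("rails", {}).get("config", {}).get("privateai", {})

    problems = []
    if not privateai.get("server_endpoint"):
        problems.append("rails.config.privateai.server_endpoint is not set")
    # The API key is only required when talking to the Private AI cloud API.
    if uses_cloud_api and not environ.get("PAI_API_KEY"):
        problems.append("PAI_API_KEY is not set")
    return problems


# Dictionary form of the configuration shown above (entities abbreviated).
config = {
    "rails": {
        "config": {
            "privateai": {
                "server_endpoint": "http://your-privateai-api-endpoint/process/text",
                "input": {"entities": ["NAME_FAMILY", "EMAIL_ADDRESS"]},
                "output": {"entities": ["NAME_FAMILY", "EMAIL_ADDRESS"]},
            }
        }
    }
}

# An empty mapping stands in for an environment without PAI_API_KEY.
print(missing_privateai_settings(config, uses_cloud_api=True, environ={}))
```

Passing `environ` explicitly keeps the check easy to exercise in tests; omitting it falls back to the real process environment.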

Example usage

PII detection

rails:
  input:
    flows:
      - detect pii on input
  output:
    flows:
      - detect pii on output
  retrieval:
    flows:
      - detect pii on retrieval

For more details, check out the Private AI Integration page.