GLiNER Integration#
GLiNER is a generalist and lightweight model for named entity recognition. NVIDIA GLiNER-PII is an adaptation of this base model that can detect a wide range of entity types, including comprehensive PII (Personally Identifiable Information) categories. This integration enables the NeMo Guardrails library to use a GLiNER-compatible server for PII detection and masking in input, output, and retrieval flows.
Server Setup#
Deploy a GLiNER-compatible server.
Refer to the example implementation at examples/deployment/gliner_server/:
cd examples/deployment/gliner_server
# Install with uv (recommended)
uv sync
# Start the server (uses nvidia/gliner-PII model by default)
uv run gliner-server --host 0.0.0.0 --port 1235
# Or install with pip
pip install -e .
gliner-server --host 0.0.0.0 --port 1235
Guardrails Configuration#
Update your config.yml file to include the GLiNER settings:
PII detection config
The detection flow blocks the input, output, and retrieval text if it detects PII.
rails:
  config:
    gliner:
      server_endpoint: http://localhost:1235/v1/extract
      threshold: 0.5  # Confidence threshold (0.0 to 1.0)
      input:
        entities:  # If no entity is specified, all default PII categories are detected
          - email
          - phone_number
          - ssn
          - first_name
          - last_name
      output:
        entities:
          - email
          - phone_number
          - credit_debit_card
  input:
    flows:
      - gliner detect pii on input
  output:
    flows:
      - gliner detect pii on output
PII masking config
The masking flow replaces detected PII with labels.
For example, "Hi John, my email is john@example.com" becomes "Hi [FIRST_NAME], my email is [EMAIL]".
rails:
  config:
    gliner:
      server_endpoint: http://localhost:1235/v1/extract
      input:
        entities:
          - email
          - first_name
          - last_name
      output:
        entities:
          - email
          - first_name
          - last_name
  input:
    flows:
      - gliner mask pii on input
  output:
    flows:
      - gliner mask pii on output
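To make the difference between the two flow types concrete, here is a minimal Python sketch of what detection and masking amount to once entity spans are available. It is an illustration only, not the library's implementation: `mask_pii`, `contains_pii`, and `span` are hypothetical helper names, and the entity dictionaries simply reuse the field names of the EntitySpan object described in the API specification. The sketch also assumes the spans do not overlap.

```python
from typing import Dict, List


def contains_pii(entities: List[Dict]) -> bool:
    """Detection only needs to know whether any entity was found."""
    return len(entities) > 0


def mask_pii(text: str, entities: List[Dict]) -> str:
    """Replace each detected span with its upper-cased label in brackets."""
    masked = text
    # Replace from the last span to the first so earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start_position"], reverse=True):
        placeholder = "[" + ent["suggested_label"].upper() + "]"
        masked = masked[: ent["start_position"]] + placeholder + masked[ent["end_position"]:]
    return masked


text = "Hi John, my email is john@example.com"


def span(value: str, label: str) -> Dict:
    """Build a sample entity whose offsets are derived from the text itself."""
    start = text.index(value)
    return {
        "value": value,
        "suggested_label": label,
        "start_position": start,             # inclusive
        "end_position": start + len(value),  # exclusive
    }


entities = [span("John", "first_name"), span("john@example.com", "email")]

print(contains_pii(entities))    # True
print(mask_pii(text, entities))  # Hi [FIRST_NAME], my email is [EMAIL]
```

Replacing spans from right to left is a deliberate choice: inserting a placeholder changes the string length, so processing the last span first keeps the remaining offsets correct.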
API Specification#
The GLiNER integration expects a server that implements the following API:
POST /v1/extract#
Extract entities from text.
Request Body:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string | Yes | - | The text to analyze for entities |
| labels | array[string] | No | Server default | List of entity labels to detect |
| threshold | float | No | 0.5 | Confidence threshold (0.0 to 1.0) |
| | int | No | 384 | Length of text chunks for processing |
| | int | No | 128 | Overlap between chunks |
| | bool | No | false | Whether to use flat NER mode |
Example Request:
{
  "text": "Hello, my name is John and my email is john@example.com",
  "labels": ["email", "first_name"],
  "threshold": 0.5
}
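A request like this can be sent from Python with the standard library alone. The sketch below is hedged: it assumes the server accepts a JSON body with a `Content-Type: application/json` header and listens on the endpoint used in the configuration examples. `build_payload` and `extract_entities` are hypothetical helper names, not part of NeMo Guardrails.

```python
import json
import urllib.request
from typing import List, Optional

# Endpoint taken from the configuration examples; adjust for your deployment.
GLINER_ENDPOINT = "http://localhost:1235/v1/extract"


def build_payload(text: str, labels: Optional[List[str]] = None,
                  threshold: float = 0.5) -> dict:
    """Assemble the request body; omit "labels" to use the server default."""
    payload = {"text": text, "threshold": threshold}
    if labels:
        payload["labels"] = labels
    return payload


def extract_entities(text: str, labels: Optional[List[str]] = None,
                     threshold: float = 0.5,
                     endpoint: str = GLINER_ENDPOINT,
                     timeout: float = 10.0) -> dict:
    """POST the payload to the server and return the decoded JSON response."""
    body = json.dumps(build_payload(text, labels, threshold)).encode("utf-8")
    request = urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the payload needs no running server:
print(build_payload("Hello, my name is John", ["first_name"]))
```

With a server running, `extract_entities("Hello, my name is John and my email is john@example.com", ["email", "first_name"])` would issue the request shown above.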
Response Body:
| Field | Type | Description |
|---|---|---|
| entities | array[EntitySpan] | List of detected entities |
| total_entities | int | Total count of entities found |
| tagged_text | string | Text with entities tagged as [value](label) |
EntitySpan Object:
| Field | Type | Description |
|---|---|---|
| value | string | The detected entity text |
| suggested_label | string | The entity label/type |
| start_position | int | Start character index (inclusive) |
| end_position | int | End character index (exclusive) |
| score | float | Confidence score |
Example Response:
{
  "entities": [
    {
      "value": "John",
      "suggested_label": "first_name",
      "start_position": 18,
      "end_position": 22,
      "score": 0.95
    },
    {
      "value": "john@example.com",
      "suggested_label": "email",
      "start_position": 39,
      "end_position": 55,
      "score": 0.98
    }
  ],
  "total_entities": 2,
  "tagged_text": "Hello, my name is [John](first_name) and my email is [john@example.com](email)"
}
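When consuming a response of this shape, it can help to parse it into typed objects and check that each span really points at its value. The following is a minimal sketch under stated assumptions: the `EntitySpan` dataclass mirrors the fields documented above, while `parse_entities`, `spans_are_consistent`, and `filter_by_score` are hypothetical helpers. The sample offsets are computed from the text rather than hard-coded, and the parser assumes the server returns no extra fields per entity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EntitySpan:
    value: str
    suggested_label: str
    start_position: int  # inclusive character index
    end_position: int    # exclusive character index
    score: float


def parse_entities(response: dict) -> List[EntitySpan]:
    """Convert the "entities" array of a response into dataclass instances."""
    return [EntitySpan(**item) for item in response["entities"]]


def spans_are_consistent(text: str, entities: List[EntitySpan]) -> bool:
    """Check that slicing the text with each span yields the reported value."""
    return all(text[e.start_position:e.end_position] == e.value for e in entities)


def filter_by_score(entities: List[EntitySpan], min_score: float) -> List[EntitySpan]:
    """Keep only entities at or above a client-side confidence cut-off."""
    return [e for e in entities if e.score >= min_score]


text = "Hello, my name is John and my email is john@example.com"


def sample(value: str, label: str, score: float) -> dict:
    """Build a sample entity with offsets derived from the text."""
    start = text.index(value)
    return {"value": value, "suggested_label": label,
            "start_position": start, "end_position": start + len(value),
            "score": score}


response = {
    "entities": [sample("John", "first_name", 0.95),
                 sample("john@example.com", "email", 0.98)],
    "total_entities": 2,
}

entities = parse_entities(response)
print(spans_are_consistent(text, entities))
print([e.suggested_label for e in filter_by_score(entities, 0.96)])
```

Because the end index is exclusive, `text[start_position:end_position]` is a plain Python slice, which is what makes the consistency check a one-liner.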
Supported Entity Types#
The example GLiNER server (using the nvidia/gliner-PII model) supports a comprehensive list of PII categories:
| Category | Entity Types |
|---|---|
| Personal Identifiers | |
| Contact Information | |
| Financial | |
| Technical | |
| Identification | |
| Sensitive Attributes | |
Configuration Options#
| Option | Default | Description |
|---|---|---|
| server_endpoint | | GLiNER server endpoint |
| threshold | | Confidence threshold for entity detection (0.0 to 1.0) |
| | | Length of text chunks for processing |
| | | Overlap between chunks |
| | | Whether to use flat NER mode |
Usage#
Once configured, the GLiNER integration can automatically:
Detect or mask PII in user inputs before the LLM processes them.
Detect or mask PII in LLM outputs before sending them back to the user.
Detect or mask PII in retrieved chunks before sending them to the LLM.
Example Deployment#
The examples/deployment/gliner_server/ directory provides an example GLiNER server implementation.
This implementation:
Uses the NVIDIA GLiNER-PII model for comprehensive PII detection.
Supports GPU acceleration (CUDA, MPS on Apple Silicon).
Implements text chunking with overlap for long documents.
Provides entity deduplication.
Structured as a proper Python package with src/ layout.
CLI entry point (gliner-server) for easy startup.
Unit tests for PII utility functions (no server required).
Integration test script for end-to-end validation.
Refer to the deployment README for detailed instructions.
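The chunking and deduplication features listed above can be pictured with a short Python sketch. This is not the example server's code: `chunk_with_overlap` and `deduplicate` are hypothetical functions, the unit of chunking (a list of tokens here) is an assumption, and the defaults of 384 and 128 are borrowed from the API specification. Deduplication is assumed to keep the highest-scoring copy of an entity that appears in two overlapping chunks.

```python
from typing import Dict, Iterator, List, Tuple


def chunk_with_overlap(tokens: List[str], chunk_length: int = 384,
                       overlap: int = 128) -> Iterator[Tuple[int, List[str]]]:
    """Yield (start_offset, chunk) pairs where consecutive chunks share `overlap` tokens."""
    if not 0 <= overlap < chunk_length:
        raise ValueError("overlap must be non-negative and smaller than chunk_length")
    if not tokens:
        return
    step = chunk_length - overlap
    start = 0
    while True:
        yield start, tokens[start:start + chunk_length]
        if start + chunk_length >= len(tokens):
            break  # the current chunk already reaches the end of the input
        start += step


def deduplicate(entities: List[Dict]) -> List[Dict]:
    """Keep one entity per (start, end, label), preferring the highest score."""
    best: Dict[Tuple[int, int, str], Dict] = {}
    for ent in entities:
        key = (ent["start_position"], ent["end_position"], ent["suggested_label"])
        if key not in best or ent["score"] > best[key]["score"]:
            best[key] = ent
    return sorted(best.values(), key=lambda e: e["start_position"])


# Small numbers make the overlap visible: each chunk repeats 2 tokens of the previous one.
for offset, chunk in chunk_with_overlap(list("abcdefghij"), chunk_length=4, overlap=2):
    print(offset, "".join(chunk))

# The same span reported by two overlapping chunks collapses to the better-scored copy.
found = [
    {"value": "John", "suggested_label": "first_name",
     "start_position": 18, "end_position": 22, "score": 0.91},
    {"value": "John", "suggested_label": "first_name",
     "start_position": 18, "end_position": 22, "score": 0.95},
]
print(deduplicate(found))
```

The overlap exists so that an entity straddling a chunk boundary is seen whole in at least one chunk; deduplication then removes the duplicates that the overlap produces.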
Testing#
The GLiNER integration tests in tests/test_gliner.py use mocked API responses, so they don’t require a running server.
To run them:
pytest tests/test_gliner.py -v
The example server package also includes unit tests for the PII utility functions:
cd examples/deployment/gliner_server
uv run pytest tests/ -v
For integration testing with a running server, use the provided script:
cd examples/deployment/gliner_server
./test_integration.sh
Summary#
Ensure a GLiNER-compatible server is running and accessible from your NeMo Guardrails application environment.
You can use the provided example server or implement your own server following the API specification.
For production deployments, consider containerizing the server.
For more information on GLiNER, refer to the GLiNER GitHub repository.