# Guardrail Catalog

The NeMo Guardrails library ships with a catalog of pre-built guardrails that you can activate out of the box. These guardrails span the most common safety and security concerns in LLM-powered applications, from blocking harmful content and detecting jailbreak attempts to masking personally identifiable information and grounding responses in evidence.

Each guardrail is implemented as a configurable rail flow that you add to the input, output, or retrieval section of your config.yml. You can use NVIDIA-trained safety models, open-source community models, LLM self-check prompts, or third-party managed APIs, and combine multiple approaches for defense in depth.
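As a minimal sketch, the `rails` section of `config.yml` below activates the library's built-in self-check flows on input and output; the flow names for other catalog entries differ, so substitute the names documented on each guardrail's reference page. Note that the self-check flows also expect matching prompt definitions (for example in a `prompts.yml`) to be present in the configuration.

```yaml
# config.yml -- minimal sketch of activating pre-built rails
rails:
  input:
    flows:
      - self check input     # LLM self-check prompt applied to user input
  output:
    flows:
      - self check output    # LLM self-check prompt applied to bot responses
```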

Browse the catalog below to find the guardrail that fits your use case.

- **Content Safety**: Reference for pre-built content safety guardrails that protect against violence, criminal activity, hate speech, sexually explicit content, and similar harms.
- **Jailbreak Protection**: Reference for jailbreak protection guardrails that help prevent adversarial attempts to bypass safety measures.
- **Topic Control**: Reference for topic control guardrails that keep conversations within predefined subject boundaries.
- **PII Detection**: Reference for PII detection guardrails that protect user privacy by detecting and masking sensitive data.
- **Agentic Security**: Reference for agentic security guardrails that protect LLM-based agents that use tools and interact with external systems.
- **Hallucinations & Fact-Checking**: Reference for fact-checking and hallucination detection guardrails that ensure LLM output is grounded in evidence.
- **LLM Self-Check**: Reference for self-checking guardrails that prompt the LLM itself to perform input checking, output checking, or fact-checking.
- **Third-Party APIs**: Reference for third-party API integrations that connect with managed services for guardrail use cases.