Guardrail Catalog#
The NeMo Guardrails library ships with a catalog of pre-built guardrails that you can activate out of the box. These guardrails span the most common safety and security concerns in LLM-powered applications from blocking harmful content and detecting jailbreak attempts to masking personally identifiable information and grounding responses in evidence.
Each guardrail is implemented as a configurable rail flow that you add to the input, output, or retrieval section of your config.yml. You can use NVIDIA-trained safety models, open-source community models, LLM self-check prompts, or third-party managed APIs, and combine multiple approaches for defense in depth.
Browse the catalog below to find the guardrail that fits your use case.
Reference for pre-built content safety guardrails for protecting against violence, criminal activity, hate speech, sexually explicit content, and similar areas.
Reference for jailbreak protection guardrails that help prevent adversarial attempts from bypassing safety measures.
Reference for topic control guardrails that ensure conversations stay within predefined subject boundaries.
Reference for PII detection guardrails that protect user privacy by detecting and masking sensitive data.
Reference for agentic security guardrails that protect LLM-based agents using tools and interacting with external systems.
Reference for fact-checking and hallucination detection guardrails that ensure LLM output is grounded in evidence.
Reference for LLM self-checking guardrails that prompt the LLM to perform input checking, output checking, or fact-checking.
Reference for third-party API integrations that connect with managed services for guardrail use cases.