Release Notes

The following sections summarize and highlight the changes for each release. For a complete record of changes in a release, refer to the CHANGELOG.md in the GitHub repository.

0.22.0

Key Features

LangChain is now optional. Running pip install nemoguardrails no longer pulls in LangChain or any provider-specific langchain-* packages. The NVIDIA NeMo Guardrails library ships with a built-in client that talks to OpenAI-compatible endpoints directly over httpx. Engines whose API isn’t OpenAI-compatible (Anthropic, Cohere, Vertex AI, Google Generative AI, in-process Hugging Face, TensorRT-LLM, and others) keep working through LangChain when you opt in with NEMOGUARDRAILS_LLM_FRAMEWORK=langchain and install the matching provider package. Most 0.21 configurations keep working unchanged; some configuration shapes need a YAML rewrite. For recipes, refer to Migrating to v0.22.0, the Supported LLMs matrix, and Model Configuration.
OpenAI-compatible service support is improved in the default framework. The default framework now supports OpenAI-compatible providers directly, includes native Azure OpenAI support through engine: azure and engine: azure_openai, and documents how to migrate provider-specific LangChain parameters to the new base_url-based configuration shape. For more information, refer to Migrating to v0.22.0, Model Configuration, Configuration Reference, and Using Docker.
IORails adds streaming support, reasoning-model support, and speculative generation support. The optimized input and output rails engine now supports streaming output rails, stream_async() integration in chat and server flows, non-streaming and streaming reasoning-model responses, and speculative generation for non-streaming generate_async() calls. For more information, refer to Parallel Rails, Streaming, and Speculative Generation.
IORails adds OpenTelemetry observability with logging, tracing, and metrics support. The documentation covers OTLP setup, Prometheus client installation, request-level and token-level metrics, and the recommended Guardrails entry point for the optimized input and output rails engine. For more information, refer to Observability, OpenTelemetry Logs, OpenTelemetry Tracing, OpenTelemetry Metrics, Enable Metrics, and the Metrics Reference.
Anonymous usage reporting is documented with clear privacy boundaries and opt-out controls. The telemetry reference explains what fields are collected, what data is excluded, how local audit files work, and how to opt out with NEMO_GUARDRAILS_NO_USAGE_STATS=1, DO_NOT_TRACK=1, or the ~/.config/nemoguardrails/do_not_track file. For more information, refer to Telemetry.
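The environment variables and marker file named in the items above can also be set from Python. The following is a minimal sketch, not part of the library: the helper names opt_out_of_usage_stats and use_langchain_framework are hypothetical, and the sketch assumes the variables are read when the library starts, so the helpers should run before nemoguardrails is imported.

```python
import os
from pathlib import Path
from typing import Optional


def opt_out_of_usage_stats(home: Optional[Path] = None) -> Path:
    """Apply all three opt-out controls named in the release notes.

    Hypothetical helper. The notes list the controls as alternatives,
    so setting any one of them should be enough; this sets all three.
    """
    os.environ["NEMO_GUARDRAILS_NO_USAGE_STATS"] = "1"
    os.environ["DO_NOT_TRACK"] = "1"

    # Marker file: ~/.config/nemoguardrails/do_not_track
    base = home if home is not None else Path.home()
    marker = base / ".config" / "nemoguardrails" / "do_not_track"
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch()
    return marker


def use_langchain_framework() -> None:
    """Opt in to the LangChain framework (hypothetical helper).

    The matching langchain-* provider package still has to be
    installed separately.
    """
    os.environ["NEMOGUARDRAILS_LLM_FRAMEWORK"] = "langchain"
```

Call the helpers at the top of the application entry point, before the first import of nemoguardrails. In a container or service deployment, setting the same variables in the process environment is likely the simpler choice.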

Breaking Changes

Moved AsyncWorkQueue from the top-level Guardrails object to IORails. This removes buffering for non-streaming LLMRails requests when you use the top-level Guardrails object. This change affects only existing implementations that set NEMO_GUARDRAILS_IORAILS_ENGINE=1 or instantiate Guardrails directly.

Enhancements

The GLiNER PII connector documentation and notebook are updated for the new GLiNER PII NIM. The examples cover both remote and local deployment modes and API key configuration for the connector. For more information, refer to GLiNER and PII Detection.
Public extension points for LLM integration. Two new protocols, LLMModel and LLMFramework in nemoguardrails.types, let you plug in a custom backend or a whole alternative framework without touching internals. For more information, refer to Custom LLM Models and Custom LLM Frameworks.
Public testing surface. The nemoguardrails.testing module exposes FakeLLMModel, TestChat, and pytest fixtures for writing tests against a guardrails configuration without calling a real model.

Documentation and Behavior Fixes

Fixed the example query and expected output in the Guardrails Agent Middleware integration guide so the example matches the configured blocked response behavior. For more information, refer to Guardrails Agent Middleware.
A warning about a missing main LLM is now emitted only when generation is actually attempted and the generation path needs the main LLM. Check-only configurations no longer emit the warning during initialization. For more information, refer to Check Messages.
Fixed issues in the Colang 1.0 Hello World tutorial and companion notebook.
