# Release Notes
The following sections summarize and highlight the changes for each release. For a complete record of changes in a release, refer to the CHANGELOG.md in the GitHub repository.
## 0.20.0
### Key Features
Added support for multilingual content safety models such as NVIDIA Nemotron Safety Guard 8B v3. This feature uses the fast-langdetect package to detect the language of the user's input and return refusal messages in the appropriate language. To use this feature, install the NeMo Guardrails library with the `multilingual` extra:

```shell
pip install nemoguardrails[multilingual]
```
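As a rough illustration of the detection step, the sketch below substitutes a toy `detect_language` function for fast-langdetect so that it runs without extra dependencies; the heuristics are purely illustrative and are not how the package works:

```python
# Toy language detector standing in for the fast-langdetect package;
# the real feature uses a trained model, not these heuristics.
def detect_language(text: str) -> str:
    # CJK Unified Ideographs -> treat as Chinese for this sketch.
    if any("\u4e00" <= ch <= "\u9fff" for ch in text):
        return "zh"
    # A few Spanish marker words, purely illustrative.
    if any(word in text.lower() for word in ("hola", "puedo", "solicitud")):
        return "es"
    return "en"

print(detect_language("抱歉,我无法处理该请求。"))  # zh
print(detect_language("Hola, ¿puedo ayudar?"))  # es
print(detect_language("Hello there"))  # en
```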
Added support for configuring custom refusal messages per language to complement multilingual content safety models. You can enable multilingual refusal messages and specify custom refusal messages in the `rails.config.content_safety` section of the `config.yml` file:

```yaml
rails:
  config:
    content_safety:
      multilingual:
        enabled: true
        refusal_messages:
          en: "Sorry, I cannot help with that request."
          es: "Lo siento, no puedo ayudar con esa solicitud."
          zh: "抱歉,我无法处理该请求。"
          # Add other languages as needed
```
For more information, refer to Multilingual Refusal Messages.
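At runtime, resolving a configured refusal message could look like the following sketch; `pick_refusal` and its English fallback are illustrative assumptions, not the library's actual API:

```python
# Mirrors the refusal_messages mapping from the config.yml snippet above.
refusal_messages = {
    "en": "Sorry, I cannot help with that request.",
    "es": "Lo siento, no puedo ayudar con esa solicitud.",
    "zh": "抱歉,我无法处理该请求。",
}

def pick_refusal(detected_lang: str, fallback: str = "en") -> str:
    # Fall back to English when no message is configured for the language.
    return refusal_messages.get(detected_lang, refusal_messages[fallback])

print(pick_refusal("es"))
print(pick_refusal("fr"))  # unconfigured language -> fallback message
```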
Added support for NVIDIA GLiNER-PII for detecting entities such as names, email addresses, phone numbers, social security numbers, and more. For more information, refer to GLiNER Integration.
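To convey conceptually what entity detection returns, here is a minimal regex-based sketch; the actual GLiNER-PII integration uses a trained NER model and covers many more entity types, so treat the patterns below as illustrative only:

```python
import re

# Illustrative patterns only; GLiNER-PII is model-based, not regex-based.
PII_PATTERNS = {
    "email": r"[\w.+-]+@[\w-]+\.[\w.-]+",
    "phone": r"\b\d{3}[-.]\d{3}[-.]\d{4}\b",
    "ssn": r"\b\d{3}-\d{2}-\d{4}\b",
}

def find_pii(text: str) -> list[tuple[str, str]]:
    """Return (entity_type, matched_text) pairs found in the text."""
    found = []
    for label, pattern in PII_PATTERNS.items():
        for match in re.finditer(pattern, text):
            found.append((label, match.group()))
    return found

print(find_pii("Reach jane@example.com or 555-123-4567; SSN 123-45-6789."))
```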
### Breaking Changes
A breaking change removes redundant streaming configuration for output rails. Prior to the change, streaming had to be enabled in two places: `streaming` and `rails.output.streaming.enabled`. This change removes the top-level `streaming` configuration.

Example `config.yml` before the change:

```yaml
models:
  - type: main
    engine: nvidia_ai_endpoints
    model: meta/llama-3.3-70b-instruct
  - type: content_safety
    engine: nvidia_ai_endpoints
    model: nvidia/llama-3.1-nemoguard-8b-content-safety

rails:
  input:
    flows:
      - content safety check input $model=content_safety
  output:
    flows:
      - content safety check output $model=content_safety
    streaming:
      enabled: True
      chunk_size: 200
      context_size: 50

streaming: True  # No longer needed starting from v0.20.0
```
Example `config.yml` after the change:

```yaml
models:
  - type: main
    engine: nvidia_ai_endpoints
    model: meta/llama-3.3-70b-instruct
  - type: content_safety
    engine: nvidia_ai_endpoints
    model: nvidia/llama-3.1-nemoguard-8b-content-safety

rails:
  input:
    flows:
      - content safety check input $model=content_safety
  output:
    flows:
      - content safety check output $model=content_safety
    streaming:
      enabled: True
      chunk_size: 200
      context_size: 50
```
For more information, refer to Streaming Generated Responses in Real-Time.
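If you maintain many configurations, a small migration pass like the hypothetical sketch below (not part of NeMo Guardrails) can strip the now-redundant top-level key; it operates on a config already parsed into a dict:

```python
# Hypothetical v0.20.0 migration helper, not part of NeMo Guardrails:
# drops the redundant top-level `streaming` key when output-rail
# streaming is already enabled under rails.output.streaming.
def migrate_streaming_config(config: dict) -> dict:
    migrated = dict(config)
    output_streaming = (
        migrated.get("rails", {}).get("output", {}).get("streaming", {})
    )
    if output_streaming.get("enabled") and "streaming" in migrated:
        migrated.pop("streaming")
    return migrated

old = {
    "rails": {"output": {"streaming": {"enabled": True, "chunk_size": 200}}},
    "streaming": True,
}
print(migrate_streaming_config(old))
```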
### Other Changes
Restructured the documentation with improved navigation, clearer content organization, and updated configuration reference and user guides.