Streaming Configuration#
NeMo Guardrails supports two levels of streaming configuration:
- Global streaming: controls LLM token generation.
- Output rail streaming: controls how output rails process streamed tokens.
Configuration Comparison#
| Aspect | Global | Output Rail |
|---|---|---|
| Scope | LLM token generation | Output rail processing |
| Required for | Any streaming | Streaming with output rails |
| Affects | How LLM produces tokens | How rails process token chunks |
| Default | | |
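The dependency between the two levels can be sketched as a small check over a parsed `config.yml`. This is an illustration only: `streaming_config_problems` is a hypothetical helper, not part of NeMo Guardrails, and it assumes that a missing key means the feature is disabled.

```python
from typing import Any, Dict, List


def streaming_config_problems(config: Dict[str, Any]) -> List[str]:
    """Return human-readable problems with the streaming settings.

    Hypothetical helper for illustration; assumes absent keys mean "disabled".
    """
    problems: List[str] = []
    global_streaming = bool(config.get("streaming", False))
    output = config.get("rails", {}).get("output", {})
    rail_streaming = bool(output.get("streaming", {}).get("enabled", False))
    has_output_rails = bool(output.get("flows"))

    # Output rail streaming builds on global streaming.
    if rail_streaming and not global_streaming:
        problems.append(
            "output rail streaming is enabled but global `streaming` is not"
        )
    # Streaming with output rails needs both levels switched on.
    if global_streaming and has_output_rails and not rail_streaming:
        problems.append(
            "output rails are configured but "
            "`rails.output.streaming.enabled` is not set"
        )
    return problems
```

An empty list means the two levels are consistent under these assumptions.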
Quick Example#
When using streaming with output rails, both configurations are required:
# Global: Enable LLM streaming
streaming: True

rails:
  output:
    flows:
      - self check output
    # Output rail streaming: Enable chunked processing
    streaming:
      enabled: True
      chunk_size: 200
      context_size: 50
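To build intuition for `chunk_size` and `context_size`, here is a minimal sketch of chunked processing with carried-over context. It assumes that `chunk_size` is the number of new tokens per chunk and that `context_size` is the number of preceding tokens kept alongside each chunk. `chunk_with_context` is a hypothetical function written for this illustration; the library's actual buffering may differ.

```python
from typing import Iterable, Iterator, List, Tuple


def chunk_with_context(
    tokens: Iterable[str], chunk_size: int = 200, context_size: int = 50
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (context, chunk) pairs from a token stream.

    `chunk` holds up to `chunk_size` new tokens; `context` holds up to
    `context_size` tokens that immediately precede the chunk.
    Illustrative only; not the NeMo Guardrails implementation.
    """
    if chunk_size <= 0 or context_size < 0:
        raise ValueError("chunk_size must be > 0 and context_size >= 0")

    context: List[str] = []
    chunk: List[str] = []
    for token in tokens:
        chunk.append(token)
        if len(chunk) == chunk_size:
            yield context, chunk
            # Keep the tail of what has been seen as context for the next chunk.
            context = (context + chunk)[-context_size:] if context_size else []
            chunk = []
    if chunk:  # flush the final partial chunk
        yield context, chunk


if __name__ == "__main__":
    stream = [str(i) for i in range(10)]
    for ctx, new in chunk_with_context(stream, chunk_size=4, context_size=2):
        print("context:", ctx, "chunk:", new)
```

In this sketch, a rail would check each chunk together with its context, so content that spans a chunk boundary is still seen in one piece.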
Streaming Configuration Details#
The following guides provide detailed documentation for each streaming configuration area.
- Global streaming: enable streaming mode for LLM token generation in config.yml.
- Output rail streaming: configure how output rails process streamed tokens in chunked mode.