Output Streaming Configuration#

The NeMo Guardrails library supports streaming out of the box when using the stream_async() method. No configuration is required to enable basic streaming.

When you have output rails configured, you need to explicitly enable streaming for them to process tokens in chunked mode.

Quick Example#

When using streaming with output rails:

rails:
  output:
    flows:
      - self check output
    streaming:
      enabled: True
      chunk_size: 200
      context_size: 50

Streaming Configuration Details#

The following guides provide detailed documentation for streaming configuration.

Streaming LLM Responses

Enable and use streaming mode for LLM responses in real-time in the NeMo Guardrails library.

Streaming LLM Responses in Real-Time
Output Rail Streaming

Configure how output rails process streamed tokens in chunked mode.

Output Rail Streaming