nemoguardrails.rails.llm.llmrails.LLMRails#

class nemoguardrails.rails.llm.llmrails.LLMRails[source]#

Bases: object

Rails based on a given configuration.

__init__(config, llm=None, verbose=False)[source]#

Initializes the LLMRails instance.

Parameters:
  • config (RailsConfig) – A rails configuration.

  • llm (Union[BaseChatModel, BaseLLM, None]) – An optional LLM engine to use. If provided, this will be used as the main LLM and will take precedence over any main LLM specified in the config.

  • verbose (bool) – Whether the logging should be verbose.
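
Example (a minimal construction sketch; the config directory path is hypothetical):

```python
from nemoguardrails import LLMRails, RailsConfig

# Load a rails configuration from a directory (hypothetical path).
config = RailsConfig.from_path("./config")

# Optionally pass `llm=` to override the main LLM from the config.
rails = LLMRails(config, verbose=False)
```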

config: RailsConfig#
llm: Union[BaseChatModel, BaseLLM, None]#
runtime: Runtime#
update_llm(llm)[source]#

Replace the main LLM with the provided one.

Parameters:

llm – The new LLM that should be used.

async generate_async(
prompt=None,
messages=None,
options=None,
state=None,
streaming_handler=None,
)[source]#

Generate a completion or a next message.

The format for messages is the following:

```python
[
    {"role": "context", "content": {"user_name": "John"}},
    {"role": "user", "content": "Hello! How are you?"},
    {"role": "assistant", "content": "I am fine, thank you!"},
    {"role": "event", "event": {"type": "UserSilent"}},
    ...
]
```

Parameters:
  • prompt (Optional[str]) – The prompt to be used for completion.

  • messages (Optional[List[dict]]) – The history of messages to be used to generate the next message.

  • options (Union[dict, GenerationOptions, None]) – Options specific for the generation.

  • state (Union[dict, State, None]) – The state object that should be used as the starting point.

  • streaming_handler (Optional[StreamingHandler]) – If specified, and the config supports streaming, the provided handler will be used for streaming.

Return type:

Union[str, dict, GenerationResponse, Tuple[dict, dict]]

Returns:

The completion (when a prompt is provided) or the next message.

System messages are not yet supported.
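
Example (a short usage sketch, assuming `rails` is an LLMRails instance created as above):

```python
import asyncio

async def main():
    messages = [
        {"role": "context", "content": {"user_name": "John"}},
        {"role": "user", "content": "Hello! How are you?"},
    ]
    # With `messages` and no options, the result is the next message as a dict.
    response = await rails.generate_async(messages=messages)
    print(response["content"])

asyncio.run(main())
```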

stream_async(
prompt=None,
messages=None,
options=None,
state=None,
include_metadata=False,
generator=None,
include_generation_metadata=None,
)[source]#

Simplified interface for getting the streamed tokens directly from the LLM.

Return type:

AsyncIterator[Union[str, dict]]

Parameters:
  • prompt (str | None)

  • messages (List[dict] | None)

  • options (dict | GenerationOptions | None)

  • state (dict | State | None)

  • include_metadata (bool | None)

  • generator (AsyncIterator[str] | None)

  • include_generation_metadata (bool | None)
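
Example (a streaming sketch, assuming `rails` is an LLMRails instance with a streaming-capable configuration):

```python
async def stream_example():
    async for chunk in rails.stream_async(
        messages=[{"role": "user", "content": "Tell me a short story."}]
    ):
        # Chunks are plain strings, or dicts when generation metadata
        # is requested.
        print(chunk, end="", flush=True)
```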

generate(
prompt=None,
messages=None,
options=None,
state=None,
)[source]#

Synchronous version of generate_async.

Parameters:
  • prompt (str | None)

  • messages (List[dict] | None)

  • options (dict | GenerationOptions | None)

  • state (dict | None)
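
Example (assuming `rails` is an LLMRails instance):

```python
response = rails.generate(
    messages=[{"role": "user", "content": "Hello! How are you?"}]
)
print(response["content"])
```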

async generate_events_async(events)[source]#

Generate the next events based on the provided history.

The format for events is the following:

```python
[
    {"type": "...", ...},
    ...
]
```

Parameters:

events (List[dict]) – The history of events to be used to generate the next events.

Return type:

List[dict]

Returns:

The newly generated event(s).
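
Example (a sketch using the standard UtteranceUserActionFinished event; run inside an async function):

```python
events = [
    {"type": "UtteranceUserActionFinished", "final_transcript": "Hello!"},
]
# Returns the newly generated events, e.g. bot utterance events.
new_events = await rails.generate_events_async(events)
```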

generate_events(events)[source]#

Synchronous version of LLMRails.generate_events_async.

Return type:

List[dict]

Parameters:

events (List[dict])

async process_events_async(events, state=None, blocking=False)[source]#

Process a sequence of events in a given state.

The events will be processed one by one, in the input order.

Parameters:
  • events (List[dict]) – A sequence of events that needs to be processed.

  • state (Union[dict, State, None]) – The state that should be used as the starting point. If not provided, a clean state will be used.

  • blocking (bool)

Return type:

Tuple[List[dict], Union[dict, State]]

Returns:

A tuple (output_events, output_state): the sequence of output events and the resulting output state.
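
Example (a sketch of threading the returned state through successive calls; event contents are hypothetical, run inside an async function):

```python
events = [
    {"type": "UtteranceUserActionFinished", "final_transcript": "Hello!"},
]
output_events, state = await rails.process_events_async(events)

# Continue the interaction from the returned state.
next_events = [
    {"type": "UtteranceUserActionFinished", "final_transcript": "Thanks!"},
]
output_events, state = await rails.process_events_async(next_events, state=state)
```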

process_events(events, state=None, blocking=False)[source]#

Synchronous version of LLMRails.process_events_async.

Return type:

Tuple[List[dict], Union[dict, State]]

Parameters:
  • events (List[dict])

  • state (dict | State | None)

  • blocking (bool)

async check_async(messages, rail_types=None)[source]#

Run rails on messages based on their content (asynchronous).

When rail_types is not provided, the rails to run are determined automatically from the message roles:

  • Only user messages: runs input rails

  • Only assistant messages: runs output rails

  • Both user and assistant messages: runs both input and output rails

  • No user/assistant messages: logs a warning and returns a passing result

When rail_types is provided, runs exactly the specified rail types, skipping the auto-detection logic.

Parameters:
  • messages (List[dict]) – List of message dicts with 'role' and 'content' fields. Messages can contain any roles, but only user/assistant roles determine which rails execute when rail_types is not provided.

  • rail_types (List[RailType] | None) – Optional list of rail types to run, e.g. [RailType.INPUT] or [RailType.OUTPUT]. When provided, overrides automatic detection.

Returns:

RailsResult containing:

  • status: PASSED, MODIFIED, or BLOCKED

  • content: The final content after rails processing

  • rail: Name of the rail that blocked (if blocked)

Return type:

RailsResult

Examples

Check user input (auto-detected):

```python
result = await rails.check_async([{"role": "user", "content": "Hello!"}])
if result.status == RailStatus.BLOCKED:
    print(f"Blocked by: {result.rail}")
```

Check bot output with context (auto-detected):

```python
result = await rails.check_async([
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there!"},
])
```

Run only input rails explicitly:

```python
result = await rails.check_async(messages, rail_types=[RailType.INPUT])
```

check(messages, rail_types=None)[source]#

Run rails on messages based on their content (synchronous).

This is a synchronous wrapper around check_async().

Parameters:
  • messages (List[dict]) – List of message dicts with 'role' and 'content' fields.

  • rail_types (List[RailType] | None) – Optional list of rail types to run. See check_async() for details.

Return type:

RailsResult

Returns:

RailsResult containing status, content, and optional blocking rail name.
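
Example (mirroring the asynchronous example above):

```python
result = rails.check([{"role": "user", "content": "Hello!"}])
if result.status == RailStatus.BLOCKED:
    print(f"Blocked by: {result.rail}")
```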

register_action(action, name=None)[source]#

Register a custom action for the rails configuration.

Return type:

Self

Parameters:
  • action (Callable)

  • name (str | None)

register_action_param(name, value)[source]#

Registers a custom action parameter.

Return type:

Self

Parameters:
  • name (str)

  • value (Any)
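
Example (a combined sketch of register_action and register_action_param; the action, its parameters, and the injected `db` object are hypothetical):

```python
async def check_order_status(order_id: str, db=None):
    # `db` is injected by name via register_action_param.
    return await db.get_status(order_id)

rails.register_action(check_order_status, name="check_order_status")
rails.register_action_param("db", my_database)  # hypothetical connection object
```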

register_filter(filter_fn, name=None)[source]#

Register a custom filter for the rails configuration.

Return type:

Self

Parameters:
  • filter_fn (Callable)

  • name (str | None)
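
Example (a minimal sketch; the filter name and template usage are illustrative):

```python
def to_upper(text: str) -> str:
    return text.upper()

rails.register_filter(to_upper, name="to_upper")
# The filter can then be used in prompt templates, e.g. {{ user_input | to_upper }}.
```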

register_output_parser(output_parser, name)[source]#

Register a custom output parser for the rails configuration.

Return type:

Self

Parameters:
  • output_parser (Callable)

  • name (str)
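
Example (a minimal sketch, assuming the parser receives the raw LLM output string):

```python
def strip_quotes(output: str) -> str:
    # Remove surrounding quotation marks from the raw LLM output.
    return output.strip().strip('"')

rails.register_output_parser(strip_quotes, name="strip_quotes")
```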

register_prompt_context(name, value_or_fn)[source]#

Register a value to be included in the prompt context.

Parameters:
  • name (str) – The name of the variable or function that will be used.

  • value_or_fn (Any) – The value or function that will be used to generate the value.

Return type:

Self
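
Example (exposing the current date to prompt templates; the variable name is illustrative):

```python
from datetime import date

# A function is re-evaluated each time the prompt is rendered.
rails.register_prompt_context("current_date", lambda: date.today().isoformat())
```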

register_embedding_search_provider(name, cls)[source]#

Register a new embedding search provider.

Parameters:
  • name (str) – The name of the embedding search provider that will be used.

  • cls (Type[EmbeddingsIndex]) – The class that will be used to generate and search embeddings.

Return type:

Self

register_embedding_provider(cls, name=None)[source]#

Register a custom embedding provider.

Parameters:
  • cls (Type[EmbeddingModel]) – The embedding model class.

  • name (str | None) – The name of the embedding engine. If available in the model, it will be used.

Raises:
  • ValueError – If the engine name is not provided and the model does not have an engine name.

  • ValueError – If the model does not have 'encode' or 'encode_async' methods.

Return type:

Self
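
Example (a duck-typed sketch; the real API expects an EmbeddingModel subclass, so treat this as illustrative; per the docstring, the class needs encode or encode_async, and the engine name can be passed explicitly):

```python
class MyEmbeddingModel:
    # Hypothetical embedding model; only encode/encode_async are required.
    def encode(self, documents):
        # Return one embedding vector per document (dummy 3-dim vectors).
        return [[0.0, 0.0, 0.0] for _ in documents]

    async def encode_async(self, documents):
        return self.encode(documents)

rails.register_embedding_provider(MyEmbeddingModel, name="my-embeddings")
```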

explain()[source]#

Helper function to return the latest ExplainInfo object.

Return type:

ExplainInfo
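
Example (typical usage, assuming ExplainInfo exposes print_llm_calls_summary(), as in the NeMo Guardrails getting-started docs):

```python
rails.generate(messages=[{"role": "user", "content": "Hello!"}])
info = rails.explain()
# Summarize the LLM calls made during the last generation.
info.print_llm_calls_summary()
```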