Routing Targets Through the Inference Gateway

AuditTarget.options is passed through to garak unchanged, with one exception: the plugin recognizes an nmp_uri_spec sentinel and resolves it to a concrete URI at run time using the NeMo Platform Inference Gateway. This keeps target definitions portable, because you store provider references instead of host-specific URLs.

What nmp_uri_spec Is

nmp_uri_spec is a nested dict placed inside options.<generator> that names an Inference Gateway provider. When client.auditor.run(...) starts, the plugin walks the options tree, looks each nmp_uri_spec block up via sdk.inference.providers.retrieve(...), replaces it with a uri key whose value is the provider's resolved OpenAI-compatible URL, and hands the rewritten options dict to garak.

The original AuditTarget entity stays untouched — only the in-memory copy garak receives is rewritten.
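The walk-and-replace step described above can be sketched as follows. This is an illustrative sketch, not the plugin's actual code: resolve_uri_specs and the lookup callable are hypothetical names, and lookup stands in for whatever the plugin does with sdk.inference.providers.retrieve(...) to obtain the provider's URL.

```python
import copy
from typing import Any, Callable

SENTINEL = "nmp_uri_spec"


def resolve_uri_specs(
    options: dict[str, Any],
    lookup: Callable[[str, str], str],
) -> dict[str, Any]:
    """Return a deep copy of options with every sentinel replaced by a uri key.

    lookup(workspace, provider) is a hypothetical stand-in for the provider
    lookup; it must return the resolved URL as a string.
    """

    def walk(node: Any) -> None:
        if isinstance(node, dict):
            if SENTINEL in node:
                gateway = node.pop(SENTINEL)["inference_gateway"]
                node["uri"] = lookup(gateway["workspace"], gateway["provider"])
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    resolved = copy.deepcopy(options)  # the stored options are never mutated
    walk(resolved)
    return resolved


stored = {
    "nim": {
        "max_tokens": 1024,
        "nmp_uri_spec": {
            "inference_gateway": {"workspace": "default", "provider": "build"},
        },
    },
}
# Placeholder URL scheme, assumed purely for illustration.
resolved = resolve_uri_specs(
    stored, lambda ws, name: f"https://gateway.example/{ws}/{name}/v1"
)
```

Working on a deep copy is what keeps the stored options untouched: after the call, stored still carries the sentinel, while resolved carries only uri.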

Required Shape

{
    "nmp_uri_spec": {
        "inference_gateway": {
            "workspace": "<provider workspace>",
            "provider": "<provider name>",
        },
    },
}

Both workspace and provider are required. Other keys inside nmp_uri_spec are ignored — only inference_gateway is currently honored.
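If you want to catch a malformed block before creating a target, a small client-side check along these lines may help. validate_uri_spec is a hypothetical helper, and raising ValueError for a bad shape is an assumption; the plugin's own behavior for malformed blocks is not specified here.

```python
from typing import Any


def validate_uri_spec(spec: Any) -> tuple[str, str]:
    """Return (workspace, provider) from an nmp_uri_spec block.

    Raises ValueError if the shape is wrong (an assumed error type, chosen
    for this sketch only).
    """
    if not isinstance(spec, dict):
        raise ValueError("nmp_uri_spec must be a dict")
    gateway = spec.get("inference_gateway")
    if not isinstance(gateway, dict):
        raise ValueError("nmp_uri_spec must contain an inference_gateway dict")
    missing = [key for key in ("workspace", "provider") if not gateway.get(key)]
    if missing:
        raise ValueError(f"inference_gateway is missing: {', '.join(missing)}")
    # Keys other than inference_gateway are deliberately not inspected,
    # mirroring the rule that they are ignored.
    return gateway["workspace"], gateway["provider"]


pair = validate_uri_spec(
    {"inference_gateway": {"workspace": "default", "provider": "build"}}
)
```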

Conflict Rules

  • A dict cannot contain both uri and nmp_uri_spec — the plugin raises ValueError if it finds both. Pick one.
  • The plugin requires a connected SDK to resolve an nmp_uri_spec. If client.auditor.run(...) is invoked without an SDK handle (a code path reserved for tests) and the target's options contain the sentinel, the run raises RuntimeError.
  • If the named provider does not exist, or the provider lookup fails for any other reason, the run raises RuntimeError with the provider workspace and name embedded in the message.
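The three rules above can be pictured as checks applied to a single options block, in order. The sketch below is an assumption about structure, not the plugin's implementation: resolve_block, sdk, and lookup are hypothetical names, and the error messages are illustrative.

```python
from typing import Any, Callable, Optional


def resolve_block(
    block: dict[str, Any],
    sdk: Optional[object],
    lookup: Callable[[object, str, str], str],
) -> dict[str, Any]:
    """Resolve one options block, applying the conflict rules in order."""
    if "uri" in block and "nmp_uri_spec" in block:
        raise ValueError("block sets both 'uri' and 'nmp_uri_spec'; pick one")
    if "nmp_uri_spec" not in block:
        return dict(block)  # nothing to resolve
    if sdk is None:
        raise RuntimeError("nmp_uri_spec requires a connected SDK")

    gateway = block["nmp_uri_spec"]["inference_gateway"]
    workspace, provider = gateway["workspace"], gateway["provider"]
    try:
        # lookup is a hypothetical stand-in for the provider retrieval call.
        uri = lookup(sdk, workspace, provider)
    except Exception as exc:
        # Any lookup failure surfaces as RuntimeError naming the provider.
        raise RuntimeError(
            f"could not resolve provider {workspace}/{provider}"
        ) from exc

    resolved = {k: v for k, v in block.items() if k != "nmp_uri_spec"}
    resolved["uri"] = uri
    return resolved
```

Chaining the original exception with `from exc` keeps the underlying cause visible in the traceback while the message still names the provider.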

Worked Example

Audit a NIM endpoint registered as the build provider in the default workspace:

target = client.auditor.targets.create(
    workspace="default",
    name="llama-31-8b",
    type="nim.NVOpenAIChat",
    model="meta/llama-3.1-8b-instruct",
    options={
        "nim": {
            "max_tokens": 1024,
            "nmp_uri_spec": {
                "inference_gateway": {
                    "workspace": "default",
                    "provider": "build",
                },
            },
        },
    },
)

When you run an audit against this target, the plugin internally rewrites its in-memory copy of options to look like:

{
    "nim": {
        "max_tokens": 1024,
        "uri": "https://<your-platform>/v1/inference/.../openai/v1",
    },
}

The resolved uri reflects whatever the provider is configured to point at — change the provider once and every target that references it picks up the new endpoint on the next run.
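When several targets share one provider, building the reference in one place helps keep them consistent. A minimal sketch, assuming a hypothetical helper named gateway_options that is not part of the SDK:

```python
from typing import Any


def gateway_options(
    generator: str, workspace: str, provider: str, **extra: Any
) -> dict[str, Any]:
    """Build an options dict that references an Inference Gateway provider.

    Hypothetical convenience helper; extra carries other generator options
    such as max_tokens.
    """
    if "uri" in extra:
        # Mirrors the conflict rule: uri and nmp_uri_spec cannot coexist.
        raise ValueError("pass either uri or a provider reference, not both")
    return {
        generator: {
            **extra,
            "nmp_uri_spec": {
                "inference_gateway": {
                    "workspace": workspace,
                    "provider": provider,
                },
            },
        },
    }


shared = gateway_options("nim", "default", "build", max_tokens=1024)
```

The resulting dict can be passed as options= to client.auditor.targets.create(...) for each target that should follow the same provider.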

When Not to Use nmp_uri_spec

For ad-hoc audits against a local endpoint you don't intend to keep around, you can specify uri directly and skip nmp_uri_spec:

target = client.auditor.targets.create(
    workspace="default",
    name="local-llm",
    type="openai.OpenAIGenerator",
    model="local-model",
    options={
        "openai": {
            "uri": "http://localhost:9000/v1",
        },
    },
)

This is fine for one-off testing. For anything you expect to re-run as the platform's providers evolve, prefer nmp_uri_spec so the indirection is preserved.