regex_manager
JSON-schema-to-regex compiler for structured generation.
Converts a subset of JSON Schema into a regular expression that can be used by vLLM's structured-output backend to constrain model output to valid JSONL records.
Functions:
| Name | Description |
|---|---|
| `build_json_based_regex` | Build a regex that constrains LLM output to valid JSONL records. |
build_json_based_regex(schema, config, bos_token, eos_token, whitespace_pattern=None)
Build a regex that constrains LLM output to valid JSONL records.
Supports properties, required, enum, primitive type values,
arrays/objects with min/max item or property counts, string length bounds,
pattern, and format values for date-time, date, time, and UUID.
Use vLLM's native JSON schema structured-output path for unsupported schema
features such as additionalProperties, composition keywords, and
$ref.
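To make the idea concrete, here is a minimal sketch of how a flat schema could be turned into a regex. It is not the module's implementation: `primitive_regex`, `sketch_object_regex`, and the `WS` fragment are hypothetical names, and the sketch assumes every property is required, appears in declaration order, and uses only `enum`, `integer`, `boolean`, or simple `string` specs.

```python
import json
import re

WS = r"[ ]?"  # hypothetical default: at most one space between JSON tokens


def primitive_regex(spec):
    """Return a regex fragment for one property spec (tiny subset only)."""
    if "enum" in spec:
        # Serialize each allowed value as JSON, then escape it literally.
        options = "|".join(re.escape(json.dumps(v)) for v in spec["enum"])
        return f"(?:{options})"
    kind = spec.get("type")
    if kind == "integer":
        return r"-?(?:0|[1-9][0-9]*)"  # no leading zeros
    if kind == "boolean":
        return r"(?:true|false)"
    if kind == "string":
        lo = spec.get("minLength", 0)
        hi = spec.get("maxLength")
        bound = f"{{{lo},{hi}}}" if hi is not None else f"{{{lo},}}"
        # Simplification: no quotes, backslashes, or control characters.
        return r'"[^"\\\x00-\x1f]' + bound + '"'
    raise ValueError(f"unsupported spec: {spec!r}")


def sketch_object_regex(schema, whitespace_pattern=None):
    """Build a regex for a flat object; all properties required, fixed order."""
    ws = WS if whitespace_pattern is None else whitespace_pattern
    parts = []
    for name, spec in schema["properties"].items():
        key = re.escape(json.dumps(name))
        parts.append(f"{key}{ws}:{ws}{primitive_regex(spec)}")
    body = f"{ws},{ws}".join(parts)
    return r"\{" + ws + body + ws + r"\}"


schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "color": {"enum": ["red", "blue"]},
        "ok": {"type": "boolean"},
    },
}
pattern = sketch_object_regex(schema)
accepted = re.fullmatch(pattern, '{"id": 42, "color": "red", "ok": true}')
rejected = re.fullmatch(pattern, '{"id": 42, "color": "green", "ok": true}')
```

Here `accepted` is a match object and `rejected` is `None`, because `"green"` is not in the enum. Passing a different `whitespace_pattern` (for example an empty string) tightens or loosens the spacing the regex tolerates between tokens.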
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `schema` | `dict[str, Any]` | JSON schema dictionary describing the record format. | required |
| `config` | `SafeSynthesizerParameters` | Pipeline configuration (used for grouping and structured-generation settings). | required |
| `bos_token` | `str` | Beginning-of-sequence token (used when grouping). | required |
| `eos_token` | `str` | End-of-sequence token (used when grouping). | required |
| `whitespace_pattern` | `str \| None` | Optional regex fragment for matching whitespace between JSON tokens. | `None` |
Returns:
| Type | Description |
|---|---|
| `str` | Compiled regex string suitable for vLLM's structured-output backend. |
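A regex string like the one returned can be sanity-checked offline with Python's `re` module before it is handed to a constrained decoder. The sketch below is only an illustration: `jsonl_regex` is a hypothetical helper, the record regex is hand-written rather than produced by `build_json_based_regex`, and it assumes one newline-terminated record per line. How `bos_token` and `eos_token` are woven into the real output when grouping is enabled is not covered here.

```python
import re


def jsonl_regex(record_regex: str) -> str:
    """Wrap a single-record regex so it matches one or more newline-terminated records."""
    return f"(?:{record_regex}\\n)+"


# Hand-written record regex for objects of the form {"id": <non-negative integer>}.
record = r'\{"id":[ ]?(?:0|[1-9][0-9]*)\}'
pattern = re.compile(jsonl_regex(record))

good = '{"id": 1}\n{"id":22}\n'   # two records, one per line
bad = '{"id": 1} {"id": 22}\n'    # two records on the same line

is_good = pattern.fullmatch(good) is not None
is_bad = pattern.fullmatch(bad) is not None
```

With these inputs `is_good` is `True` and `is_bad` is `False`, since the second string places two records on one line.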