backend
Abstract generator backend.
Classes:
| Name | Description |
|---|---|
| GeneratorBackend | Abstract base class for generation backends. |
GeneratorBackend
Abstract base class for generation backends.
Lifecycle: initialize -> prepare_params -> generate
[-> generate ...] -> teardown.
teardown must be idempotent and safe to call multiple times.
Callers should use try/finally to guarantee teardown runs
even if generate raises. Each cleanup step should be isolated
so one failure doesn't prevent the next from running.
Subclasses must implement initialize, prepare_params,
generate, and teardown. The _torn_down guard flag
pattern is recommended for teardown implementations.
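The lifecycle above can be sketched from the caller's side as follows. This is a minimal illustration, not the library's code: `GeneratorBackendSketch`, `EchoBackend`, and `run_backend` are hypothetical names, and the stand-in base class only mirrors the four documented abstract methods. Placing `initialize` inside the `try` block is a choice made here so that a partially initialized backend is also torn down.

```python
from abc import ABC, abstractmethod


class GeneratorBackendSketch(ABC):
    """Hypothetical stand-in mirroring the four documented abstract methods."""

    @abstractmethod
    def initialize(self): ...

    @abstractmethod
    def prepare_params(self, **kwargs): ...

    @abstractmethod
    def generate(self, data_actions_fn=None): ...

    @abstractmethod
    def teardown(self): ...


class EchoBackend(GeneratorBackendSketch):
    """Toy backend (not part of the library) that records lifecycle calls."""

    def __init__(self):
        self.calls = []
        self.params = {}
        self._torn_down = False

    def initialize(self):
        self.calls.append("initialize")

    def prepare_params(self, **kwargs):
        self.params = dict(kwargs)
        self.calls.append("prepare_params")

    def generate(self, data_actions_fn=None):
        self.calls.append("generate")
        if self.params.get("fail"):  # illustrative switch to simulate an error
            raise RuntimeError("simulated generation failure")
        return ["record"]

    def teardown(self):
        if self._torn_down:  # guard flag makes repeat calls a no-op
            return
        self._torn_down = True
        self.calls.append("teardown")


def run_backend(backend, **params):
    """Drive one lifecycle; teardown runs even if an earlier step raises."""
    try:
        backend.initialize()
        backend.prepare_params(**params)
        return backend.generate()
    finally:
        backend.teardown()
```

Because `teardown` is guarded, a caller that invokes it again after `run_backend` returns does no extra work.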
Methods:
| Name | Description |
|---|---|
| initialize | Load the model and any required resources into memory. |
| prepare_params | Translate caller-supplied sampling parameters into a backend-native form. |
| generate | Run the batch generation loop and return aggregated results. |
| teardown | Release all resources held by this backend. |
Attributes:
| Name | Type | Description |
|---|---|---|
| gen_method | Callable \| None | Callable used internally for LLM generation. |
| gen_results | GenerateJobResults | Results from the most recent generation run. |
| config | SafeSynthesizerParameters | Pipeline configuration. |
| model_metadata | ModelMetadata | Metadata for the fine-tuned model (prompt template, sequence length, adapter path, etc.). |
| remote | bool | Whether the backend calls a remote inference endpoint. |
| elapsed_time | float | Wall-clock duration of the last generation run in seconds. |
| workdir | Workdir | Working directory containing model artifacts. |
gen_method = None
class-attribute
instance-attribute
Callable used internally for LLM generation.
gen_results
instance-attribute
Results from the most recent generation run.
config
instance-attribute
Pipeline configuration.
model_metadata
instance-attribute
Metadata for the fine-tuned model (prompt template, sequence length, adapter path, etc.).
remote
instance-attribute
Whether the backend calls a remote inference endpoint.
elapsed_time
instance-attribute
Wall-clock duration of the last generation run in seconds.
workdir
instance-attribute
Working directory containing model artifacts.
initialize()
abstractmethod
Load the model and any required resources into memory.
Called once before the first generate() invocation.
Implementations should allocate GPU memory, instantiate the
inference engine (e.g. vLLM), load LoRA adapters, and configure
backend-specific settings such as attention backends or
structured-output support.
After this method returns, the backend must be ready to accept
prepare_params() and generate() calls.
Source code in src/nemo_safe_synthesizer/generation/backend.py
prepare_params(**kwargs)
abstractmethod
Translate caller-supplied sampling parameters into a backend-native form.
Resolves, validates, and transforms high-level generation
parameters (temperature, top-p, max tokens, structured-output
constraints, etc.) into the format expected by the underlying
inference engine. The result is stored internally so that
subsequent generate() calls use these settings.
Must be called after initialize() and before generate().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| **kwargs | | Sampling parameters such as | {} |
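As an illustration of the resolve, validate, and transform steps described above, here is a minimal sketch. `NativeSamplingParams` and `resolve_sampling_params` are hypothetical names, and the keyword names (`temperature`, `top_p`, `max_tokens`), their defaults, and the validation ranges are assumptions chosen for the example rather than the library's actual contract.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class NativeSamplingParams:
    """Hypothetical backend-native container; defaults are illustrative."""

    temperature: float = 1.0
    top_p: float = 1.0
    max_tokens: int = 256


def resolve_sampling_params(**kwargs):
    """Validate caller-supplied kwargs and build the backend-native object."""
    allowed = {f.name for f in fields(NativeSamplingParams)}
    unknown = set(kwargs) - allowed
    if unknown:
        # Reject typos early instead of silently ignoring them.
        raise ValueError(f"unknown sampling parameters: {sorted(unknown)}")

    params = NativeSamplingParams(**kwargs)
    if params.temperature < 0:
        raise ValueError("temperature must be non-negative")
    if not 0 < params.top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    if params.max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return params
```

A backend's `prepare_params` could store the returned object on the instance so that later `generate()` calls reuse it.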
generate(data_actions_fn=None)
abstractmethod
Run the batch generation loop and return aggregated results.
Repeatedly prompts the model, processes each batch through the
configured Processor, and accumulates valid records until the
target count is reached or a stopping condition fires (e.g. too
many consecutive invalid batches). Progress and error statistics
are logged after each batch.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_actions_fn | DataActionsFn \| None | Optional post-processing / validation function applied to each batch of generated records. Typically reverses training-time preprocessing and enforces user-specified data constraints. | None |
Returns:
| Type | Description |
|---|---|
| GenerateJobResults | Results containing the generated DataFrame, validity statistics, and timing information. |
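The accumulate-until-target behavior with a consecutive-invalid stopping condition can be sketched as below. This is a simplified illustration under stated assumptions, not the backend's implementation: `generation_loop`, `sample_batch`, `is_valid`, and `max_consecutive_invalid` are hypothetical names, a batch counts as invalid here when it yields no valid records, and a plain dict stands in for `GenerateJobResults`.

```python
def generation_loop(
    sample_batch,
    is_valid,
    target_count,
    max_consecutive_invalid=3,
    data_actions_fn=None,
):
    """Collect valid records until the target is met or a stop condition fires."""
    valid = []
    num_batches = 0
    invalid_streak = 0
    stop_reason = "target_reached"

    while len(valid) < target_count:
        batch = sample_batch()  # stand-in for prompting the model
        num_batches += 1
        if data_actions_fn is not None:
            batch = data_actions_fn(batch)  # optional per-batch post-processing

        kept = [record for record in batch if is_valid(record)]
        valid.extend(kept)

        if kept:
            invalid_streak = 0
        else:
            invalid_streak += 1
            if invalid_streak >= max_consecutive_invalid:
                stop_reason = "too_many_invalid_batches"
                break

    return {
        "records": valid[:target_count],  # trim any overshoot from the last batch
        "num_batches": num_batches,
        "stop_reason": stop_reason,
    }
```

Resetting the streak counter whenever a batch contributes at least one valid record means only an unbroken run of empty batches ends the loop early.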
teardown()
abstractmethod
Release all resources held by this backend.
Frees GPU memory, destroys distributed process groups, and
cleans up any temporary state. Must be idempotent -- safe to
call multiple times. Implementations should use the
_torn_down guard flag and isolate each cleanup step so one
failure doesn't prevent subsequent cleanup.
Callers should wrap generate() in try/finally to
guarantee this runs even when generation raises.
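One way to combine the `_torn_down` guard with isolated cleanup steps is sketched below. `GuardedTeardown`, its `cleanup_steps` argument, and the `failed_steps` list are hypothetical and exist only for this example; a real backend would call its own resource-release routines in place of the callables passed in.

```python
import logging

logger = logging.getLogger(__name__)


class GuardedTeardown:
    """Hypothetical helper showing an idempotent, step-isolated teardown."""

    def __init__(self, cleanup_steps):
        # cleanup_steps: iterable of (name, zero-argument callable) pairs
        self._cleanup_steps = list(cleanup_steps)
        self._torn_down = False
        self.failed_steps = []

    def teardown(self):
        if self._torn_down:  # second and later calls do nothing
            return
        self._torn_down = True

        for name, step in self._cleanup_steps:
            try:
                step()
            except Exception:
                # Log and continue so later steps still get a chance to run.
                logger.exception("cleanup step %r failed", name)
                self.failed_steps.append(name)
```

Setting the flag before running the steps is a deliberate choice in this sketch: a step that fails is recorded rather than retried on a later call.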