batch
Single-batch container for generated records and error statistics.
Classes:
| Name | Description |
|---|---|
| `Batch` | Container for the results of a single generation batch. |
Batch(processor)
Container for the results of a single generation batch.
Collects ParsedResponse objects produced by the processor and exposes aggregate counts and error statistics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `processor` | `Processor` | The processor used to parse LLM outputs into records. | required |
Methods:
| Name | Description |
|---|---|
| `error_statistics` | Return count statistics on errors encountered during generation. |
| `to_dataframe` | Return the valid records as a normalized DataFrame. |
| `log_summary` | Log a summary of the batch generation results. |
| `process` | Process text response from a single prompt in the current batch. |
Attributes:
| Name | Type | Description |
|---|---|---|
| `num_prompts` | `int` | Total number of prompts submitted in this batch. |
| `num_invalid_records` | `int` | Number of invalid records generated in this batch. |
| `num_valid_records` | `int` | Number of valid records generated in this batch. |
| `data_config_rejected_records` | `list[tuple[str, str]]` | Error tuples for records rejected by data_config validation. |
| `num_data_config_rejected_records` | `int` | Count of records rejected by data_config validation. |
| `valid_record_fraction` | `float` | Fraction of generated records that passed validation. |
| `stopping_metric` | `float` | Invalid record fraction, used by GenerationStopCondition. |
Source code in src/nemo_safe_synthesizer/generation/batch.py
num_prompts
property
Total number of prompts submitted in this batch.
num_invalid_records
property
Number of invalid records generated in this batch.
num_valid_records
property
Number of valid records generated in this batch.
data_config_rejected_records
property
Error tuples for records rejected by data_config validation.
num_data_config_rejected_records
property
Count of records rejected by data_config validation.
valid_record_fraction
property
Fraction of generated records that passed validation.
stopping_metric
property
Invalid record fraction, used by GenerationStopCondition.
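The two fraction properties can be read as complements of each other. The sketch below illustrates that relationship with plain functions. It is not the library's implementation: the function names are hypothetical, and both the denominator (valid plus invalid records) and the 0.0 result for an empty batch are assumptions.

```python
def valid_record_fraction(num_valid: int, num_invalid: int) -> float:
    """Share of generated records that passed validation.

    Assumption: the denominator is valid + invalid records.
    """
    total = num_valid + num_invalid
    # Assumption: an empty batch reports 0.0 instead of raising ZeroDivisionError.
    return num_valid / total if total else 0.0


def invalid_record_fraction(num_valid: int, num_invalid: int) -> float:
    """Share of generated records that failed validation (same assumed denominator)."""
    total = num_valid + num_invalid
    return num_invalid / total if total else 0.0


print(valid_record_fraction(3, 1))
print(invalid_record_fraction(3, 1))
```

Under these assumptions the two values sum to 1 for any non-empty batch, so a stop condition that watches the invalid fraction is equivalent to one that watches the valid fraction with the threshold flipped.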
error_statistics(detailed_errors)
Return count statistics on errors encountered during generation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `detailed_errors` | `bool` | If | required |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | DataFrame indexed by error message with a column, sorted by frequency descending. |
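The shape of that result, one row per distinct error message ordered from most to least frequent, can be illustrated with the standard library alone. This is a minimal stand-in, not the method itself: `error_counts` is a hypothetical helper, the sample messages are made up, and it returns a list of pairs rather than a DataFrame.

```python
from collections import Counter


def error_counts(messages: list[str]) -> list[tuple[str, int]]:
    """Count identical error messages, most frequent first."""
    # Counter.most_common() orders by count descending; equal counts keep
    # the order in which the messages were first seen.
    return Counter(messages).most_common()


# Made-up messages, for illustration only.
sample = ["invalid JSON", "missing field", "invalid JSON"]
for message, count in error_counts(sample):
    print(f"{count:>3}  {message}")
```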
to_dataframe()
Return the valid records as a normalized DataFrame.
Returns:
| Type | Description |
|---|---|
| `DataFrame \| None` | DataFrame of valid records, or `None` if no valid records were generated. |
log_summary(detailed_errors=False)
Log a summary of the batch generation results.
Emits structured data via logger.user.info that is rendered
as Rich ASCII tables on the console and as key/value pairs in
JSON logs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `detailed_errors` | `bool` | If | `False` |
process(prompt_number, text)
Process text response from a single prompt in the current batch.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `prompt_number` | `int` | The prompt number in the current batch. | required |
| `text` | `str` | Text generated by the fine-tuned model. | required |
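To make the per-prompt flow concrete, here is a toy stand-in that mimics the documented interface: `process(prompt_number, text)` is called once per prompt and the counters are read afterwards. `ToyBatch` is hypothetical and does not reproduce the library's `Batch` or `Processor`. The parsing rule (one JSON object per line) and the error messages are assumptions made purely for illustration.

```python
import json


class ToyBatch:
    """Hypothetical stand-in for illustration; not the library's Batch."""

    def __init__(self) -> None:
        self.valid: list[dict] = []
        self.errors: list[tuple[str, str]] = []  # (error message, offending line)
        self.prompts: set[int] = set()

    def process(self, prompt_number: int, text: str) -> None:
        """Parse one prompt's response; assumes one JSON object per line."""
        self.prompts.add(prompt_number)
        for line in text.splitlines():
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                self.errors.append(("invalid JSON", line))
                continue
            if isinstance(record, dict):
                self.valid.append(record)
            else:
                self.errors.append(("not a JSON object", line))

    @property
    def num_prompts(self) -> int:
        return len(self.prompts)

    @property
    def num_valid_records(self) -> int:
        return len(self.valid)

    @property
    def num_invalid_records(self) -> int:
        return len(self.errors)


batch = ToyBatch()
batch.process(0, '{"a": 1}\n{"a": 2}')
batch.process(1, '{"a": 3}\nnot json')
print(batch.num_prompts, batch.num_valid_records, batch.num_invalid_records)
```

The real class delegates parsing to the `processor` passed to the constructor, so the validity rules in practice depend on that processor rather than on the JSON check used here.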