Run Config
The run_config module defines runtime settings that control dataset generation behavior,
including early shutdown thresholds, batch sizing, and non-inference worker concurrency.
Usage
import data_designer.config as dd
from data_designer.interface import DataDesigner
data_designer = DataDesigner()
data_designer.set_run_config(dd.RunConfig(
    buffer_size=500,
    max_conversation_restarts=3,
))
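
To make the effect of buffer_size=500 concrete, here is a minimal sketch of how a record count splits into batches. The helper `batch_sizes` is hypothetical and not part of data_designer; it also assumes the final batch simply holds whatever records remain, which this page does not state.

```python
def batch_sizes(num_records: int, buffer_size: int = 1000) -> list[int]:
    """Return the size of each batch when num_records is split by buffer_size.

    Hypothetical illustration only; the library's own batching may differ.
    """
    if buffer_size <= 0:
        raise ValueError("buffer_size must be > 0")
    if num_records < 0:
        raise ValueError("num_records must be >= 0")
    # Assumption: every batch is full except possibly the last one.
    full, remainder = divmod(num_records, buffer_size)
    return [buffer_size] * full + ([remainder] if remainder else [])


sizes = batch_sizes(1200, buffer_size=500)
print(len(sizes), sizes)  # 3 [500, 500, 200]
```

Under this reading, a smaller buffer_size means more batches, each of which goes through column generation, post-batch processors, and the artifact write before the next one starts.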
API Reference
Classes:
| Name | Description |
|---|---|
| RunConfig | Runtime configuration for dataset generation. |
| ThrottleConfig | AIMD throttle tuning parameters for adaptive concurrency control. |
RunConfig
Bases: ConfigBase
Runtime configuration for dataset generation.
Groups configuration options that control generation behavior but aren't part of the dataset configuration itself.
Attributes:
| Name | Type | Description |
|---|---|---|
| disable_early_shutdown | bool | If True, disables the executor's early-shutdown behavior entirely. Generation will continue regardless of error rate, and the early-shutdown exception will never be raised. Error counts and summaries are still collected. Default is False. |
| shutdown_error_rate | float | Error rate threshold (0.0-1.0) that triggers early shutdown when early shutdown is enabled. Default is 0.5. |
| shutdown_error_window | int | Minimum number of completed tasks before error rate monitoring begins. Must be >= 1. Default is 10. |
| buffer_size | int | Number of records to process in each batch during dataset generation. A batch is processed end-to-end (column generation, post-batch processors, and writing the batch to artifact storage) before moving on to the next batch. Must be > 0. Default is 1000. |
| non_inference_max_parallel_workers | int | Maximum number of worker threads used for non-inference cell-by-cell generators. Must be >= 1. Default is 4. |
| max_conversation_restarts | int | Maximum number of full conversation restarts permitted when generation tasks call |
| max_conversation_correction_steps | int | Maximum number of correction rounds permitted within a single conversation when generation tasks call |
| async_trace | bool | If True, collect per-task tracing data when using the async engine (DATA_DESIGNER_ASYNC_ENGINE=1). Has no effect on the sync path. Default is False. |
| throttle | ThrottleConfig | AIMD throttle tuning parameters. See |
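
The three shutdown settings interact, so a small sketch may help. The function `should_shut_down` below is hypothetical and is not the executor's implementation. It assumes the error rate is computed over all completed tasks and that reaching the threshold exactly counts as a trigger; neither detail is stated in the table above.

```python
def should_shut_down(
    errors: int,
    completed: int,
    *,
    shutdown_error_rate: float = 0.5,
    shutdown_error_window: int = 10,
    disable_early_shutdown: bool = False,
) -> bool:
    """Decide whether generation should stop early (illustrative sketch only)."""
    if disable_early_shutdown:
        # Errors are still counted by the caller, but never stop the run.
        return False
    if completed < shutdown_error_window:
        # Too few completed tasks: monitoring has not started yet.
        return False
    # Assumption: ">=" rather than ">" at the threshold.
    return errors / completed >= shutdown_error_rate


print(should_shut_down(4, 8))    # False: fewer than 10 tasks completed
print(should_shut_down(6, 10))   # True: 60% errors against a 50% threshold
print(should_shut_down(10, 10, disable_early_shutdown=True))  # False
```

The window guards against stopping a run because the first few tasks happened to fail; a larger shutdown_error_window trades slower detection for fewer false alarms.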
Methods:
| Name | Description |
|---|---|
| normalize_shutdown_settings | Normalize shutdown settings for compatibility. |
normalize_shutdown_settings()
Normalize shutdown settings for compatibility.
Source code in packages/data-designer-config/src/data_designer/config/run_config.py
ThrottleConfig
Bases: ConfigBase
AIMD throttle tuning parameters for adaptive concurrency control.
These knobs configure the ThrottleManager that wraps every outbound
model HTTP request. The defaults are conservative and suitable for most
workloads; override only when you understand the trade-offs.
Attributes:
| Name | Type | Description |
|---|---|---|
| reduce_factor | float | Multiplicative decrease factor applied to the per-domain concurrency limit on a 429 / rate-limit signal. Must be in (0, 1). Default is 0.75 (reduce by 25% on rate-limit). |
| additive_increase | int | Additive increase step applied after every |
| success_window | int | Number of consecutive successful releases before the additive increase is applied. Default is 25. |
| cooldown_seconds | float | Default cooldown duration (seconds) applied after a rate-limit when the provider does not include a |
| ceiling_overshoot | float | Fraction above the observed rate-limit ceiling that additive increase is allowed to probe before capping. Default is 0.10 (10% overshoot). |
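
The way these parameters combine can be sketched with a toy AIMD (additive increase, multiplicative decrease) limit. This is not the ThrottleManager: the class `AimdLimit`, its `max_limit` and `ceiling` fields, and both method names are hypothetical. The sketch also assumes an additive_increase of 1, that the success counter resets after each increase, and that the "observed ceiling" is the limit in force when the rate-limit arrived. Cooldown handling is left out.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class AimdLimit:
    """Toy AIMD concurrency limit for one domain (illustrative only)."""

    limit: int
    max_limit: int
    reduce_factor: float = 0.75
    additive_increase: int = 1  # assumed value, not taken from the table
    success_window: int = 25
    ceiling_overshoot: float = 0.10
    ceiling: Optional[int] = None  # limit in force at the last rate-limit
    _successes: int = 0

    def on_rate_limit(self) -> None:
        # Multiplicative decrease: remember the ceiling, then shrink.
        self.ceiling = self.limit
        self.limit = max(1, math.floor(self.limit * self.reduce_factor))
        self._successes = 0

    def on_success(self) -> None:
        # Additive increase once success_window consecutive successes occur.
        self._successes += 1
        if self._successes < self.success_window:
            return
        self._successes = 0
        cap = self.max_limit
        if self.ceiling is not None:
            # Probe at most ceiling_overshoot above the observed ceiling.
            probe = math.floor(self.ceiling * (1 + self.ceiling_overshoot))
            cap = min(cap, probe)
        self.limit = min(cap, self.limit + self.additive_increase)


throttle = AimdLimit(limit=20, max_limit=32)
throttle.on_rate_limit()
print(throttle.limit)  # 15: reduced by 25% from 20
for _ in range(25 * 10):
    throttle.on_success()
print(throttle.limit)  # 22: capped 10% above the observed ceiling of 20
```

A lower reduce_factor backs off harder after a 429, while a larger success_window makes recovery slower but less likely to trigger the next rate-limit.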