Run Config

The run_config module defines runtime settings that control dataset generation behavior, including early shutdown thresholds, batch sizing, and non-inference worker concurrency.

Usage

import data_designer.config as dd
from data_designer.interface import DataDesigner

data_designer = DataDesigner()
# Override two defaults: batch size (default 1000) and conversation restarts (default 5).
data_designer.set_run_config(dd.RunConfig(
    buffer_size=500,
    max_conversation_restarts=3,
))

API Reference

Classes:

Name	Description
`RunConfig`	Runtime configuration for dataset generation.
`ThrottleConfig`	AIMD throttle tuning parameters for adaptive concurrency control.

`RunConfig`

Bases: ConfigBase

Runtime configuration for dataset generation.

Groups configuration options that control generation behavior but aren't part of the dataset configuration itself.

Attributes:

Name	Type	Description
`disable_early_shutdown`	`bool`	If True, disables the executor's early-shutdown behavior entirely. Generation will continue regardless of error rate, and the early-shutdown exception will never be raised. Error counts and summaries are still collected. Default is False.
`shutdown_error_rate`	`float`	Error rate threshold (0.0-1.0) that triggers early shutdown when early shutdown is enabled. Default is 0.5.
`shutdown_error_window`	`int`	Minimum number of completed tasks before error rate monitoring begins. Must be >= 1. Default is 10.
`buffer_size`	`int`	Number of records to process in each batch during dataset generation. A batch is processed end-to-end (column generation, post-batch processors, and writing the batch to artifact storage) before moving on to the next batch. Must be > 0. Default is 1000.
`non_inference_max_parallel_workers`	`int`	Maximum number of worker threads used for non-inference cell-by-cell generators. Must be >= 1. Default is 4.
`max_conversation_restarts`	`int`	Maximum number of full conversation restarts permitted when generation tasks call `ModelFacade.generate(...)`. Must be >= 0. Default is 5.
`max_conversation_correction_steps`	`int`	Maximum number of correction rounds permitted within a single conversation when generation tasks call `ModelFacade.generate(...)`. Must be >= 0. Default is 0.
`async_trace`	`bool`	If True, collect per-task tracing data when using the async engine (DATA_DESIGNER_ASYNC_ENGINE=1). Has no effect on the sync path. Default is False.
`throttle`	`ThrottleConfig`	AIMD throttle tuning parameters. See `ThrottleConfig` for details.
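One way to read how the three early-shutdown attributes interact is sketched below. This is not the executor's code: the function name `should_shut_down`, the `>=` comparison, and the choice to count failed tasks among completed tasks are all assumptions made for illustration. Only the attribute names and defaults come from the table above.

```python
def should_shut_down(
    completed: int,
    errors: int,
    shutdown_error_rate: float = 0.5,
    shutdown_error_window: int = 10,
    disable_early_shutdown: bool = False,
) -> bool:
    """Hypothetical early-shutdown check (not the library's implementation)."""
    # With early shutdown disabled, generation continues regardless of error rate.
    if disable_early_shutdown:
        return False
    # Monitoring only begins once enough tasks have completed.
    if completed < shutdown_error_window:
        return False
    # Assumption: shutdown triggers when the rate reaches the threshold.
    return errors / completed >= shutdown_error_rate


print(should_shut_down(completed=5, errors=5))    # below the window
print(should_shut_down(completed=10, errors=5))   # window reached, rate 0.5
```

Under these assumptions, a run that fails its first five tasks is not stopped, because the default window of 10 completed tasks has not yet been reached.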

Methods:

Name	Description
`normalize_shutdown_settings`	Normalize shutdown settings for compatibility.

`normalize_shutdown_settings()`

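Normalize shutdown settings for compatibility. When `disable_early_shutdown` is True, `shutdown_error_rate` is set to 1.0; otherwise the settings are left unchanged.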

Source code in packages/data-designer-config/src/data_designer/config/run_config.py

@model_validator(mode="after")
def normalize_shutdown_settings(self) -> Self:
    """Normalize shutdown settings for compatibility."""
    if self.disable_early_shutdown:
        self.shutdown_error_rate = 1.0
    return self
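The effect of the validator can be reproduced without the library. The sketch below uses a standard-library dataclass as a stand-in for the real model; `RunConfigSketch` is a hypothetical name, and `__post_init__` only approximates what an after-validator does at construction time.

```python
from dataclasses import dataclass


@dataclass
class RunConfigSketch:
    """Hypothetical stand-in for RunConfig with only the two relevant fields."""

    disable_early_shutdown: bool = False
    shutdown_error_rate: float = 0.5

    def __post_init__(self) -> None:
        # Mirrors normalize_shutdown_settings: disabling early shutdown
        # overrides whatever error-rate threshold was supplied.
        if self.disable_early_shutdown:
            self.shutdown_error_rate = 1.0


overridden = RunConfigSketch(disable_early_shutdown=True, shutdown_error_rate=0.2)
untouched = RunConfigSketch(shutdown_error_rate=0.2)
print(overridden.shutdown_error_rate, untouched.shutdown_error_rate)
```

In this sketch, a threshold passed together with `disable_early_shutdown=True` is overwritten rather than rejected.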

`ThrottleConfig`

Bases: ConfigBase

AIMD throttle tuning parameters for adaptive concurrency control.

These knobs configure the ThrottleManager that wraps every outbound model HTTP request. The defaults are conservative and suitable for most workloads; override only when you understand the trade-offs.

Attributes:

Name	Type	Description
`reduce_factor`	`float`	Multiplicative decrease factor applied to the per-domain concurrency limit on a 429 / rate-limit signal. Must be in (0, 1). Default is 0.75 (reduce by 25% on rate-limit).
`additive_increase`	`int`	Additive increase step applied after every `success_window` consecutive successes. Default is 1.
`success_window`	`int`	Number of consecutive successful releases before the additive increase is applied. Default is 25.
`cooldown_seconds`	`float`	Default cooldown duration (seconds) applied after a rate-limit when the provider does not include a `Retry-After` header. Default is 2.0.
`ceiling_overshoot`	`float`	Fraction above the observed rate-limit ceiling that additive increase is allowed to probe before capping. Default is 0.10 (10% overshoot).
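To make the AIMD parameters concrete, here is a minimal sketch of an additive-increase, multiplicative-decrease limiter for a single domain. It is not the ThrottleManager implementation: the class `AimdLimiterSketch`, its `max_limit` field, the rounding choices, and the way the overshoot cap is computed are assumptions, and `cooldown_seconds` is not modeled at all.

```python
import math
from dataclasses import dataclass


@dataclass
class AimdLimiterSketch:
    """Hypothetical single-domain AIMD limiter (illustration only)."""

    max_limit: int                   # assumed hard cap on concurrency
    reduce_factor: float = 0.75
    additive_increase: int = 1
    success_window: int = 25
    ceiling_overshoot: float = 0.10

    def __post_init__(self) -> None:
        self.limit = self.max_limit  # current concurrency limit
        self.successes = 0           # consecutive successes so far
        self.ceiling = None          # limit in effect at the last rate-limit

    def on_rate_limit(self) -> None:
        # Multiplicative decrease: remember the ceiling, then shrink the limit.
        self.ceiling = self.limit
        self.limit = max(1, math.floor(self.limit * self.reduce_factor))
        self.successes = 0

    def on_success(self) -> None:
        # Additive increase only after a full window of consecutive successes.
        self.successes += 1
        if self.successes < self.success_window:
            return
        self.successes = 0
        cap = self.max_limit
        if self.ceiling is not None:
            # Assumption: probing may exceed the observed ceiling by a fraction.
            overshoot = int(self.ceiling * self.ceiling_overshoot)
            cap = min(cap, self.ceiling + overshoot)
        self.limit = min(cap, self.limit + self.additive_increase)


limiter = AimdLimiterSketch(max_limit=20)
limiter.on_rate_limit()
print(limiter.limit)  # limit after one multiplicative decrease
```

With the defaults, recovery is deliberately slow: each rate-limit signal removes a quarter of the limit at once, while regaining a single slot takes 25 consecutive successes.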