Run Config

The run_config module defines runtime settings that control dataset generation behavior, including early shutdown thresholds, batch sizing, and non-inference worker concurrency.

Usage

import data_designer.config as dd
from data_designer.interface import DataDesigner

data_designer = DataDesigner()
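# Override two runtime settings; all other RunConfig fields keep their defaults.
data_designer.set_run_config(dd.RunConfig(
    buffer_size=500,  # records per batch (default is 1000)
    max_conversation_restarts=3,  # full conversation restarts allowed (default is 5)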
))
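
To make the effect of buffer_size concrete, the sketch below splits a requested record count into batch sizes. The helper name plan_batches is hypothetical and not part of the library, and the idea that a final partial batch carries the remainder is an assumption this page does not confirm.

```python
def plan_batches(num_records: int, buffer_size: int = 1000) -> list[int]:
    """Hypothetical helper: sizes of the batches needed for num_records."""
    if buffer_size <= 0:
        # Mirrors the documented constraint that buffer_size must be > 0.
        raise ValueError("buffer_size must be > 0")
    if num_records < 0:
        raise ValueError("num_records must be >= 0")
    full_batches, remainder = divmod(num_records, buffer_size)
    sizes = [buffer_size] * full_batches
    if remainder:
        # Assumption: leftover records form one smaller final batch.
        sizes.append(remainder)
    return sizes


# With buffer_size=500, 1200 records would need three batches.
batch_sizes = plan_batches(1200, buffer_size=500)
```

Under these assumptions, a smaller buffer_size means more frequent writes to artifact storage and less work held in memory per batch.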

API Reference

Classes:

Name Description
RunConfig

Runtime configuration for dataset generation.

ThrottleConfig

AIMD throttle tuning parameters for adaptive concurrency control.

RunConfig

Bases: ConfigBase

Runtime configuration for dataset generation.

Groups configuration options that control generation behavior but aren't part of the dataset configuration itself.

Attributes:

Name Type Description
disable_early_shutdown bool

If True, disables the executor's early-shutdown behavior entirely. Generation will continue regardless of error rate, and the early-shutdown exception will never be raised. Error counts and summaries are still collected. Default is False.

shutdown_error_rate float

Error rate threshold (0.0-1.0) that triggers early shutdown. Ignored when disable_early_shutdown is True, in which case normalize_shutdown_settings sets it to 1.0. Default is 0.5.

shutdown_error_window int

Minimum number of completed tasks before error rate monitoring begins. Must be >= 1. Default is 10.

buffer_size int

Number of records to process in each batch during dataset generation. A batch is processed end-to-end (column generation, post-batch processors, and writing the batch to artifact storage) before moving on to the next batch. Must be > 0. Default is 1000.

non_inference_max_parallel_workers int

Maximum number of worker threads used for non-inference cell-by-cell generators. Must be >= 1. Default is 4.

max_conversation_restarts int

Maximum number of full conversation restarts permitted when generation tasks call ModelFacade.generate(...). Must be >= 0. Default is 5.

max_conversation_correction_steps int

Maximum number of correction rounds permitted within a single conversation when generation tasks call ModelFacade.generate(...). Must be >= 0. Default is 0.

async_trace bool

If True, collect per-task tracing data when using the async engine (DATA_DESIGNER_ASYNC_ENGINE=1). Has no effect on the sync path. Default is False.

throttle ThrottleConfig

AIMD throttle tuning parameters. See ThrottleConfig for details.
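
How the three shutdown settings interact can be sketched as below. This is an illustrative stand-in rather than the executor's code: the function name should_shut_down, the use of >= for the threshold comparison, and the definition of the error rate as failed tasks divided by completed tasks are all assumptions.

```python
def should_shut_down(
    completed: int,
    failed: int,
    *,
    shutdown_error_rate: float = 0.5,
    shutdown_error_window: int = 10,
    disable_early_shutdown: bool = False,
) -> bool:
    """Hypothetical helper: decide whether generation should stop early."""
    if shutdown_error_window < 1:
        # Mirrors the documented constraint that the window must be >= 1.
        raise ValueError("shutdown_error_window must be >= 1")
    if disable_early_shutdown:
        # Early shutdown is switched off entirely; errors are only counted.
        return False
    if completed < shutdown_error_window:
        # Too few completed tasks for the error rate to be monitored yet.
        return False
    # Assumption: the threshold is inclusive and measured over completed tasks.
    return failed / completed >= shutdown_error_rate
```

For example, with the defaults, 12 failures out of 20 completed tasks (60%) would trigger shutdown in this sketch, while 5 failures out of 5 would not, because the window of 10 has not been reached.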

Methods:

Name Description
normalize_shutdown_settings

Normalize shutdown settings for compatibility.

normalize_shutdown_settings()

Normalize shutdown settings for compatibility: when disable_early_shutdown is True, shutdown_error_rate is set to 1.0.

Source code in packages/data-designer-config/src/data_designer/config/run_config.py
@model_validator(mode="after")
def normalize_shutdown_settings(self) -> Self:
    """Normalize shutdown settings for compatibility."""
    if self.disable_early_shutdown:
        self.shutdown_error_rate = 1.0
    return self
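
The observable effect of this validator can be reproduced without the library. The sketch below uses a plain dataclass as a stand-in: RunConfigSketch is a hypothetical name, and __post_init__ substitutes for the after-validation hook shown above.

```python
from dataclasses import dataclass


@dataclass
class RunConfigSketch:
    """Hypothetical stand-in for RunConfig with only the two shutdown fields."""

    disable_early_shutdown: bool = False
    shutdown_error_rate: float = 0.5

    def __post_init__(self) -> None:
        # Same rule as normalize_shutdown_settings: disabling early shutdown
        # overrides whatever error rate the caller supplied.
        if self.disable_early_shutdown:
            self.shutdown_error_rate = 1.0


# An explicitly supplied rate is overwritten once early shutdown is disabled.
overridden = RunConfigSketch(disable_early_shutdown=True, shutdown_error_rate=0.2)
untouched = RunConfigSketch(shutdown_error_rate=0.2)
```

Here overridden.shutdown_error_rate ends up as 1.0, while untouched.shutdown_error_rate stays at 0.2.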

ThrottleConfig

Bases: ConfigBase

AIMD throttle tuning parameters for adaptive concurrency control.

These knobs configure the ThrottleManager that wraps every outbound model HTTP request. The defaults are conservative and suitable for most workloads; override only when you understand the trade-offs.

Attributes:

Name Type Description
reduce_factor float

Multiplicative decrease factor applied to the per-domain concurrency limit on a 429 / rate-limit signal. Must be in (0, 1). Default is 0.75 (reduce by 25% on rate-limit).

additive_increase int

Additive increase step applied after every success_window consecutive successes. Default is 1.

success_window int

Number of consecutive successful releases before the additive increase is applied. Default is 25.

cooldown_seconds float

Default cooldown duration (seconds) applied after a rate-limit when the provider does not include a Retry-After header. Default is 2.0.

ceiling_overshoot float

Fraction above the observed rate-limit ceiling that additive increase is allowed to probe before capping. Default is 0.10 (10% overshoot).
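
The way these parameters combine in an additive-increase/multiplicative-decrease (AIMD) loop can be sketched as follows. This is not the library's ThrottleManager: the class AimdLimit, its method names, the max_limit field, and the rounding choices are all hypothetical, and only the parameter names and defaults come from the table above.

```python
import math
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AimdLimit:
    """Hypothetical AIMD concurrency limit for a single domain."""

    limit: int
    max_limit: int  # assumed hard upper bound; not a documented parameter
    reduce_factor: float = 0.75
    additive_increase: int = 1
    success_window: int = 25
    cooldown_seconds: float = 2.0
    ceiling_overshoot: float = 0.10
    observed_ceiling: Optional[int] = None
    streak: int = field(default=0, init=False)

    def on_rate_limit(self, retry_after: Optional[float] = None) -> float:
        """Shrink the limit multiplicatively and return the cooldown to wait."""
        self.observed_ceiling = self.limit
        # Assumption: round down, but never drop below one in-flight request.
        self.limit = max(1, math.floor(self.limit * self.reduce_factor))
        self.streak = 0
        # Fall back to cooldown_seconds when no Retry-After value was provided.
        return retry_after if retry_after is not None else self.cooldown_seconds

    def on_success(self) -> None:
        """Count a success and grow the limit once per success_window."""
        self.streak += 1
        if self.streak < self.success_window:
            return
        self.streak = 0
        cap = self.max_limit
        if self.observed_ceiling is not None:
            # Probe at most ceiling_overshoot above the last observed ceiling.
            probe_cap = round(self.observed_ceiling * (1 + self.ceiling_overshoot))
            cap = min(cap, probe_cap)
        self.limit = min(cap, self.limit + self.additive_increase)
```

Starting from a limit of 8, one rate-limit signal drops the limit to 6 in this sketch. Each run of 25 consecutive successes then adds 1 until the limit reaches 9, which is the observed ceiling of 8 plus the 10% overshoot, rounded.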