Models
The models module defines configuration objects for model-based generation. ModelProvider specifies connection and authentication details for custom providers. ModelConfig encapsulates model details, including the model alias, identifier, and inference parameters. Inference parameters control model behavior through settings such as temperature, top_p, and max_tokens, and support both fixed values and distribution-based sampling. The module also includes ImageContext for providing image inputs to multimodal models.
The classes are summarized in the table below and documented individually in the sections that follow.
Classes:
| Name | Description |
|---|---|
| BaseInferenceParams | Base configuration for inference parameters. |
| ChatCompletionInferenceParams | Configuration for LLM inference parameters. |
| DistributionType | Types of distributions for sampling inference parameters. |
| EmbeddingInferenceParams | Configuration for embedding generation parameters. |
| ImageContext | Configuration for providing image context to multimodal models. |
| ImageFormat | Supported image formats for image modality. |
| ManualDistribution | Manual (discrete) distribution for sampling inference parameters. |
| ManualDistributionParams | Parameters for manual distribution sampling. |
| Modality | Supported modality types for multimodal model data. |
| ModalityDataType | Data type formats for multimodal data. |
| ModelConfig | Configuration for a model used for generation. |
| ModelProvider | Configuration for a custom model provider. |
| UniformDistribution | Uniform distribution for sampling inference parameters. |
| UniformDistributionParams | Parameters for uniform distribution sampling. |
BaseInferenceParams
Bases: ConfigBase, ABC
Base configuration for inference parameters.
Attributes:
| Name | Type | Description |
|---|---|---|
| generation_type | `GenerationType` | Type of generation (chat-completion or embedding). Acts as discriminator. |
| max_parallel_requests | `int` | Maximum number of parallel requests to the model API. |
| timeout | `int \| None` | Timeout in seconds for each request. |
| extra_body | `dict[str, Any] \| None` | Additional parameters to pass to the model API. |
Methods:
| Name | Description |
|---|---|
| format_for_display | Format inference parameters for display. |
generate_kwargs
property
Get the generate kwargs for the inference parameters.
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | A dictionary of the generate kwargs. |
format_for_display()
Format inference parameters for display.
Returns:
| Type | Description |
|---|---|
| `str` | Formatted string of inference parameters. |
Source code in src/data_designer/config/models.py
ChatCompletionInferenceParams
Bases: BaseInferenceParams
Configuration for LLM inference parameters.
Attributes:
| Name | Type | Description |
|---|---|---|
| generation_type | `Literal[CHAT_COMPLETION]` | Type of generation, always "chat-completion" for this class. |
| temperature | `float \| DistributionT \| None` | Sampling temperature (0.0-2.0). Can be a fixed value or a distribution for dynamic sampling. |
| top_p | `float \| DistributionT \| None` | Nucleus sampling probability (0.0-1.0). Can be a fixed value or a distribution for dynamic sampling. |
| max_tokens | `int \| None` | Maximum number of tokens to generate in the response. |
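To illustrate the "fixed value or distribution" behavior of temperature and top_p, here is a minimal sketch built from stand-in code rather than the library's implementation. The `FixedPool` class and the `resolve` and `build_kwargs` helpers are hypothetical names; the only assumption taken from this page is that a distribution exposes a `sample()` method returning a float.

```python
import random
from typing import Any, Optional


class FixedPool:
    """Hypothetical stand-in distribution: picks one of a few candidate values."""

    def __init__(self, values: list, seed: int = 0) -> None:
        self.values = values
        self._rng = random.Random(seed)

    def sample(self) -> float:
        return self._rng.choice(self.values)


def resolve(value: Any) -> Any:
    """Sample the value if it looks like a distribution, else pass it through."""
    sample = getattr(value, "sample", None)
    return sample() if callable(sample) else value


def build_kwargs(
    temperature: Any = None,
    top_p: Any = None,
    max_tokens: Optional[int] = None,
) -> dict:
    """Resolve each parameter and omit the ones left unset (None)."""
    raw = {"temperature": temperature, "top_p": top_p, "max_tokens": max_tokens}
    return {name: resolve(value) for name, value in raw.items() if value is not None}


# temperature is drawn per call; top_p stays fixed; max_tokens is omitted.
kwargs = build_kwargs(temperature=FixedPool([0.2, 0.7, 1.0]), top_p=0.9)
```

Resolving at call time means each request can see a different temperature, while fixed parameters stay constant across requests.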
DistributionType
Bases: str, Enum
Types of distributions for sampling inference parameters.
EmbeddingInferenceParams
Bases: BaseInferenceParams
Configuration for embedding generation parameters.
Attributes:
| Name | Type | Description |
|---|---|---|
| generation_type | `Literal[EMBEDDING]` | Type of generation, always "embedding" for this class. |
| encoding_format | `Literal['float', 'base64']` | Format of the embedding encoding ("float" or "base64"). |
| dimensions | `int \| None` | Number of dimensions for the embedding. |
ImageContext
Bases: ModalityContext
Configuration for providing image context to multimodal models.
Attributes:
| Name | Type | Description |
|---|---|---|
| modality | `Modality` | The modality type (always "image"). |
| column_name | `str` | Name of the column containing image data. |
| data_type | `ModalityDataType` | Format of the image data ("url" or "base64"). |
| image_format | `ImageFormat \| None` | Image format (required for base64 data). |
Methods:
| Name | Description |
|---|---|
| get_context | Get the context for the image modality. |
get_context(record)
Get the context for the image modality.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| record | `dict` | The record containing the image data. | required |
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | The context for the image modality. |
Source code in src/data_designer/config/models.py
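The exact shape of the returned context is not shown on this page. As a rough sketch only, the function below assumes an OpenAI-style `image_url` content part and a data URL for base64 input; `image_context_sketch` is a hypothetical stand-in, not the library's `get_context`, and the record contents are made up for illustration.

```python
from typing import Any, Optional


def image_context_sketch(
    record: dict,
    column_name: str,
    data_type: str,
    image_format: Optional[str] = None,
) -> dict:
    """Build an OpenAI-style image content part from one record (assumed shape)."""
    value: Any = record[column_name]
    if data_type == "url":
        url = value
    elif data_type == "base64":
        if image_format is None:
            raise ValueError("image_format is required for base64 data")
        # Wrap the raw base64 payload in a data URL so it can travel as a URL.
        url = f"data:image/{image_format};base64,{value}"
    else:
        raise ValueError(f"unsupported data_type: {data_type!r}")
    return {"type": "image_url", "image_url": {"url": url}}


record = {"photo": "iVBORw0KGgo="}  # illustrative base64 fragment
context = image_context_sketch(record, "photo", "base64", image_format="png")
```

The sketch also shows why image_format matters only for base64 data: a URL already identifies its resource, whereas a raw payload needs a media type.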
ImageFormat
Bases: str, Enum
Supported image formats for image modality.
ManualDistribution
Bases: Distribution[ManualDistributionParams]
Manual (discrete) distribution for sampling inference parameters.
Samples from a discrete set of values with optional weights. Useful for testing specific values or creating custom probability distributions for temperature or top_p.
Attributes:
| Name | Type | Description |
|---|---|---|
| distribution_type | `DistributionType \| None` | Type of distribution ("manual"). |
| params | `ManualDistributionParams` | Distribution parameters (values, weights). |
Methods:
| Name | Description |
|---|---|
| sample | Sample a value from the manual distribution. |
sample()
Sample a value from the manual distribution.
Returns:
| Type | Description |
|---|---|
| `float` | A float value sampled from the manual distribution. |
Source code in src/data_designer/config/models.py
ManualDistributionParams
Bases: ConfigBase
Parameters for manual distribution sampling.
Attributes:
| Name | Type | Description |
|---|---|---|
| values | `list[float]` | List of possible values to sample from. |
| weights | `list[float] \| None` | Optional list of weights for each value. If not provided, all values have equal probability. |
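Weighted discrete sampling of this kind can be sketched with the standard library alone. The `sample_manual` function below is a hypothetical stand-in that mirrors the `values` and `weights` fields; it is not the library's `sample()` method, whose source is not shown here.

```python
import random
from typing import Optional


def sample_manual(
    values: list,
    weights: Optional[list] = None,
    rng: Optional[random.Random] = None,
) -> float:
    """Draw one value; with weights=None every value is equally likely."""
    if weights is not None and len(weights) != len(values):
        raise ValueError("weights must have one entry per value")
    if rng is None:
        rng = random.Random()
    # random.choices treats weights=None as uniform over the population.
    return rng.choices(values, weights=weights, k=1)[0]


rng = random.Random(42)
# All of the weight sits on 1.0, so every draw returns 1.0.
draws = [sample_manual([0.2, 0.7, 1.0], weights=[0.0, 0.0, 1.0], rng=rng) for _ in range(5)]
```

Weights here are relative, so `[1, 1, 2]` and `[0.25, 0.25, 0.5]` describe the same distribution in this sketch.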
Modality
Bases: str, Enum
Supported modality types for multimodal model data.
ModalityDataType
Bases: str, Enum
Data type formats for multimodal data.
ModelConfig
Bases: ConfigBase
Configuration for a model used for generation.
Attributes:
| Name | Type | Description |
|---|---|---|
| alias | `str` | User-defined alias to reference in column configurations. |
| model | `str` | Model identifier (e.g., from build.nvidia.com or other providers). |
| inference_parameters | `InferenceParamsT` | Inference parameters for the model (temperature, top_p, max_tokens, etc.). The generation_type is determined by the type of inference_parameters. |
| provider | `str \| None` | Optional model provider name if using custom providers. |
generation_type
property
Get the generation type from the inference parameters.
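The relationship between `inference_parameters` and `generation_type` can be sketched with plain dataclasses. Everything below is a stand-in written for illustration: the `*Sketch` classes are hypothetical, the model identifier is a placeholder, and the real classes derive from ConfigBase rather than `dataclass`.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class ChatParamsSketch:
    temperature: Optional[float] = None
    generation_type: str = "chat-completion"


@dataclass
class EmbeddingParamsSketch:
    encoding_format: str = "float"
    generation_type: str = "embedding"


@dataclass
class ModelConfigSketch:
    alias: str
    model: str
    inference_parameters: Union[ChatParamsSketch, EmbeddingParamsSketch]
    provider: Optional[str] = None

    @property
    def generation_type(self) -> str:
        # Delegate to the parameters object so the two can never disagree.
        return self.inference_parameters.generation_type


embedder = ModelConfigSketch(
    alias="embedder",
    model="example/embedding-model",  # placeholder identifier
    inference_parameters=EmbeddingParamsSketch(),
)
```

Deriving the property from the parameters object, instead of storing it twice, keeps the configuration from drifting out of sync.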
ModelProvider
Bases: ConfigBase
Configuration for a custom model provider.
Attributes:
| Name | Type | Description |
|---|---|---|
| name | `str` | Name of the model provider. |
| endpoint | `str` | API endpoint URL for the provider. |
| provider_type | `str` | Provider type (default: "openai"). Determines the API format to use. |
| api_key | `str \| None` | Optional API key for authentication. |
| extra_body | `dict[str, Any] \| None` | Additional parameters to pass in API requests. |
| extra_headers | `dict[str, str] \| None` | Additional headers to pass in API requests. |
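How `api_key` and `extra_headers` end up in a request is not documented on this page. The sketch below assumes an OpenAI-compatible endpoint that expects bearer-token authentication; `build_headers` is a hypothetical helper, and the key and header values are placeholders.

```python
from typing import Optional


def build_headers(api_key: Optional[str], extra_headers: Optional[dict] = None) -> dict:
    """Merge bearer auth (when a key is set) with any extra headers."""
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        headers["Authorization"] = f"Bearer {api_key}"
    # Extra headers are applied last, so they override the defaults above.
    headers.update(extra_headers or {})
    return headers


headers = build_headers("sk-example", {"X-Request-Source": "docs"})
```

Leaving `api_key` as None simply omits the Authorization header in this sketch, which suits endpoints that need no authentication.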
UniformDistribution
Bases: Distribution[UniformDistributionParams]
Uniform distribution for sampling inference parameters.
Samples values uniformly between low and high bounds. Useful for exploring a continuous range of values for temperature or top_p.
Attributes:
| Name | Type | Description |
|---|---|---|
| distribution_type | `DistributionType \| None` | Type of distribution ("uniform"). |
| params | `UniformDistributionParams` | Distribution parameters (low, high). |
Methods:
| Name | Description |
|---|---|
| sample | Sample a value from the uniform distribution. |
sample()
Sample a value from the uniform distribution.
Returns:
| Type | Description |
|---|---|
| `float` | A float value sampled from the uniform distribution. |
Source code in src/data_designer/config/models.py
UniformDistributionParams
Bases: ConfigBase
Parameters for uniform distribution sampling.
Attributes:
| Name | Type | Description |
|---|---|---|
| low | `float` | Lower bound (inclusive). |
| high | `float` | Upper bound (exclusive). |
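A half-open uniform draw matching the inclusive `low` and exclusive `high` bounds can be sketched as follows. `sample_uniform` is a hypothetical stand-in using the standard library; how the library itself draws the sample is not shown on this page.

```python
import random
from typing import Optional


def sample_uniform(low: float, high: float, rng: Optional[random.Random] = None) -> float:
    """Draw one value from the range starting at low and ending before high."""
    if low >= high:
        raise ValueError("low must be strictly less than high")
    if rng is None:
        rng = random.Random()
    # rng.random() lies in [0.0, 1.0), so the result lies in [low, high),
    # up to floating-point rounding at the upper edge.
    return low + (high - low) * rng.random()


rng = random.Random(7)
temperatures = [sample_uniform(0.5, 1.0, rng) for _ in range(1000)]
```

Passing a seeded `random.Random` makes the draws reproducible, which helps when comparing runs that sample temperature or top_p.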