data
Classes:
| Name | Description |
|---|---|
| DataParameters | Configuration for grouping, ordering, and splitting input data for training and evaluation. |
DataParameters
pydantic-model
Bases: Parameters
Configuration for grouping, ordering, and splitting input data for training and evaluation.
Fields:
- group_training_examples_by (str | None)
- order_training_examples_by (str | None)
- max_sequences_per_example (OptionalAutoInt)
- holdout (float)
- max_holdout (int)
- random_state (int | None)
Validators:
- set_random_state_if_none
group_training_examples_by
pydantic-field
Column to group training examples by. This is useful when you want the model to learn inter-record correlations for a given grouping of records.
order_training_examples_by
pydantic-field
Column to order training examples by. This is useful when you want the model to learn sequential relationships for a given ordering of records. If you provide this parameter, you must also provide group_training_examples_by.
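To make the grouping and ordering behavior concrete, here is a minimal sketch of how records might be bundled into per-group, ordered training examples. It is an illustration only, not the library's implementation: `build_grouped_examples`, the sample column names, and the rule that ungrouped records each form their own example are all assumptions. The only behavior taken from the text above is that ordering requires a grouping column.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Any


def build_grouped_examples(
    records: list[dict[str, Any]],
    group_by: str | None = None,
    order_by: str | None = None,
) -> list[list[dict[str, Any]]]:
    """Hypothetical helper: bundle records into per-group, ordered examples."""
    # Documented constraint: ordering is only valid together with grouping.
    if order_by is not None and group_by is None:
        raise ValueError("order_by requires group_by to be set")

    # Assumption: without grouping, every record is its own example.
    if group_by is None:
        return [[record] for record in records]

    # Collect records that share the same value in the grouping column.
    groups: dict[Any, list[dict[str, Any]]] = defaultdict(list)
    for record in records:
        groups[record[group_by]].append(record)

    examples = []
    for group_records in groups.values():
        if order_by is not None:
            # Sort within each group so sequential structure is preserved.
            group_records = sorted(group_records, key=lambda r: r[order_by])
        examples.append(group_records)
    return examples


# Hypothetical sample data: two patients with visits out of order.
records = [
    {"patient_id": "a", "visit": 2, "bp": 120},
    {"patient_id": "b", "visit": 1, "bp": 135},
    {"patient_id": "a", "visit": 1, "bp": 118},
]
examples = build_grouped_examples(records, group_by="patient_id", order_by="visit")
```

In this sketch, `examples` holds one list per `patient_id`, with that patient's records sorted by `visit`.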
max_sequences_per_example
pydantic-field
Maximum number of sequences to add per example. Supports 'auto', which selects 1 if differential privacy (DP) is enabled and 10 otherwise. If not specified or set to 'auto', fills up context. Required for DP to limit the contribution of each example.
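The description of 'auto' can be read more than one way, so the following sketch commits to one reading and should be treated as an assumption rather than the library's logic: an unset value means no cap (the context is filled), 'auto' resolves to 1 with differential privacy and 10 without, and an explicit integer is used as given. The function name `resolve_max_sequences` and the `dp_enabled` flag are hypothetical.

```python
from __future__ import annotations


def resolve_max_sequences(value: int | str | None, dp_enabled: bool) -> int | None:
    """Hypothetical resolver for a max-sequences setting.

    Returns None to mean "no cap, fill up the context".
    """
    # Assumption: an unset value means no explicit cap.
    if value is None:
        return None
    # Taken from the description: 'auto' picks 1 under DP, 10 otherwise.
    if value == "auto":
        return 1 if dp_enabled else 10
    # An explicit positive integer is used as-is (bools are rejected).
    if isinstance(value, int) and not isinstance(value, bool) and value >= 1:
        return value
    raise ValueError(f"unsupported max_sequences_per_example value: {value!r}")


cap_with_dp = resolve_max_sequences("auto", dp_enabled=True)
cap_without_dp = resolve_max_sequences("auto", dp_enabled=False)
```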
holdout
pydantic-field
Number or fraction of records to hold out for evaluation. A float between 0 and 1 is treated as the ratio of records to hold out; an integer greater than 1 is treated as the number of records to hold out. A value of 0 disables the holdout. Must be >= 0.
max_holdout
pydantic-field
Maximum number of records to hold out. Overrides any behavior set by holdout. Must be >= 0.
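The interaction between holdout and max_holdout can be sketched as a small function that turns both settings into a record count. This is a hedged illustration, not the library's code: the name `resolve_holdout_count`, rounding down for ratios, treating a value of exactly 1 as a count, and clamping to the dataset size are assumptions that the descriptions above do not specify.

```python
def resolve_holdout_count(n_records: int, holdout: float, max_holdout: int) -> int:
    """Hypothetical helper: number of records to hold out for evaluation."""
    if holdout < 0 or max_holdout < 0:
        raise ValueError("holdout and max_holdout must be >= 0")
    # A value of zero disables the holdout entirely.
    if holdout == 0:
        return 0
    if holdout < 1:
        # Ratio mode. Assumption: fractional counts are rounded down.
        count = int(n_records * holdout)
    else:
        # Count mode. Assumption: a value of exactly 1 means one record.
        count = int(holdout)
    # Assumption: never hold out more records than exist.
    count = min(count, n_records)
    # max_holdout caps whatever holdout asked for.
    return min(count, max_holdout)


ratio_count = resolve_holdout_count(1000, 0.25, max_holdout=2000)
capped_count = resolve_holdout_count(1000, 0.25, max_holdout=100)
```

Under these assumptions, a ratio of 0.25 on 1000 records holds out 250 records, and a max_holdout of 100 reduces that to 100.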
random_state
pydantic-field
Random state (seed) used for the holdout split, so that the split is reproducible.
set_random_state_if_none(v)
pydantic-validator
Generate a random state if none was provided.
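As a rough illustration of why a concrete random state matters, the sketch below fills in a seed when none is given and then uses it for a seeded holdout split. The names `ensure_random_state` and `split_holdout`, the seed range, and the shuffle-based split are assumptions chosen for the example; the library's validator and split logic may differ.

```python
from __future__ import annotations

import random


def ensure_random_state(random_state: int | None) -> int:
    """Hypothetical stand-in for a validator that fills in a missing seed."""
    if random_state is None:
        # Assumption: any 32-bit unsigned integer is an acceptable seed.
        return random.randint(0, 2**32 - 1)
    return random_state


def split_holdout(records: list, holdout_count: int, random_state: int) -> tuple[list, list]:
    """Hypothetical seeded split into (training, holdout) records."""
    # A dedicated generator keeps the split independent of global state.
    rng = random.Random(random_state)
    indices = list(range(len(records)))
    rng.shuffle(indices)
    held_out = set(indices[:holdout_count])
    train = [r for i, r in enumerate(records) if i not in held_out]
    test = [r for i, r in enumerate(records) if i in held_out]
    return train, test


seed = ensure_random_state(None)  # generated once, then reused
rows = list(range(20))
first_split = split_holdout(rows, holdout_count=5, random_state=seed)
second_split = split_holdout(rows, holdout_count=5, random_state=seed)
```

Because the seed is resolved once and then stored, repeating the split with the same value yields the same training and holdout sets.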