unsloth_backend

Optimized training backend using Unsloth.

Classes:

| Name | Description |
| --- | --- |
| `UnslothTrainer` | Training backend using Unsloth for optimized LLM fine-tuning. |

UnslothTrainer(*args, **kwargs)

Bases: HuggingFaceBackend

Training backend using Unsloth for optimized LLM fine-tuning.

Extends HuggingFaceBackend to leverage Unsloth's optimized training routines, providing faster training speeds and reduced memory usage compared to standard HuggingFace implementations.

In addition to the arguments accepted by the parent class, **kwargs may include:

  • rope_scaling -- RoPE scaling configuration from model metadata.
  • torch_dtype -- Data type for model weights.
  • quantization_config -- Configuration for model quantization.
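These extra keyword arguments might be assembled as follows. The keys come from the list above; the values are illustrative assumptions only, and the commented-out construction requires a CUDA GPU and an Unsloth installation.

```python
# Hypothetical extra kwargs for UnslothTrainer beyond its parent class.
# Keys are documented above; the values shown here are assumptions.
extra_kwargs = {
    "rope_scaling": {"type": "linear", "factor": 2.0},  # RoPE scaling config
    "torch_dtype": "bfloat16",                          # model weight dtype
    "quantization_config": {"load_in_4bit": True},      # quantization setup
}
# trainer = UnslothTrainer(**extra_kwargs)  # needs CUDA + unsloth installed
```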
See Also

HuggingFaceBackend: Parent class providing base training functionality.

Raises:

| Type | Description |
| --- | --- |
| `RuntimeError` | If CUDA is not available. |
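The CUDA guard in the constructor can be illustrated in isolation. The sketch below mirrors the check from the source, but takes the availability flag as a parameter (a stand-in for `torch.cuda.is_available()`, so the example runs without `torch`); `check_cuda` is a hypothetical helper, not part of the library.

```python
def check_cuda(cuda_available: bool) -> None:
    """Raise if no GPU is present, mirroring UnslothTrainer.__init__'s guard."""
    if not cuda_available:
        raise RuntimeError("Cannot use unsloth without GPU.")

try:
    check_cuda(False)
except RuntimeError as e:
    print(e)  # → Cannot use unsloth without GPU.
```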

Methods:

| Name | Description |
| --- | --- |
| `maybe_quantize` | Apply PEFT wrapping via Unsloth's `FastLanguageModel.get_peft_model`. |
| `load_model` | Load a pretrained model using Unsloth's `FastLanguageModel`. |

Source code in `src/nemo_safe_synthesizer/training/unsloth_backend.py`

```python
def __init__(self, *args, **kwargs):
    from unsloth import FastLanguageModel  # ty: ignore[unresolved-import]

    super().__init__(*args, **kwargs)
    self.model_loader_type = FastLanguageModel

    if not torch.cuda.is_available():
        raise RuntimeError("Cannot use unsloth without GPU.")
    self.prepare_config(**kwargs)
    self._update_for_unsloth(**kwargs)
```

maybe_quantize()

Apply PEFT wrapping via Unsloth's FastLanguageModel.get_peft_model.

This method configures and applies Parameter-Efficient Fine-Tuning (PEFT) using Unsloth's optimized implementation. The PEFT wrapping is always applied to ensure the adapter is saved correctly.

Note

Unlike the parent class implementation, this method uses Unsloth's FastLanguageModel.get_peft_model.

Source code in `src/nemo_safe_synthesizer/training/unsloth_backend.py`

```python
def maybe_quantize(self):
    """Apply PEFT wrapping via Unsloth's ``FastLanguageModel.get_peft_model``.

    This method configures and applies Parameter-Efficient Fine-Tuning (PEFT)
    using Unsloth's optimized implementation. The PEFT wrapping is always
    applied to ensure the adapter is saved correctly.

    Note:
        Unlike the parent class implementation, this method uses Unsloth's
        ``FastLanguageModel.get_peft_model``.
    """
    from unsloth import FastLanguageModel  # ty: ignore[unresolved-import]

    self._prepare_quantize_base()
    qparams = self.quant_params.copy()
    # unsloth infers the task type from the model, so we need to remove it
    # from the quant params
    qparams.pop("task_type", None)
    # Always wrap the model as a PEFT model to ensure adapter is saved correctly
    self.model = FastLanguageModel.get_peft_model(self.model, **qparams)
```
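The `task_type`-stripping step can be demonstrated with a plain dict standing in for the real quantization parameters (the keys and values below are illustrative assumptions, not the trainer's actual defaults):

```python
# Sketch of maybe_quantize's parameter filtering, using illustrative values.
quant_params = {
    "r": 16,
    "lora_alpha": 32,
    "task_type": "CAUSAL_LM",  # unsloth infers this from the model
}
qparams = quant_params.copy()
qparams.pop("task_type", None)  # so it must be dropped before the call
# FastLanguageModel.get_peft_model(model, **qparams) would then receive
# only the remaining keys.
print(sorted(qparams))  # → ['lora_alpha', 'r']
```

Copying before popping keeps `self.quant_params` intact for any later use.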

load_model(**model_args)

Load a pretrained model using Unsloth's FastLanguageModel.

Applies a workaround that disables Unsloth's LLAMA32 support check to prevent unnecessary HuggingFace Hub requests, then calls `prepare_config`, `_load_pretrained_model`, and `maybe_quantize` in sequence.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `**model_args` | | Additional keyword arguments for model configuration. | `{}` |
Note

This method applies a workaround that disables Unsloth's LLAMA32 support check to prevent unnecessary HuggingFace Hub requests. See: https://github.com/unslothai/unsloth/blob/main/unsloth/models/loader.py#L235

Source code in `src/nemo_safe_synthesizer/training/unsloth_backend.py`

```python
def load_model(self, **model_args):
    """Load a pretrained model using Unsloth's ``FastLanguageModel``.

    Applies a workaround that disables Unsloth's LLAMA32 support
    check to prevent unnecessary HuggingFace Hub requests, then
    calls :meth:`prepare_config`, :meth:`_load_pretrained_model`,
    and :meth:`maybe_quantize` in sequence.

    Args:
        **model_args: Additional keyword arguments for model configuration.

    Note:
        This method applies a workaround that disables Unsloth's LLAMA32
        support check to prevent unnecessary HuggingFace Hub requests.
        See: https://github.com/unslothai/unsloth/blob/main/unsloth/models/loader.py#L235
    """
    # NOTE: this hack stops unsloth from reaching out to huggingface, see
    # https://github.com/unslothai/unsloth/blob/main/unsloth/models/loader.py#L235
    from unsloth.models import loader  # ty: ignore[unresolved-import]

    loader.SUPPORTS_LLAMA32 = False
    logger.info(f"load_model: Loading model {self.params.training.pretrained_model} with args: {model_args}")

    self.prepare_config(**model_args)
    self._load_pretrained_model(**model_args)

    # maybe_quantize takes no arguments (see its signature above), so it is
    # called without forwarding model_args.
    self.maybe_quantize()
```
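The call order in `load_model` can be sketched with stub methods standing in for the real implementations. The class below is a hypothetical demo, not library code; only the method names are taken from the source above.

```python
# Minimal sketch of load_model's call sequence using recording stubs.
class _LoadOrderDemo:
    def __init__(self):
        self.calls = []

    def prepare_config(self, **kw):
        self.calls.append("prepare_config")

    def _load_pretrained_model(self, **kw):
        self.calls.append("_load_pretrained_model")

    def maybe_quantize(self):
        self.calls.append("maybe_quantize")

    def load_model(self, **model_args):
        # Same three-step sequence as UnslothTrainer.load_model.
        self.prepare_config(**model_args)
        self._load_pretrained_model(**model_args)
        self.maybe_quantize()

demo = _LoadOrderDemo()
demo.load_model()
print(demo.calls)  # → ['prepare_config', '_load_pretrained_model', 'maybe_quantize']
```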