base

Base classes and contract for writing preflight checks.

Overview

This module defines the PreflightCheck ABC and its four stage-specific subclasses, plus the IssueCollector accumulator that check implementations append findings to. For plugin authors, this is the single import for the complete check-authoring surface.

The Generic[C] pattern

PreflightCheck is parameterised by a context view type C:

class PreflightCheck(ABC, Generic[C]):
    def check(self, ctx: C, collector: IssueCollector) -> None: ...

Each of the four stage ABCs binds C to a concrete frozen dataclass from types.py:

+------------------+--------------------+-------------------------------+
| Stage ABC        | C bound to         | Fields in ctx                 |
+==================+====================+===============================+
| ConfigCheck      | ConfigView         | config                        |
+------------------+--------------------+-------------------------------+
| DataFrameCheck   | DataFrameView      | config, data                  |
+------------------+--------------------+-------------------------------+
| MetadataCheck    | MetadataView       | config, data, metadata        |
+------------------+--------------------+-------------------------------+
| AdvisoryCheck    | DataFrameView      | config, data                  |
+------------------+--------------------+-------------------------------+

The purpose is type-safety without runtime overhead. The orchestrator always builds a single full PreflightContext and passes it to run(). run() then calls _narrow(ctx), which is implemented once per stage ABC and constructs the appropriate view from only the fields that stage is allowed to touch, and passes that narrowed view to check(). If a ConfigCheck author accidentally writes ctx.data, the type-checker flags it immediately; at runtime the view object has no such attribute, so the access raises AttributeError as well.
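
The narrowing path described above can be sketched as follows. This is a minimal illustration, not the module's code: the dataclasses are hypothetical stand-ins for the types in types.py, a plain list stands in for IssueCollector, and EpochsCheck with its "epochs" config key is invented for the example.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Generic, TypeVar


# Hypothetical stand-ins for the real context and view types.
@dataclass(frozen=True)
class PreflightContext:
    config: Any
    data: Any
    metadata: Any


@dataclass(frozen=True)
class ConfigView:
    config: Any


C = TypeVar("C")


class PreflightCheck(ABC, Generic[C]):
    def run(self, ctx: PreflightContext) -> list[str]:
        findings: list[str] = []  # stand-in for IssueCollector
        self.check(self._narrow(ctx), findings)
        return findings

    @abstractmethod
    def _narrow(self, ctx: PreflightContext) -> C: ...

    @abstractmethod
    def check(self, ctx: C, collector: list[str]) -> None: ...


class ConfigCheck(PreflightCheck[ConfigView]):
    def _narrow(self, ctx: PreflightContext) -> ConfigView:
        # Copy across only the field this stage is allowed to touch.
        return ConfigView(config=ctx.config)


class EpochsCheck(ConfigCheck):  # invented example check
    def check(self, ctx: ConfigView, collector: list[str]) -> None:
        if ctx.config["epochs"] < 1:
            collector.append("epochs must be >= 1")
        # Writing ctx.data here is rejected by a type checker and would
        # raise AttributeError at runtime, because ConfigView has no such field.


full = PreflightContext(config={"epochs": 0}, data=[1, 2, 3], metadata={})
findings = EpochsCheck().run(full)
view = EpochsCheck()._narrow(full)
```

The check author only ever sees `view`, which carries `config` and nothing else, even though the orchestrator passed the full context to run().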

Why frozen dataclasses instead of Protocols

Structural Protocol views were considered and rejected. A Protocol would allow any object with the right attributes to satisfy the constraint, which is useful for tests but hides the "this view was constructed by narrowing" intent. Frozen dataclasses enforce that views are always produced by _narrow(), keeping the narrowing path explicit and auditable.
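
The difference between the two options can be seen in a small sketch. ConfigViewProtocol, ConfigView, and FullContext below are hypothetical stand-ins written for this illustration; runtime_checkable is used only to make the structural match observable at runtime, since the real trade-off is mostly visible to a type checker.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ConfigViewProtocol(Protocol):  # the rejected, structural alternative
    config: Any


@dataclass(frozen=True)
class ConfigView:  # nominal view, stand-in for the real one
    config: Any


class FullContext:  # any object that happens to carry a `config` attribute
    def __init__(self) -> None:
        self.config = {"epochs": 3}
        self.data = [1, 2, 3]


full = FullContext()

# Structurally, the un-narrowed object already satisfies the Protocol,
# so nothing forces it through a narrowing step.
structural_ok = isinstance(full, ConfigViewProtocol)

# Nominally, only an explicitly constructed ConfigView qualifies.
nominal_ok = isinstance(full, ConfigView)

view = ConfigView(config=full.config)
try:
    view.config = {}  # frozen dataclasses reject attribute assignment
    mutated = True
except FrozenInstanceError:
    mutated = False
```

With the Protocol, a check could be handed the full context and still reach `data` at runtime; with the dataclass, the only way to obtain a view is to construct one.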

enabled() vs check()

enabled(self, ctx: PreflightContext) always receives the full context because it runs before the stage dispatch -- its job is to decide whether the check should execute based on config, and it may need fields that the stage's view does not expose. Only check() receives the narrowed view.

What plugin authors need to do

Subclass a stage ABC, set name / label, and implement check(self, ctx: <ViewType>, collector). Do not override run() or _narrow(); the stage ABC handles both.

Classes:

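+----------------+---------------------------------------------------------------+
| Name           | Description                                                   |
+================+===============================================================+
| IssueCollector | Mutable accumulator for issues emitted by a single check run. |
+----------------+---------------------------------------------------------------+
| PreflightCheck | Base class for pre-flight validation checks.                  |
+----------------+---------------------------------------------------------------+
| ConfigCheck    | Check that only needs the resolved config.                    |
+----------------+---------------------------------------------------------------+
| DataFrameCheck | Check that needs the training DataFrame and config.           |
+----------------+---------------------------------------------------------------+
| MetadataCheck  | Check that needs data, config, and model metadata.            |
+----------------+---------------------------------------------------------------+
| AdvisoryCheck  | Advisory data-quality check that needs data and config.       |
+----------------+---------------------------------------------------------------+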

IssueCollector(check_name, _issues=list()) dataclass

Mutable accumulator for issues emitted by a single check run.

Created by PreflightCheck.run and passed to check(ctx, collector). Plugin authors call collector.error / collector.warning inside their check implementation; the orchestrator reads collector.issues after the call returns.
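
A short usage sketch of the collector follows. The two classes are simplified stand-ins modelled on the signatures shown on this page; the issue codes and the check name are invented, and returning a copy from the issues property is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PreflightIssue:  # stand-in with the four fields used on this page
    code: str
    severity: str
    check: str
    message: str


@dataclass
class IssueCollector:  # simplified stand-in for the documented class
    check_name: str
    _issues: list = field(default_factory=list)

    def error(self, code: str, message: str) -> None:
        self._issues.append(PreflightIssue(code, "error", self.check_name, message))

    def warning(self, code: str, message: str) -> None:
        self._issues.append(PreflightIssue(code, "warning", self.check_name, message))

    @property
    def issues(self) -> list:
        # Assumption: hand back a copy so callers cannot mutate internal state.
        return list(self._issues)


# What run() does on the author's behalf: create one collector per check run.
collector = IssueCollector(check_name="acme.row_count")

# What a check body does: emit findings as it discovers them.
collector.warning("acme.few_rows", "fewer than 100 rows")
collector.error("acme.no_rows", "the DataFrame is empty")

# What the orchestrator reads afterwards.
severities = [issue.severity for issue in collector.issues]
owners = {issue.check for issue in collector.issues}
```

Every issue is stamped with the owning check's name, so a report entry can be traced back to the check that produced it.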

Attributes:

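+------------+------+------------------------------------------------------------------+
| Name       | Type | Description                                                      |
+============+======+==================================================================+
| check_name | str  | Fully-qualified name of the owning check (stamped on every issue |
|            |      | for traceability back to the registry).                          |
+------------+------+------------------------------------------------------------------+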

Methods:

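+---------+--------------------------------+
| Name    | Description                    |
+=========+================================+
| error   | Emit an error-severity issue.  |
+---------+--------------------------------+
| warning | Emit a warning-severity issue. |
+---------+--------------------------------+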

issues property

All issues accumulated so far, in emission order.

error(code, message)

Emit an error-severity issue.

Source code in src/nemo_safe_synthesizer/preflight/base.py
def error(self, code: str, message: str) -> None:
    """Emit an error-severity issue."""
    self._issues.append(
        PreflightIssue(
            code=code,
            severity="error",
            check=self.check_name,
            message=message,
        )
    )

warning(code, message)

Emit a warning-severity issue.

Source code in src/nemo_safe_synthesizer/preflight/base.py
def warning(self, code: str, message: str) -> None:
    """Emit a warning-severity issue."""
    self._issues.append(
        PreflightIssue(
            code=code,
            severity="warning",
            check=self.check_name,
            message=message,
        )
    )

PreflightCheck

Bases: ABC, Generic[C]

Base class for pre-flight validation checks.

Lifecycle: subclass -> set class attrs (name, label, requires) -> instantiate -> register via register_preflight_check -> _run_registry calls run(ctx) which narrows the context via _narrow() then delegates to the stage-specific check() method.

Subclasses must not override run() or _narrow() -- override check() instead. The stage subclass (ConfigCheck, DataFrameCheck, etc.) binds the generic parameter C to the appropriate view type and implements _narrow() so that check() receives only the fields it is allowed to access.

Writing a plugin check
  • Subclass one of the stage-specific ABCs (ConfigCheck, DataFrameCheck, MetadataCheck, AdvisoryCheck).
  • Define name, label, and stage class attributes. The first dotted segment of name must not match a reserved core namespace (see _CORE_NAMESPACES); this is enforced at registration time by register_preflight_check.
  • Keep __preflight_api_version__ at a value in _SUPPORTED_PREFLIGHT_API_VERSIONS (currently {1}).
  • Implement check(self, ctx, collector) where ctx is the view type for your stage (e.g. ConfigView for ConfigCheck).
  • Register an instance with register_preflight_check(MyCheck()) before run_preflight is invoked.
  • Opt out of a run by adding the check's name to config.preflight.disabled_checks.
  • Uncaught exceptions from enabled(), run(), or check() are reported as a synthetic PreflightIssue with code preflight.check_crash and do not halt the remaining registry.

Methods:

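+---------+---------------------------------------------------------+
| Name    | Description                                             |
+=========+=========================================================+
| run     | Execute this check and return any issues it collected.  |
+---------+---------------------------------------------------------+
| check   | Perform the check, appending any findings to collector. |
+---------+---------------------------------------------------------+
| enabled | Whether this check should execute for ctx.              |
+---------+---------------------------------------------------------+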

run(ctx)

Execute this check and return any issues it collected.

Subclasses implement check(ctx, collector) instead of overriding run. The base implementation narrows the context via _narrow(), wires an IssueCollector, and returns its accumulated issues.

Source code in src/nemo_safe_synthesizer/preflight/base.py
def run(self, ctx: PreflightContext) -> list[PreflightIssue]:
    """Execute this check and return any issues it collected.

    Subclasses implement ``check(ctx, collector)`` instead of
    overriding ``run``. The base implementation narrows the context
    via ``_narrow()``, wires an ``IssueCollector``, and returns its
    accumulated issues.
    """
    collector = IssueCollector(check_name=self.name)
    self.check(self._narrow(ctx), collector)
    return collector.issues

check(ctx, collector) abstractmethod

Perform the check, appending any findings to collector.

ctx is the stage-specific view produced by _narrow():

  • ConfigCheck → ConfigView (ctx.config only)
  • DataFrameCheck / AdvisoryCheck → DataFrameView (ctx.config + ctx.data)
  • MetadataCheck → MetadataView (all three fields)

Source code in src/nemo_safe_synthesizer/preflight/base.py
@abstractmethod
def check(self, ctx: C, collector: IssueCollector) -> None:
    """Perform the check, appending any findings to ``collector``.

    ``ctx`` is the stage-specific view produced by ``_narrow()``:

    - ``ConfigCheck`` → ``ConfigView`` (``ctx.config`` only)
    - ``DataFrameCheck`` / ``AdvisoryCheck`` → ``DataFrameView``
      (``ctx.config`` + ``ctx.data``)
    - ``MetadataCheck`` → ``MetadataView`` (all three fields)
    """
    raise NotImplementedError

enabled(ctx)

Whether this check should execute for ctx.

The default implementation honors ctx.config.preflight.disabled_checks. Override to add declarative skip logic based on config state.

Note

enabled() always receives the full PreflightContext (not a narrowed view) because it runs before the stage dispatch and needs access to config regardless of stage.

Source code in src/nemo_safe_synthesizer/preflight/base.py
def enabled(self, ctx: PreflightContext) -> bool:
    """Whether this check should execute for ``ctx``.

    The default implementation honors
    ``ctx.config.preflight.disabled_checks``. Override to add
    declarative skip logic based on config state.

    Note:
        ``enabled()`` always receives the full ``PreflightContext``
        (not a narrowed view) because it runs before the stage
        dispatch and needs access to config regardless of stage.
    """
    return self.name not in ctx.config.preflight.disabled_checks
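
One way to layer declarative skip logic on top of the default is sketched below. The base class is a stand-in reduced to the enabled() method shown above; HoldoutSizeCheck, the holdout_fraction config field, and the make_ctx helper are hypothetical and exist only for the example.

```python
from types import SimpleNamespace


class PreflightCheck:  # stand-in carrying only the default enabled()
    name = "base"

    def enabled(self, ctx) -> bool:
        return self.name not in ctx.config.preflight.disabled_checks


class HoldoutSizeCheck(PreflightCheck):
    name = "acme.holdout_size"

    def enabled(self, ctx) -> bool:
        # Keep the disabled_checks opt-out, then skip when no holdout is configured.
        return super().enabled(ctx) and ctx.config.holdout_fraction > 0


def make_ctx(disabled, holdout_fraction):
    """Build a fake full context with only the fields this example reads."""
    preflight = SimpleNamespace(disabled_checks=disabled)
    config = SimpleNamespace(preflight=preflight, holdout_fraction=holdout_fraction)
    return SimpleNamespace(config=config, data=None, metadata=None)


check = HoldoutSizeCheck()
runs_normally = check.enabled(make_ctx([], 0.1))
skipped_by_config = check.enabled(make_ctx([], 0.0))
skipped_by_opt_out = check.enabled(make_ctx(["acme.holdout_size"], 0.1))
```

Calling super().enabled(ctx) first keeps the disabled_checks opt-out working for the subclass.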

ConfigCheck

Bases: PreflightCheck[ConfigView]

Check that only needs the resolved config.

Concrete subclasses implement check(self, ctx: ConfigView, collector) and may access only ctx.config. Accessing ctx.data or ctx.metadata is a type error.

DataFrameCheck

Bases: PreflightCheck[DataFrameView]

Check that needs the training DataFrame and config.

Concrete subclasses implement check(self, ctx: DataFrameView, collector) and may access ctx.config and ctx.data. Accessing ctx.metadata is a type error.

MetadataCheck

Bases: PreflightCheck[MetadataView]

Check that needs data, config, and model metadata.

Concrete subclasses implement check(self, ctx: MetadataView, collector) and may access all three fields: ctx.config, ctx.data, and ctx.metadata.

AdvisoryCheck

Bases: PreflightCheck[DataFrameView]

Advisory data-quality check that needs data and config.

Uses the ADVISORY stage. Concrete subclasses implement check(self, ctx: DataFrameView, collector) and may access ctx.config and ctx.data.

_run_registry skips the errored_checks bookkeeping for advisory checks, so errors they emit are surfaced in the report but never gate downstream checks via requires.
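
The gating rule can be illustrated with a toy registry loop. This is a sketch under stated assumptions, not the real _run_registry: it assumes that requires lists the names of prerequisite checks and that a check is skipped when any prerequisite is in errored_checks. StubCheck, run_registry, and all check names are invented.

```python
from dataclasses import dataclass


@dataclass
class StubCheck:  # hypothetical check reduced to the fields the loop needs
    name: str
    advisory: bool = False
    emits_error: bool = False
    requires: tuple = ()


def run_registry(checks):
    report = []             # every emitted issue, advisory or not
    ran = []                # names of checks that actually executed
    errored_checks = set()  # names whose errors gate dependents
    for check in checks:
        if errored_checks.intersection(check.requires):
            continue  # a prerequisite errored, so skip this check
        ran.append(check.name)
        if check.emits_error:
            report.append((check.name, "error"))
            if not check.advisory:
                # Advisory checks never reach this bookkeeping step.
                errored_checks.add(check.name)
    return report, ran


checks = [
    StubCheck("acme.nulls", advisory=True, emits_error=True),
    StubCheck("acme.after_advisory", requires=("acme.nulls",)),
    StubCheck("acme.schema", emits_error=True),
    StubCheck("acme.after_schema", requires=("acme.schema",)),
]
report, ran = run_registry(checks)
```

Both errors appear in the report, but only the non-advisory one prevents its dependent from running.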