base
Base classes and contract for writing preflight checks.
Overview
This module defines the PreflightCheck ABC and its four stage-specific
subclasses, plus the IssueCollector accumulator that check
implementations append findings to. For plugin authors, this is the
single import for the complete check-authoring surface.
The Generic[C] pattern
PreflightCheck is parameterised by a context view type C::

    class PreflightCheck(ABC, Generic[C]):
        def check(self, ctx: C, collector: IssueCollector) -> None: ...

Each of the four stage ABCs binds C to a concrete frozen dataclass
from types.py:
+------------------+--------------------+-------------------------------+
| Stage ABC        | C bound to         | Fields in ctx                 |
+==================+====================+===============================+
| ConfigCheck      | ConfigView         | config                        |
+------------------+--------------------+-------------------------------+
| DataFrameCheck   | DataFrameView      | config, data                  |
+------------------+--------------------+-------------------------------+
| MetadataCheck    | MetadataView       | config, data, metadata        |
+------------------+--------------------+-------------------------------+
| AdvisoryCheck    | DataFrameView      | config, data                  |
+------------------+--------------------+-------------------------------+
The purpose is type-safety without runtime overhead: the orchestrator
always builds a single full PreflightContext and passes it to
run(). run() calls _narrow(ctx) -- implemented once per
stage ABC -- which constructs the appropriate view by slicing only the
fields that stage is allowed to touch. check() then receives that
narrowed view. If a ConfigCheck author accidentally writes
ctx.data, the type checker flags it immediately; at runtime the view
object has no such attribute, so the access also raises
AttributeError.
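The narrowing flow described above can be sketched with stand-in classes. This is a minimal, self-contained illustration under stated assumptions, not the module's actual code: PreflightContext and ConfigView are reduced to plain dict fields, a list of strings stands in for IssueCollector, and EpochsPositive and LeakyCheck are hypothetical checks invented for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Generic, TypeVar


@dataclass(frozen=True)
class PreflightContext:
    """Stand-in for the full context built by the orchestrator."""
    config: dict[str, Any]
    data: list[dict[str, Any]]
    metadata: dict[str, Any]


@dataclass(frozen=True)
class ConfigView:
    """Narrowed view that exposes only ``config``."""
    config: dict[str, Any]


C = TypeVar("C")


class PreflightCheck(ABC, Generic[C]):
    def run(self, ctx: PreflightContext) -> list[str]:
        findings: list[str] = []  # simplified stand-in for IssueCollector
        self.check(self._narrow(ctx), findings)
        return findings

    @abstractmethod
    def _narrow(self, ctx: PreflightContext) -> C: ...

    @abstractmethod
    def check(self, ctx: C, collector: list[str]) -> None: ...


class ConfigCheck(PreflightCheck[ConfigView]):
    # Implemented once per stage ABC: copy only the permitted fields.
    def _narrow(self, ctx: PreflightContext) -> ConfigView:
        return ConfigView(config=ctx.config)


class EpochsPositive(ConfigCheck):  # hypothetical plugin check
    def check(self, ctx: ConfigView, collector: list[str]) -> None:
        if ctx.config.get("epochs", 1) <= 0:
            collector.append("epochs must be positive")


class LeakyCheck(ConfigCheck):  # hypothetical check that oversteps its stage
    def check(self, ctx: ConfigView, collector: list[str]) -> None:
        ctx.data  # rejected by a type checker; AttributeError at runtime


full = PreflightContext(config={"epochs": 0}, data=[], metadata={})
issues = EpochsPositive().run(full)

try:
    LeakyCheck().run(full)
    leaked = True
except AttributeError:
    leaked = False
```

In this sketch the check author never sees the full context: run() owns the narrowing, so the only way to reach ctx.data from a ConfigCheck is to bypass the contract deliberately.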
Why frozen dataclasses instead of Protocols
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Structural Protocol views were considered and rejected. A Protocol
would allow any object with the right attributes to satisfy the
constraint, which is useful for tests but hides the "this view was
constructed by narrowing" intent. Frozen dataclasses enforce that views
are always produced by _narrow(), keeping the narrowing path
explicit and auditable.
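The trade-off can be illustrated with the standard library alone. The sketch below assumes a one-field view and a hypothetical ConfigViewProtocol counterpart; neither is taken from types.py. It shows that a structural Protocol accepts any look-alike object, while the nominal frozen dataclass accepts only real instances and refuses mutation.

```python
from dataclasses import FrozenInstanceError, dataclass
from types import SimpleNamespace
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ConfigViewProtocol(Protocol):
    """Hypothetical structural alternative: anything with ``config`` fits."""
    config: Any


@dataclass(frozen=True)
class ConfigView:
    """Nominal view: only objects built as ConfigView qualify."""
    config: Any


# An unrelated object that happens to carry the right attribute (and more).
lookalike = SimpleNamespace(config={"epochs": 3}, data=["extra field"])
view = ConfigView(config={"epochs": 3})

protocol_accepts_lookalike = isinstance(lookalike, ConfigViewProtocol)
dataclass_accepts_lookalike = isinstance(lookalike, ConfigView)

# Frozen dataclasses also reject attribute assignment after construction.
try:
    view.config = {}
    mutated = True
except FrozenInstanceError:
    mutated = False
```

The look-alike passes the Protocol check even though it still carries a data attribute, which is exactly the kind of un-narrowed object the stage contract is meant to keep out.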
enabled() vs check()
~~~~~~~~~~~~~~~~~~~~~
enabled(self, ctx: PreflightContext) always receives the full context
because it runs before the stage dispatch -- its job is to decide whether
the check should execute based on config, and it may need fields that the
stage's view does not expose. Only check() receives the narrowed view.
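A small sketch of the split between the two methods, using stand-ins rather than the real classes. The Config and PreflightSettings dataclasses, the dispatch helper, and the NeedsMetadata check are all assumptions made for illustration; only the names enabled, check, run, and disabled_checks come from this page.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class PreflightSettings:
    disabled_checks: tuple[str, ...] = ()


@dataclass(frozen=True)
class Config:
    preflight: PreflightSettings = field(default_factory=PreflightSettings)


@dataclass(frozen=True)
class PreflightContext:
    config: Config
    data: Optional[list[Any]] = None
    metadata: Optional[dict[str, Any]] = None


@dataclass(frozen=True)
class ConfigView:
    config: Config


class ConfigCheck:
    """Simplified stand-in for the stage ABC."""
    name = "acme.example"

    def enabled(self, ctx: PreflightContext) -> bool:
        # Default behaviour: honour the disabled list in the config.
        return self.name not in ctx.config.preflight.disabled_checks

    def run(self, ctx: PreflightContext) -> list[str]:
        findings: list[str] = []
        self.check(ConfigView(config=ctx.config), findings)
        return findings

    def check(self, ctx: ConfigView, collector: list[str]) -> None:
        raise NotImplementedError


class NeedsMetadata(ConfigCheck):  # hypothetical plugin check
    name = "acme.needs_metadata"

    def enabled(self, ctx: PreflightContext) -> bool:
        # ctx.metadata is visible here even though ConfigView hides it.
        return super().enabled(ctx) and ctx.metadata is not None

    def check(self, ctx: ConfigView, collector: list[str]) -> None:
        collector.append("checked config")


def dispatch(check: ConfigCheck, ctx: PreflightContext) -> Optional[list[str]]:
    """Return the findings, or None when the check was skipped."""
    return check.run(ctx) if check.enabled(ctx) else None


check = NeedsMetadata()
without_metadata = dispatch(check, PreflightContext(config=Config()))
with_metadata = dispatch(check, PreflightContext(config=Config(), metadata={"rows": 10}))
disabled = dispatch(
    check,
    PreflightContext(
        config=Config(PreflightSettings(disabled_checks=("acme.needs_metadata",))),
        metadata={"rows": 10},
    ),
)
```

Calling super().enabled(ctx) first keeps the opt-out list working for checks that add their own skip logic.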
What plugin authors need to do
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Subclass a stage ABC, set name and label, and implement
check(self, ctx: <ViewType>, collector). Do not override
run() or _narrow(); the stage ABC handles both.
Classes:
| Name | Description |
|---|---|
| IssueCollector | Mutable accumulator for issues emitted by a single check run. |
| PreflightCheck | Base class for pre-flight validation checks. |
| ConfigCheck | Check that only needs the resolved config. |
| DataFrameCheck | Check that needs the training DataFrame and config. |
| MetadataCheck | Check that needs data, config, and model metadata. |
| AdvisoryCheck | Advisory data-quality check that needs data and config. |
IssueCollector(check_name, _issues=list())
dataclass
Mutable accumulator for issues emitted by a single check run.
Created by PreflightCheck.run and passed to check(ctx, collector).
Plugin authors call collector.error / collector.warning inside
their check implementation; the orchestrator reads collector.issues
after the call returns.
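The accumulator pattern might look roughly like the following. The method signatures error(code, message) and warning(code, message), the PreflightIssue fields, and the tuple returned by issues are assumptions chosen for the sketch; this page only confirms the check_name attribute, the _issues field, and the existence of error, warning, and issues.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PreflightIssue:
    """Stand-in issue record; the field names are assumed."""
    check_name: str
    severity: str
    code: str
    message: str


@dataclass
class IssueCollector:
    check_name: str
    _issues: list[PreflightIssue] = field(default_factory=list)

    def error(self, code: str, message: str) -> None:
        # Every issue is stamped with the owning check's name.
        self._issues.append(PreflightIssue(self.check_name, "error", code, message))

    def warning(self, code: str, message: str) -> None:
        self._issues.append(PreflightIssue(self.check_name, "warning", code, message))

    @property
    def issues(self) -> tuple[PreflightIssue, ...]:
        # Read-only snapshot for the caller.
        return tuple(self._issues)


# Inside a hypothetical check(ctx, collector) implementation:
collector = IssueCollector(check_name="acme.columns")
collector.error("acme.columns.empty", "training data has no rows")
collector.warning("acme.columns.sparse", "column 'age' is mostly null")

severities = [issue.severity for issue in collector.issues]
owners = {issue.check_name for issue in collector.issues}
```

Because the collector stamps check_name itself, a check implementation only supplies the finding, never its own identity.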
Attributes:
| Name | Type | Description |
|---|---|---|
| check_name | str | Fully-qualified name of the owning check (stamped on every issue for traceability back to the registry). |
Methods:
| Name | Description |
|---|---|
| error | Emit an error-severity issue. |
| warning | Emit a warning-severity issue. |
PreflightCheck
Bases: ABC, Generic[C]
Base class for pre-flight validation checks.
Lifecycle: subclass -> set class attrs (name, label, requires) ->
instantiate -> register via register_preflight_check -> _run_registry
calls run(ctx) which narrows the context via _narrow() then
delegates to the stage-specific check() method.
Subclasses must not override run() or _narrow() -- override
check() instead. The stage subclass (ConfigCheck,
DataFrameCheck, etc.) binds the generic parameter C to the
appropriate view type and implements _narrow() so that check()
receives only the fields it is allowed to access.
Writing a plugin check
- Subclass one of the stage-specific ABCs (ConfigCheck, DataFrameCheck, MetadataCheck, AdvisoryCheck).
- Define name, label, and stage class attributes. The first dotted segment of name must not match a reserved core namespace (see _CORE_NAMESPACES); this is enforced at registration time by register_preflight_check.
- Keep __preflight_api_version__ at a value in _SUPPORTED_PREFLIGHT_API_VERSIONS (currently {1}).
- Implement check(self, ctx, collector) where ctx is the view type for your stage (e.g. ConfigView for ConfigCheck).
- Register an instance with register_preflight_check(MyCheck()) before run_preflight is invoked.
- Opt out of a run by adding the check's name to config.preflight.disabled_checks.
- Uncaught exceptions from enabled(), run(), or check() are reported as a synthetic PreflightIssue with code preflight.check_crash and do not halt the remaining registry.
Methods:
| Name | Description |
|---|---|
| run | Execute this check and return any issues it collected. |
| check | Perform the check, appending any findings to collector. |
| enabled | Whether this check should execute for ctx. |
run(ctx)
Execute this check and return any issues it collected.
Subclasses implement check(ctx, collector) instead of
overriding run. The base implementation narrows the context
via _narrow(), wires an IssueCollector, and returns its
accumulated issues.
Source code in src/nemo_safe_synthesizer/preflight/base.py
check(ctx, collector)
abstractmethod
Perform the check, appending any findings to collector.
ctx is the stage-specific view produced by _narrow():
- ConfigCheck → ConfigView (ctx.config only)
- DataFrameCheck / AdvisoryCheck → DataFrameView (ctx.config + ctx.data)
- MetadataCheck → MetadataView (all three fields)
enabled(ctx)
Whether this check should execute for ctx.
The default implementation honors
ctx.config.preflight.disabled_checks. Override to add
declarative skip logic based on config state.
Note
enabled() always receives the full PreflightContext
(not a narrowed view) because it runs before the stage
dispatch and needs access to config regardless of stage.
ConfigCheck
Bases: PreflightCheck[ConfigView]
Check that only needs the resolved config.
Concrete subclasses implement check(self, ctx: ConfigView, collector)
and may access only ctx.config. Accessing ctx.data or
ctx.metadata is a type error.
DataFrameCheck
Bases: PreflightCheck[DataFrameView]
Check that needs the training DataFrame and config.
Concrete subclasses implement check(self, ctx: DataFrameView, collector)
and may access ctx.config and ctx.data. Accessing
ctx.metadata is a type error.
MetadataCheck
Bases: PreflightCheck[MetadataView]
Check that needs data, config, and model metadata.
Concrete subclasses implement check(self, ctx: MetadataView, collector)
and may access all three fields: ctx.config, ctx.data, and
ctx.metadata.
AdvisoryCheck
Bases: PreflightCheck[DataFrameView]
Advisory data-quality check that needs data and config.
Uses the ADVISORY stage. Concrete subclasses implement
check(self, ctx: DataFrameView, collector) and may access
ctx.config and ctx.data.
_run_registry skips the errored_checks bookkeeping for
advisory checks, so errors they emit are surfaced in the report but
never gate downstream checks via requires.
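One plausible reading of this bookkeeping, written as a toy loop. The Check record, its emits_error flag, and this run_registry function are invented for the sketch and are not the real _run_registry; the only behaviour taken from this page is that errors from non-advisory checks gate dependents through requires while advisory errors are reported without gating.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    """Toy check record; emits_error simulates the outcome of check()."""
    name: str
    advisory: bool = False
    requires: tuple[str, ...] = ()
    emits_error: bool = False


def run_registry(checks: list[Check]) -> tuple[list[str], list[str]]:
    report: list[str] = []
    skipped: list[str] = []
    errored_checks: set[str] = set()
    for check in checks:
        # Skip a check when any prerequisite has recorded an error.
        if errored_checks.intersection(check.requires):
            skipped.append(check.name)
            continue
        if check.emits_error:
            report.append(f"{check.name}: error")
            # Advisory checks are reported but never recorded as errored.
            if not check.advisory:
                errored_checks.add(check.name)
    return report, skipped


checks = [
    Check("acme.schema", emits_error=True),
    Check("acme.after_schema", requires=("acme.schema",), emits_error=True),
    Check("acme.quality", advisory=True, emits_error=True),
    Check("acme.after_quality", requires=("acme.quality",)),
]
report, skipped = run_registry(checks)
```

Here acme.after_schema is skipped because its prerequisite errored, while acme.after_quality still runs even though the advisory check it depends on reported an error.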