labels

Classes:

| Name | Description |
| --- | --- |
| LabelEvaluator | Evaluates labels specified by the user in the config and provides a simple interface for other parts of the code that use that label configuration. |

LabelEvaluator(explicit_labels, label_regexes)

Evaluates labels specified by the user in the config and provides a simple interface for other parts of the code that use that label configuration.

One notable example is expanding wildcards from the label config (e.g. acme/* or *).

Methods:

| Name | Description |
| --- | --- |
| filter_labels | Filters provided list of labels against configured labels and label regexes. |
| create_from_config | Loads labels defined by the user in the config. |

Source code in src/nemo_safe_synthesizer/pii_replacer/ner/labels.py
def __init__(self, explicit_labels: set[str], label_regexes: list[Pattern]):
    self._explicit_labels = normalize_labels(explicit_labels)
    self._label_regexes = label_regexes

filter_labels(labels)

Filters provided list of labels against configured labels and label regexes.

Example::

import re

evaluator = LabelEvaluator(explicit_labels={"test"}, label_regexes=[re.compile(r"^acme/.*$")])
filtered = evaluator.filter_labels(["test", "test_2", "acme/abc", "test/test"])
assert list(filtered) == ["test", "acme/abc"]

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| labels | list[str] | List of labels to be filtered. | required |

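Returns: An iterator over the matching labels, yielded lazily. Labels that match an explicit label come first, followed by labels that match a label regex; a label that matches both is yielded twice.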

Source code in src/nemo_safe_synthesizer/pii_replacer/ner/labels.py
def filter_labels(self, labels: list[str]) -> Iterator[str]:
    """
    Filters provided list of labels against configured labels and label regexes.

    Example::

        import re

        evaluator = LabelEvaluator(explicit_labels={"test"}, label_regexes=[re.compile(r"^acme/.*$")])
        filtered = evaluator.filter_labels(["test", "test_2", "acme/abc", "test/test"])
        assert list(filtered) == ["test", "acme/abc"]

    Args:
        labels: List of labels to be filtered.

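    Returns: An iterator over the matching labels, yielded lazily. Labels that match
        an explicit label come first, followed by labels that match a label regex;
        a label that matches both is yielded twice.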
    """
    # check explicit labels
    for label in labels:
        if normalize_label(label) in self._explicit_labels:
            yield label

    # check wildcard labels
    if self._label_regexes:
        for label in labels:
            if self._matches_any_regex(label):
                yield label
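
The two passes in `filter_labels` have a visible consequence for callers: labels that match an explicit label are yielded first, and a label that matches both an explicit label and a regex is yielded twice. The following is a minimal standalone sketch of that behavior, not the library code. The name `filter_labels_sketch` is hypothetical, `normalize_label` is assumed here to strip and lowercase, and the regex check is assumed to use `Pattern.match`; the real helpers are not shown on this page.

```python
import re
from collections.abc import Iterator
from re import Pattern


def normalize_label(label: str) -> str:
    # Assumption: the real helper is not shown; here it strips and lowercases.
    return label.strip().lower()


def filter_labels_sketch(
    labels: list[str],
    explicit_labels: set[str],
    label_regexes: list[Pattern[str]],
) -> Iterator[str]:
    # First pass: labels that match a configured explicit label.
    for label in labels:
        if normalize_label(label) in explicit_labels:
            yield label
    # Second pass: labels that match any configured regex.
    for label in labels:
        if any(regex.match(label) for regex in label_regexes):
            yield label


explicit = {"test", "acme/abc"}
regexes = [re.compile(r"^acme/.+$", re.IGNORECASE)]

result = list(filter_labels_sketch(["acme/xyz", "test", "acme/abc"], explicit, regexes))
print(result)
# ['test', 'acme/abc', 'acme/xyz', 'acme/abc']
```

Callers that need each label at most once can deduplicate while keeping first-seen order, for example with `list(dict.fromkeys(result))`.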

create_from_config(config_labels) classmethod

Loads labels defined by the user in the config.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| config_labels | list[str] | Labels configured by the users. | required |

Source code in src/nemo_safe_synthesizer/pii_replacer/ner/labels.py
@classmethod
def create_from_config(cls, config_labels: list[str]) -> LabelEvaluator:
    """
    Loads labels defined by the user in the config.

    Args:
        config_labels: Labels configured by the users.
    """
    explicit_labels: set[str] = set()
    label_regexes: list[Pattern] = []

    for label in config_labels:
        if "*" not in label:
            explicit_labels.add(label)
        else:
            # there is a wildcard
            parts = label.split("/")
            if len(parts) == 2:
                namespace, _ = parts
                # match all labels inside a namespace; the namespace is escaped
                # so that characters such as "." are matched literally
                label_regexes.append(re.compile(rf"^{re.escape(namespace)}/.+$", re.IGNORECASE))

            elif len(parts) == 1:
                # match all labels that don't have namespace
                label_regexes.append(re.compile(r"^[^/]+$", re.IGNORECASE))

            else:
                logger.warning(f"Invalid label specification '{label}'. Skipping.")

    return cls(explicit_labels, label_regexes)
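
To make the wildcard rules in `create_from_config` concrete, here is a small standalone sketch of the same translation from a single config label to a regex. The helper name `wildcard_to_regex` is hypothetical and not part of the module, and it escapes the namespace with `re.escape`, which is a choice made for this sketch. It assumes the caller only passes labels that contain `*`.

```python
import re
from re import Pattern
from typing import Optional


def wildcard_to_regex(label: str) -> Optional[Pattern[str]]:
    """Hypothetical helper: translate one wildcard label into a compiled regex."""
    parts = label.split("/")
    if len(parts) == 2:
        # "acme/*": every label inside the "acme" namespace.
        namespace = re.escape(parts[0])
        return re.compile(rf"^{namespace}/.+$", re.IGNORECASE)
    if len(parts) == 1:
        # "*": every label that has no namespace at all.
        return re.compile(r"^[^/]+$", re.IGNORECASE)
    # More than one "/" is treated as an invalid specification.
    return None


namespaced = wildcard_to_regex("acme/*")
bare = wildcard_to_regex("*")

assert namespaced.match("acme/abc")
assert namespaced.match("ACME/abc")      # matching ignores case
assert not namespaced.match("other/abc")
assert bare.match("test")
assert not bare.match("acme/abc")        # namespaced labels are excluded
assert wildcard_to_regex("a/b/*") is None
```

Note that a bare `*` only covers labels without a namespace, so a config that should accept everything in `acme` as well needs `acme/*` listed separately.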