predictor
Classes:
| Name | Description |
|---|---|
| ContextSpan | This class can be used to search for surrounding context given an input string and some start/end offsets within that string. |
| PredictorContext | Base class for an arbitrary context object that can be passed into a predictor. |
| Predictor | Base class for managing an entity prediction. |
ContextSpan(pattern_list, span=DEFAULT_CONTEXT_SPAN_SIZE)
dataclass
This class can be used to search for surrounding context given an input string and some start/end offsets within that string. You create this object by providing a list of discrete strings or regex patterns to match on, and then how far "left" and "right" of the target string to search for these patterns.
In the example below, we search for context to the left and right of a phone number:

    tgt = "Please give me a call at 867-5309"

We can create a ContextSpan that uses the "call" string as context. Here 25 and 33 are the start and end offsets of the phone number within tgt:

    c = ContextSpan(pattern_list=["call"])
    assert c.is_match(tgt, 25, 33)

Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pattern_list | list[str \| Pattern] | A list of strings or regex Patterns to use for matching | required |
| span | int | How many characters left of the start index and right of the end index to search for any matches from the pattern list | DEFAULT_CONTEXT_SPAN_SIZE |
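
The windowed search described above can be illustrated with a small standalone sketch. This is not the library's implementation: `has_context` is a hypothetical helper, the default of 32 characters is an assumed stand-in for DEFAULT_CONTEXT_SPAN_SIZE (whose real value is not shown here), and the actual matching rules of `is_match` may differ.

```python
import re
from typing import List, Pattern, Union


def has_context(
    text: str,
    start: int,
    end: int,
    patterns: List[Union[str, Pattern[str]]],
    span: int = 32,  # assumed value; the real DEFAULT_CONTEXT_SPAN_SIZE is not documented here
) -> bool:
    """Hypothetical helper: look for any pattern near text[start:end]."""
    # Take up to `span` characters on each side of the target.
    left = text[max(0, start - span):start]
    right = text[end:end + span]
    for pat in patterns:
        for window in (left, right):
            if isinstance(pat, str):
                # Plain strings are treated as substring matches.
                if pat in window:
                    return True
            elif pat.search(window):
                # Compiled regex patterns are searched within the window.
                return True
    return False


tgt = "Please give me a call at 867-5309"
print(has_context(tgt, 25, 33, ["call"]))          # True
print(has_context(tgt, 25, 33, ["call"], span=3))  # False: "call" is outside the window
```

A smaller span narrows the window on both sides, so the same pattern list can match or miss depending on how close the context word sits to the target.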
PredictorContext()
dataclass
Bases: ABC
Base class for an arbitrary context object that can be passed into a predictor. Arbitrary contexts can be subclassed from here and passed into the Predictor objects.
This can be useful when predictors should share the same business logic but differ in settings such as contexts.
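
As a rough illustration of that idea, the sketch below defines a local stand-in for `PredictorContext` (assumed here to declare no fields of its own) and a hypothetical `KeywordContext` subclass. The field names `keywords` and `span` are invented for the example and are not part of the documented API.

```python
from abc import ABC
from dataclasses import dataclass, field
from typing import List


@dataclass
class PredictorContext(ABC):
    """Local stand-in for the documented base class (assumed to have no fields)."""


@dataclass
class KeywordContext(PredictorContext):
    """Hypothetical context carrying per-predictor settings."""
    keywords: List[str] = field(default_factory=list)
    span: int = 32  # assumed default, for illustration only


# Two predictors could share one implementation and differ only in context.
phone_ctx = KeywordContext(keywords=["call", "phone"])
ssn_ctx = KeywordContext(keywords=["ssn", "social"], span=16)
```

Keeping the settings in a separate object means the matching logic can be written once and configured per entity type.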
Predictor(name, namespace=None, predictor_context=None)
Bases: ABC
Base class for managing an entity prediction.
Predictors operate at the record level and might be managed via a PredictionPipeline parent class. For an NLP pipeline, this might represent a model. In pattern-based pipelines, a Predictor might represent a single entity matcher, such as one for IP addresses.
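
To make the pattern-based case concrete, here is a minimal sketch of an IP address matcher. It uses a local stand-in for `Predictor` that mirrors only the documented constructor signature, `default_name` attribute, and abstract `evaluate` method. The `IPv4Predictor` class, the assumption that `in_data` is a string, and the `(start, end, text)` return shape are all hypothetical, since the real input and return types are not shown here.

```python
import re
from abc import ABC, abstractmethod
from typing import List, Optional, Tuple


class Predictor(ABC):
    """Local stand-in mirroring the documented signature only."""
    default_name: Optional[str] = None

    def __init__(self, name, namespace=None, predictor_context=None):
        self.name = name
        self.namespace = namespace
        self.predictor_context = predictor_context

    @abstractmethod
    def evaluate(self, in_data):
        """Each concrete predictor must implement this."""


class IPv4Predictor(Predictor):
    """Hypothetical single-entity matcher for dotted-quad addresses."""
    default_name = "ip_address"
    # Loose pattern: four dot-separated groups of 1-3 digits (no range check).
    _pattern = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

    def __init__(self, namespace=None, predictor_context=None):
        super().__init__(self.default_name, namespace, predictor_context)

    def evaluate(self, in_data: str) -> List[Tuple[int, int, str]]:
        # Assumed return shape: (start offset, end offset, matched text).
        return [(m.start(), m.end(), m.group()) for m in self._pattern.finditer(in_data)]


predictor = IPv4Predictor()
matches = predictor.evaluate("Server 10.0.0.12 replied")
print(matches)  # [(7, 16, '10.0.0.12')]
```

Because `evaluate` is abstract, instantiating the base class directly raises a TypeError, which enforces the "MUST be implemented" contract.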
Methods:
| Name | Description |
|---|---|
| evaluate | This MUST be implemented by each Predictor |
| header_has_context | Checks to see if the field has a label match. |
Attributes:
| Name | Type | Description |
|---|---|---|
| default_name | str | Subclasses can set a default name to use that can be directly accessed as a class attr if need be. |
Source code in src/nemo_safe_synthesizer/pii_replacer/ner/predictor.py
default_name = None
class-attribute
instance-attribute
Subclasses can set a default name to use that can be directly accessed as a class attr if need be.
evaluate(in_data)
abstractmethod
header_has_context(field_pair, header_context_source, token_patterns=None, regex_patterns=None)
Checks to see if the field has a label match.