stats

Lightweight running statistics using Welford's online algorithm.

Classes:

Name               Description
Statistics         Container for basic statistical measures.
RunningStatistics  Incrementally compute mean and variance using Welford's online algorithm.

Statistics(count=0, mean=0, sum_sq_diffs=0, min=float('inf'), max=float('-inf')) dataclass

Container for basic statistical measures.

Tracks count, mean, min, max, and the sum of squared differences (for computing sample standard deviation). Intended as a base for RunningStatistics; use that subclass to incrementally accumulate values.

Attributes:

Name          Type         Description
count         int          Number of values observed.
mean          float        Running mean of observed values.
sum_sq_diffs  float        Sum of squared differences from the mean (Welford's M2).
min           int | float  Minimum observed value.
max           int | float  Maximum observed value.
stddev        float        Sample standard deviation, or 0 when fewer than two values have been observed.

count = 0 class-attribute instance-attribute

Number of values observed.

mean = 0 class-attribute instance-attribute

Running mean of observed values.

sum_sq_diffs = 0 class-attribute instance-attribute

Sum of squared differences from the mean (Welford's M2).

min = float('inf') class-attribute instance-attribute

Minimum observed value.

max = float('-inf') class-attribute instance-attribute

Maximum observed value.

stddev property

Sample standard deviation, or 0 when fewer than two values have been observed.
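
The source of the stddev property is not shown on this page. As a minimal sketch, assuming it uses the usual sample formula sqrt(sum_sq_diffs / (count - 1)), the computation could look like the following. The function name sample_stddev is hypothetical and is not part of the module.

```python
import math


def sample_stddev(count: int, sum_sq_diffs: float) -> float:
    """Hypothetical helper: sample standard deviation from Welford's M2.

    Assumption: mirrors the documented behaviour of ``stddev``, which
    returns 0 when fewer than two values have been observed.
    """
    if count < 2:
        return 0.0
    # Bessel's correction: divide by (count - 1), not count.
    return math.sqrt(sum_sq_diffs / (count - 1))


# The values 2, 4, 4, 4, 5, 5, 7, 9 have mean 5 and squared
# differences 9 + 1 + 1 + 1 + 0 + 0 + 4 + 16 = 32.
print(sample_stddev(8, 32.0))
print(sample_stddev(1, 0.0))
```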

RunningStatistics(count=0, mean=0, sum_sq_diffs=0, min=float('inf'), max=float('-inf')) dataclass

Bases: Statistics

Incrementally compute mean and variance using Welford's online algorithm.

Statistics are computed on the fly, without loading the entire dataset into memory. Call update for each new observation; mean, stddev, min, and max can be read at any time.

Methods:

Name    Description
update  Incorporate a new value x into the running statistics.
reset   Reset all statistics to their initial state.

update(x)

Incorporate a new value x into the running statistics.

Source code in src/nemo_safe_synthesizer/data_processing/stats.py
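def update(self, x: int | float) -> None:
    """Incorporate a new value ``x`` into the running statistics."""
    self.count += 1
    self.min = min(self.min, x)
    self.max = max(self.max, x)
    # Welford step 1: move the mean towards x by 1/count of the gap.
    new_mean = self.mean + (x - self.mean) * 1.0 / self.count
    # Welford step 2: despite its name, new_var is the updated sum of
    # squared differences (M2), not a variance. It uses both the old
    # and the new mean.
    new_var = self.sum_sq_diffs + (x - self.mean) * (x - new_mean)
    self.mean, self.sum_sq_diffs = new_mean, new_var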
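
Written as recurrences, the update above corresponds to the following, where n is count after the increment, \mu_n is mean, and M_{2,n} is sum_sq_diffs. The symbols are notation introduced here for illustration. The last line assumes stddev applies the sample (n - 1) normalisation, which matches its description but is not shown in the source on this page.

```latex
\begin{aligned}
\mu_n     &= \mu_{n-1} + \frac{x_n - \mu_{n-1}}{n}, \\
M_{2,n}   &= M_{2,n-1} + (x_n - \mu_{n-1})\,(x_n - \mu_n), \\
s_n       &= \sqrt{\frac{M_{2,n}}{n - 1}} \quad (n \ge 2),
\end{aligned}
\qquad \mu_0 = 0,\; M_{2,0} = 0 .
```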

reset()

Reset all statistics to their initial state.

Source code in src/nemo_safe_synthesizer/data_processing/stats.py
def reset(self) -> None:
    """Reset all statistics to their initial state."""
    self.count = 0
    self.mean = 0
    self.sum_sq_diffs = 0
    self.min = float("inf")
    self.max = float("-inf")
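
To see update and reset working together, here is a self-contained sketch that re-creates the class from the source shown above, so that it runs without the package installed. RunningStatsSketch and summarize are hypothetical names, and the stddev body is an assumption based on its description (sample standard deviation, 0 below two values).

```python
import math
import statistics
from dataclasses import dataclass


@dataclass
class RunningStatsSketch:
    """Hypothetical stand-in for RunningStatistics, for illustration only."""

    count: int = 0
    mean: float = 0
    sum_sq_diffs: float = 0
    min: float = float("inf")
    max: float = float("-inf")

    @property
    def stddev(self) -> float:
        # Assumed implementation: sample formula with (count - 1).
        if self.count < 2:
            return 0.0
        return math.sqrt(self.sum_sq_diffs / (self.count - 1))

    def update(self, x: float) -> None:
        # Same steps as the update source shown above.
        self.count += 1
        self.min = min(self.min, x)
        self.max = max(self.max, x)
        new_mean = self.mean + (x - self.mean) / self.count
        self.sum_sq_diffs += (x - self.mean) * (x - new_mean)
        self.mean = new_mean

    def reset(self) -> None:
        # Same assignments as the reset source shown above.
        self.count = 0
        self.mean = 0
        self.sum_sq_diffs = 0
        self.min = float("inf")
        self.max = float("-inf")


def summarize(values) -> RunningStatsSketch:
    """Feed any iterable through the accumulator one value at a time."""
    stats = RunningStatsSketch()
    for value in values:
        stats.update(value)
    return stats


data = [2, 4, 4, 4, 5, 5, 7, 9]
stats = summarize(data)
# The one-pass result should agree with the two-pass standard library.
print(math.isclose(stats.stddev, statistics.stdev(data)))
stats.reset()  # ready to accumulate a new stream
print(stats.count, stats.stddev)
```

Because only five numbers are stored, memory use stays constant regardless of how many values are fed in.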