
API: Diagnostics

Aggregation reliability

Estimate whether an aggregated mean(g) context signal is trustworthy.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| min_effective_n | float | Effective-N threshold below which aggregation is flagged untrustworthy. Defaults to 200 -- comfortably above the N ~ 50 regime that scored near-chance and well below the N ~ 10,000 regime that saturated, while leaving headroom for the autocorrelation discount. | 200.0 |
| max_autocorr | float | Median within-context lag-1 autocorrelation above which dependence is judged high enough to undermine the i.i.d. assumption. | 0.2 |

Notes

The thresholds are conservative defaults derived from the empirical reference points (see REFERENCE_POINTS); tune them for your domain. This guard is the explicit remedy for an API that silently exposes context_uncertainty = mean(g) without checking whether the mean is reliable.

analyze(g, groups)

Compute per-context N_eff / autocorrelation and a trustworthiness verdict.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| g | ArrayLike | Per-item epistemic estimate g(x_i). | required |
| groups | ArrayLike | Per-item context label (e.g. date). Items within a group are assumed to be in natural (temporal) order. | required |

aggregate(g, groups, *, warn=True)

Return (context_labels, mean_g_per_context, verdict).

Emits a UserWarning when the aggregate is judged untrustworthy (unless warn=False). This is the guarded alternative to a bare mean(g) API.
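The guarded aggregation described above can be sketched roughly as follows. This is an illustrative stand-in, not the library's implementation: the function name guarded_mean_sketch, the dict-shaped verdict, the use of the median per-context N_eff for the threshold test, and the treatment of negative autocorrelation are all assumptions.

```python
import warnings
import numpy as np

def guarded_mean_sketch(g, groups, min_effective_n=200.0, max_autocorr=0.2, warn=True):
    """Per-context mean(g) plus a trust verdict (illustrative only)."""
    g = np.asarray(g, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)  # sorted unique context labels
    means, n_effs, rhos = [], [], []
    for label in labels:
        x = g[groups == label]  # boolean mask preserves the original item order
        d = x - x.mean()
        denom = float(np.dot(d, d))
        # Sample lag-1 autocorrelation; 0.0 for constant or very short contexts.
        rho = float(np.dot(d[:-1], d[1:]) / denom) if x.size > 2 and denom > 0 else 0.0
        rho_pos = max(rho, 0.0)  # assumption: negative rho does not raise N_eff above N
        n_effs.append(max(x.size * (1.0 - rho_pos) / (1.0 + rho_pos), 1.0))
        rhos.append(rho)
        means.append(float(x.mean()))
    median_n_eff = float(np.median(n_effs))
    median_rho = float(np.median(rhos))
    # Assumption: both thresholds are tested against per-context medians.
    trustworthy = bool(median_n_eff >= min_effective_n and median_rho <= max_autocorr)
    verdict = {
        "trustworthy": trustworthy,
        "median_n_eff": median_n_eff,
        "median_autocorr": median_rho,
    }
    if warn and not trustworthy:
        warnings.warn(
            f"mean(g) aggregate judged untrustworthy "
            f"(median N_eff={median_n_eff:.1f}, median lag-1 autocorr={median_rho:.2f})",
            UserWarning,
            stacklevel=2,
        )
    return labels, np.array(means), verdict
```

Returning the verdict alongside the means, rather than only warning, lets callers gate downstream logic on it even when warnings are suppressed.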

Outcome of an aggregation-reliability check.

Autocorrelation-discounted effective sample size.

Uses the standard AR(1) correction N_eff = N * (1 - rho) / (1 + rho), where rho is the lag-1 autocorrelation. Independent data (rho ~ 0) gives N_eff ~ N; strong positive dependence (rho -> 1) shrinks N_eff toward 1. Order matters: pass the values in their natural (e.g. temporal) order.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| values | ArrayLike | The within-context per-item signal in natural order. | required |
| n | int \| None | Override the raw count (defaults to len(values)). | None |
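As a rough illustration of the autocorrelation discount, the sketch below estimates rho with a plain sample lag-1 autocorrelation and applies N_eff = N * (1 - rho) / (1 + rho). The names lag1_autocorr and effective_n_sketch are hypothetical, and the clamp to the range [1, N] is an assumption chosen to match the "toward 1" behavior described above; the library's actual handling of edge cases may differ.

```python
import numpy as np

def lag1_autocorr(values):
    """Sample lag-1 autocorrelation; 0.0 for constant or very short input."""
    x = np.asarray(values, dtype=float)
    if x.size < 3:
        return 0.0
    d = x - x.mean()
    denom = float(np.dot(d, d))
    if denom == 0.0:
        return 0.0
    return float(np.dot(d[:-1], d[1:]) / denom)

def effective_n_sketch(values, n=None):
    """Autocorrelation-discounted sample size, clamped to [1, N] (assumed)."""
    x = np.asarray(values, dtype=float)
    n_raw = x.size if n is None else int(n)
    rho = lag1_autocorr(x)
    n_eff = n_raw * (1.0 - rho) / (1.0 + rho)
    # Assumption: floor at 1 and never exceed the raw count.
    return float(min(max(n_eff, 1.0), n_raw))
```

For a strongly trending series such as np.arange(100), rho is close to 1 and the discounted size collapses to a value just above 1, which is why item order must be preserved before calling it.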

Convenience: return a trust verdict (+ reason) for mean(g) aggregation.

Thin wrapper over AggregationReliability for one-off checks.

Composite health index

Fuse complementary component signals into one context-reliability scalar.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| components | list[tuple[str, ComponentFn]] \| None | List of (name, fn) pairs. Each fn(idx, arrays) returns one scalar per context where higher = worse (more unhealthy). Defaults to the three signals from Finding 2 (realized loss, drift PSI, model disagreement); supply your own to extend or replace them. | None |
| weights | ArrayLike \| None | Optional per-component weights (defaults to equal). Length must match components. | None |
| threshold | float | Health-score gating threshold in [0, 1]; contexts at or above it are "trustworthy / trade". Default 0.5. | 0.5 |

Notes

Component values are z-scored across contexts (so heterogeneous scales combine sensibly), summed with weights into a "badness" score, then mapped to a [0, 1] health score via min-max with health = 1 - normalized_badness. This is the low-N/non-i.i.d. remedy and is intended to stay off the high-N i.i.d. default path (where individual-level g already saturates).
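The fusion steps in the note above (z-score across contexts, weighted sum, min-max, invert) can be sketched as below. The function name health_scores_sketch and its matrix input are hypothetical, and the handling of zero-variance components and of the all-equal case is an assumption rather than documented behavior.

```python
import numpy as np

def health_scores_sketch(component_values, weights=None, threshold=0.5):
    """component_values: array of shape (n_contexts, n_components), higher = worse."""
    v = np.asarray(component_values, dtype=float)
    n_components = v.shape[1]
    w = np.ones(n_components) if weights is None else np.asarray(weights, dtype=float)
    if w.shape[0] != n_components:
        raise ValueError("weights length must match the number of components")
    # z-score each component across contexts so different scales are comparable.
    std = v.std(axis=0)
    std = np.where(std > 0, std, 1.0)  # assumption: constant components contribute 0
    z = (v - v.mean(axis=0)) / std
    badness = z @ w  # weighted sum per context
    span = badness.max() - badness.min()
    # Assumption: if every context is equally bad, treat all as fully healthy.
    normalized = (badness - badness.min()) / span if span > 0 else np.zeros_like(badness)
    health = 1.0 - normalized
    return health, health >= threshold
```

Because of the min-max step, the scores are relative to the contexts passed in: the worst context always scores 0 and the best scores 1, so the threshold gates on rank-like position rather than an absolute level.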

compute(groups, arrays)

Compute per-context health from the provided per-item arrays.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| groups | ArrayLike | Per-item context labels. | required |
| arrays | dict[str, ArrayLike] | Dict of per-item arrays needed by the components (e.g. loss, feature + feature_reference, disagreement). Reference arrays (keys ending in _reference) are passed through unindexed. | required |
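To make the fn(idx, arrays) calling convention concrete, here is a minimal sketch of how components might be evaluated per context. It assumes idx is an integer index array selecting one context's items and that each component indexes the per-item arrays itself; the helper names mean_loss_component and per_context_components_sketch are hypothetical and do not come from the library.

```python
import numpy as np

def mean_loss_component(idx, arrays):
    """Hypothetical component: mean realized loss over the items in idx."""
    return float(np.mean(np.asarray(arrays["loss"], dtype=float)[idx]))

def per_context_components_sketch(groups, arrays, components):
    """Evaluate each (name, fn) pair once per context label."""
    groups = np.asarray(groups)
    labels = np.unique(groups)  # sorted unique context labels
    out = np.empty((labels.size, len(components)))
    for i, label in enumerate(labels):
        idx = np.flatnonzero(groups == label)  # item positions for this context
        for j, (_name, fn) in enumerate(components):
            # The whole dict is passed through, so reference arrays stay unindexed.
            out[i, j] = fn(idx, arrays)
    return labels, out
```

Passing the full dict plus an index, rather than pre-sliced arrays, is what allows a drift component to compare a context's slice against an unsliced reference sample.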

Per-context health scores and gating verdicts.

verdict(label)

Gate decision for a single context label.

Mean realized loss in the context (higher = worse). Requires arrays['loss'].

Population Stability Index of the context feature vs. a reference distribution.

Requires arrays['feature'] (per-item scalar feature) and arrays['feature_reference'] (1-D reference sample). Higher PSI = more drift.
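For readers unfamiliar with PSI, a minimal sketch of one common formulation follows: bin edges come from quantiles of the reference sample, and PSI is the sum over bins of (p_cur - p_ref) * ln(p_cur / p_ref). The function name psi_sketch, the choice of 10 quantile bins, and the eps floor for empty bins are assumptions; the library's binning may differ.

```python
import numpy as np

def psi_sketch(feature, reference, n_bins=10, eps=1e-6):
    """Population Stability Index of `feature` against `reference` (illustrative)."""
    feature = np.asarray(feature, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Quantile edges from the reference sample; drop duplicates caused by ties.
    edges = np.unique(np.quantile(reference, np.linspace(0.0, 1.0, n_bins + 1)))
    interior = edges[1:-1]  # outer bins are open-ended, so only interior cuts matter
    n = interior.size + 1
    ref_idx = np.searchsorted(interior, reference, side="right")
    cur_idx = np.searchsorted(interior, feature, side="right")
    ref_frac = np.bincount(ref_idx, minlength=n) / reference.size
    cur_frac = np.bincount(cur_idx, minlength=n) / feature.size
    # Floor the fractions so empty bins do not produce log(0) or division by zero.
    ref_frac = np.clip(ref_frac, eps, None)
    cur_frac = np.clip(cur_frac, eps, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))
```

A context whose feature distribution matches the reference yields a PSI of 0, and the value grows as mass moves between bins, which fits the "higher = worse" convention used by the other components.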

Mean ensemble/model disagreement in the context. Requires arrays['disagreement'].