Tutorial: Classification + conformal intervals
Goal: classify with DEUP epistemic scores, then wrap them in calibrated prediction sets with finite-sample coverage.
Fit classifier
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from deup import DEUPClassifier
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
X_tr, X_cal, y_tr, y_cal = train_test_split(X_tr, y_tr, test_size=0.25, random_state=0)
clf = DEUPClassifier(
base_model=RandomForestClassifier(n_estimators=80, random_state=0),
cv=5,
random_state=0,
)
clf.fit(X_tr, y_tr)
Epistemic uncertainty
pred, unc = clf.predict(X_te, return_uncertainty=True)
proba = clf.predict_proba(X_te)
unc comes from g trained on out-of-fold log-loss (or Brier if configured).
Conformal calibration (separate cal split)
clf.calibrate(X_cal, y_cal, method="normalized", alpha=0.1)
result = clf.predict_interval(X_te)
result.lower, result.upper
method="normalized"scales intervals byg(x)— narrow where confidentmethod="mondrian"— per-group intervals (passgroups=if needed)- Target coverage: 1 − α (marginal, distribution-free)
See Conformal calibration for math and MAPIE interop.