deup
Direct Epistemic Uncertainty Prediction for any scikit-learn model.
DEUP (Lahlou et al., 2023) estimates epistemic uncertainty by training a secondary error predictor on your model's out-of-sample errors. This library provides a maintained, installable, scikit-learn-compatible implementation with first-class support for time-series, cross-sectional ranking, conformal intervals, and aggregation-reliability diagnostics.
pip install deup
pip install "deup[finance,gbm]" # pandas + LightGBM tabular backend
pip install "deup[xgb,catboost]" # optional XGBoost / CatBoost backends
from sklearn.ensemble import RandomForestRegressor
from deup import DEUPRegressor
model = DEUPRegressor(base_model=RandomForestRegressor())
model.fit(X_train, y_train)
pred, unc = model.predict(X_test, return_uncertainty=True)
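To make the idea concrete, here is a minimal sketch of the underlying recipe written with plain scikit-learn only. It is not this library's implementation: the synthetic data, the choice of squared error as the target, and the names `oof_pred`, `oof_sq_err`, and `error_model` are all assumptions made for illustration. It also stops at predicting the expected error; the full method in the paper additionally subtracts an aleatoric estimate and can feed extra features to the error predictor.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)

# Synthetic regression data (illustrative only): noise is larger for x > 0.
X = rng.uniform(-3, 3, size=(600, 1))
noise_scale = 0.1 + 0.3 * (X[:, 0] > 0)
y = np.sin(X[:, 0]) + rng.normal(scale=noise_scale, size=600)

X_train, y_train = X[:500], y[:500]
X_test = X[500:]

base = RandomForestRegressor(n_estimators=100, random_state=0)

# Out-of-fold predictions: every training point is predicted by a clone of
# the base model that was fitted without that point's fold.
oof_pred = cross_val_predict(base, X_train, y_train, cv=5)
oof_sq_err = (y_train - oof_pred) ** 2

# Secondary model: learns to predict the base model's out-of-sample error.
error_model = RandomForestRegressor(n_estimators=100, random_state=1)
error_model.fit(X_train, oof_sq_err)

# Final base model is refitted on all training data.
base.fit(X_train, y_train)
pred = base.predict(X_test)
unc = error_model.predict(X_test)  # predicted squared error per test point
```

Training the error model on in-sample residuals instead would understate the error badly, because a random forest nearly memorizes its training set; that is the leakage the out-of-fold step avoids.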
Cross-sectional finance panel:
from deup.domains.finance import CrossSectionalDEUP
model = CrossSectionalDEUP(horizon=20).fit(panel_df)
pred, unc = model.predict(test_df, return_uncertainty=True)
health = model.health_report(test_df)
Why deup?
- Works with models you already use: sklearn, LightGBM, any fit/predict API
- Leakage-correct by default: out-of-fold errors (Algorithm 2), purged walk-forward
- Time-series & ranking: DEUPRanker, rank-geometry residualization, HealthIndex
- Calibrated intervals: split-conformal predict_interval() + MAPIE interop
- Benchmarked: DEUP beats ensembles/conformal on tabular; N-sweep validates aggregation guards
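For readers unfamiliar with split-conformal calibration, the sketch below shows the standard textbook construction using NumPy only. It is not a description of how `predict_interval()` is implemented here; the function names `split_conformal_halfwidth` and `conformal_interval` and the sample numbers are hypothetical.

```python
import numpy as np


def split_conformal_halfwidth(cal_residuals, alpha=0.1):
    """Half-width q such that [pred - q, pred + q] targets 1 - alpha coverage.

    cal_residuals are y - prediction on a held-out calibration set that the
    model was not trained on.
    """
    scores = np.sort(np.abs(np.asarray(cal_residuals, dtype=float)))
    n = scores.size
    # Finite-sample rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points to certify this coverage level.
        return float("inf")
    return float(scores[k - 1])


def conformal_interval(pred, halfwidth):
    """Symmetric interval around each point prediction."""
    pred = np.asarray(pred, dtype=float)
    return pred - halfwidth, pred + halfwidth


# Illustrative calibration residuals and test predictions.
cal_residuals = [-3, 1, -2, 5, 4, -6, 7, 9, 8]
q = split_conformal_halfwidth(cal_residuals, alpha=0.5)
lower, upper = conformal_interval([10.0, 20.0], q)
```

The interval has the same width everywhere. A common refinement, stated here only as a general technique, is to divide each residual by a per-point uncertainty estimate before taking the quantile, so that intervals widen where the model is less certain.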
Documentation map
| Topic | Page |
|---|---|
| Quickstart | Getting started |
| Five axes | Concepts |
| Tutorials | Tabular · Finance · Conformal · Active learning |
| Math & algorithms | Theory & math |
| When is agg-g reliable? | Aggregation reliability |
| Finance / vision presets | Domain presets |
| PyTorch / TorchUncertainty | PyTorch integration |
| Benchmarks & N-sweep | Benchmarks |
Attribution
DEUP the method is due to Lahlou, Jain, Nekoei, Butoi, Bertin, Rector-Brooks,
Korablyov, and Bengio (2023, TMLR). Cross-sectional ranking, aggregation reliability,
and two-level deployment diagnostics follow Sanderink (2026). Please cite the original
DEUP paper, Sanderink (2026) where relevant, and this software
(see CITATION.cff).
Status
v0.4.0 — estimators, conformal calibration, reliability diagnostics, domain presets (tabular with LightGBM/XGBoost/CatBoost, finance, vision), benchmark suite, tutorials, TorchUncertainty cross-link. See CHANGELOG.
Contributions welcome — see CONTRIBUTING.md.