
deup

Direct Epistemic Uncertainty Prediction for any scikit-learn model.

DEUP (Lahlou et al., 2023) estimates epistemic uncertainty by training a secondary error predictor on your model's out-of-sample errors. This library provides a maintained, installable, scikit-learn-compatible implementation with first-class support for time-series, cross-sectional ranking, conformal intervals, and aggregation-reliability diagnostics.

pip install deup
pip install "deup[finance,gbm]"   # pandas + LightGBM tabular backend
pip install "deup[xgb,catboost]"  # optional XGBoost / CatBoost backends

from sklearn.ensemble import RandomForestRegressor
from deup import DEUPRegressor

model = DEUPRegressor(base_model=RandomForestRegressor())
model.fit(X_train, y_train)
pred, unc = model.predict(X_test, return_uncertainty=True)
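
To make the idea behind the quickstart concrete, here is a minimal sketch of the general recipe written with plain scikit-learn rather than this library: collect the base model's out-of-fold squared errors, then fit a second regressor that predicts those errors from the inputs. The squared-error target, the 5-fold split, the synthetic data, and every variable name are assumptions chosen for illustration; this is not a description of how DEUPRegressor is implemented.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import cross_val_predict, train_test_split

# Synthetic data, purely for illustration
X, y = make_regression(n_samples=400, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base = RandomForestRegressor(n_estimators=50, random_state=0)

# Out-of-fold predictions: each training row is predicted by a model
# that was fitted without that row, so the errors are out-of-sample.
oof_pred = cross_val_predict(base, X_train, y_train, cv=5)
oof_sq_err = (y_train - oof_pred) ** 2

# Secondary model: predict the base model's out-of-sample error from the inputs.
error_model = GradientBoostingRegressor(random_state=0).fit(X_train, oof_sq_err)

# Refit the base model on all training data for the final predictions.
base.fit(X_train, y_train)
pred = base.predict(X_test)

# A regressor can output small negative values; clip so the error estimate is non-negative.
unc = np.clip(error_model.predict(X_test), 0.0, None)
```

The value predicted here is an estimate of total out-of-sample error. Separating out an aleatoric (noise) component, as the DEUP paper describes, is omitted from this sketch.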

Cross-sectional finance panel:

from deup.domains.finance import CrossSectionalDEUP

model = CrossSectionalDEUP(horizon=20).fit(panel_df)
pred, unc = model.predict(test_df, return_uncertainty=True)
health = model.health_report(test_df)
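
For readers unfamiliar with purged walk-forward validation, the sketch below shows the splitting idea using scikit-learn's TimeSeriesSplit with its gap parameter. It assumes that horizon is a forward-looking label window measured in time steps, which this page does not state, and it says nothing about how CrossSectionalDEUP splits data internally. The values and variable names are illustrative.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

horizon = 20   # assumed label look-ahead, in time steps (illustrative)
n_steps = 300  # number of chronologically ordered time steps (illustrative)
X = np.arange(n_steps).reshape(-1, 1)

# gap=horizon drops the last `horizon` steps before each test block from training,
# so no training label window reaches into the test period.
splitter = TimeSeriesSplit(n_splits=4, gap=horizon)
folds = list(splitter.split(X))

for train_idx, test_idx in folds:
    print(f"train up to {train_idx[-1]}, test {test_idx[0]} to {test_idx[-1]}")
```

With panel data that has several rows per date, the same split would be applied to the sorted unique dates and then mapped back to rows.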

Why deup?

  • Works with models you already use: sklearn, LightGBM, any fit/predict API
  • Leakage-correct by default: out-of-fold errors (Algorithm 2), purged walk-forward
  • Time-series & ranking: DEUPRanker, rank-geometry residualization, HealthIndex
  • Calibrated intervals: split-conformal predict_interval() + MAPIE interop
  • Benchmarked: DEUP beats ensemble and conformal baselines on tabular data in the benchmark suite; an N-sweep validates the aggregation guards
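
As background for the calibrated-intervals item above, the following is a minimal sketch of split-conformal prediction using only NumPy and scikit-learn. It does not use or reproduce this library's predict_interval() or the MAPIE interop, whose signatures are not shown on this page. The Ridge model, the 90% target coverage, the synthetic data, and all names are assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=600, n_features=5, noise=15.0, random_state=0)

# Three disjoint sets: fit the model, calibrate the interval width, evaluate.
X_fit, X_rest, y_fit, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_cal, X_test, y_cal, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

model = Ridge().fit(X_fit, y_fit)

alpha = 0.1  # target miscoverage, i.e. 90% intervals
scores = np.abs(y_cal - model.predict(X_cal))  # absolute residuals on held-out data
n = len(scores)

# Finite-sample corrected quantile level, capped at 1.0
level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
q = np.quantile(scores, level, method="higher")

pred = model.predict(X_test)
lower, upper = pred - q, pred + q
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"interval half-width: {q:.2f}, empirical coverage: {coverage:.2f}")
```

The intervals in this sketch have constant width. Scaling the residuals by a per-sample uncertainty estimate is a common way to make them adaptive, but whether the library does so is not stated here.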

Documentation map

| Topic | Page |
|---|---|
| Quickstart | Getting started |
| Five axes | Concepts |
| Tutorials | Tabular · Finance · Conformal · Active learning |
| Math & algorithms | Theory & math |
| When is agg-g reliable? | Aggregation reliability |
| Finance / vision presets | Domain presets |
| PyTorch / TorchUncertainty | PyTorch integration |
| Benchmarks & N-sweep | Benchmarks |

Attribution

The DEUP method is due to Lahlou, Jain, Nekoei, Butoi, Bertin, Rector-Brooks, Korablyov, and Bengio (2023, TMLR). Cross-sectional ranking, aggregation reliability, and two-level deployment diagnostics follow Sanderink (2026). Please cite the original DEUP paper, Sanderink (2026) where relevant, and this software (see CITATION.cff).

Status

v0.4.0 — estimators, conformal calibration, reliability diagnostics, domain presets (tabular with LightGBM/XGBoost/CatBoost, finance, vision), benchmark suite, tutorials, TorchUncertainty cross-link. See CHANGELOG.

Contributions welcome — see CONTRIBUTING.md.