API: Domain presets

Finance

Bases: BaseEstimator

Panel-data preset for cross-sectional stock rankers.

Defaults: PurgedWalkForward + rank residualization + finance g-features + HealthIndex for per-date context gating.
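To illustrate the walk-forward default, the sketch below shows one common way to build purged walk-forward splits over dates. It is not the library's PurgedWalkForward: the helper name purged_walk_forward, the equal-sized test blocks, and the reading of embargo as a number of dates dropped just before each test block are all assumptions made for this example.

```python
from typing import Iterator, List, Sequence, Tuple


def purged_walk_forward(
    dates: Sequence[str], n_splits: int = 5, embargo: int = 5
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (train_dates, test_dates) pairs over the sorted unique dates.

    Hypothetical helper (an assumption, not the library API). Each test block
    lies strictly after its training window, and the last `embargo` dates
    before the test block are removed from training.
    """
    unique = sorted(set(dates))  # ISO-formatted date strings sort chronologically
    block = len(unique) // (n_splits + 1)
    if block < 1:
        raise ValueError("not enough distinct dates for the requested splits")
    for k in range(1, n_splits + 1):
        test_start = k * block
        # The final test block absorbs any leftover dates.
        test_end = len(unique) if k == n_splits else test_start + block
        train_end = max(0, test_start - embargo)
        yield unique[:train_end], unique[test_start:test_end]
```

With 30 daily dates, n_splits=4 and embargo=2, the first split trains on the first 4 dates and tests on dates 7 to 12; the training window then expands with each later split.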

Parameters:

base_model (Any, default None): Primary ranker (defaults to an HGB regressor inside DEUPRanker).

date_col (str, default 'date'): Column holding the cross-section date / group label.

target_col (str | None, default None): Column holding the ranking target. If horizon is set, defaults to f"target_{horizon}d" when that column exists.

horizon (int | None, default None): Optional return horizon in days; selects target_{horizon}d when present.

feature_cols (list[str] | None, default None): Explicit g-feature columns; otherwise uses the FINANCE_G_FEATURES columns present in the panel.

cv (Any, default 5): Walk-forward splitter settings (PurgedWalkForward when cv is an int).

embargo (Any, default 5): Walk-forward splitter settings (PurgedWalkForward when cv is an int).

health_index (HealthIndex | None, default None): Context health scorer (default: three-component HealthIndex).
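The interaction between target_col and horizon can be sketched as below. This is a minimal illustration, not the preset's own code: the helper name resolve_target_col is hypothetical, and two behaviours are assumptions (an explicit target_col takes precedence over horizon, and None is returned when nothing resolves).

```python
from typing import Iterable, Optional


def resolve_target_col(
    columns: Iterable[str],
    target_col: Optional[str] = None,
    horizon: Optional[int] = None,
) -> Optional[str]:
    """Pick the ranking-target column for a panel (hypothetical helper).

    Assumption: an explicit target_col wins. Otherwise, when horizon is set,
    f"target_{horizon}d" is used if that column exists in the panel.
    """
    if target_col is not None:
        return target_col
    if horizon is not None:
        candidate = f"target_{horizon}d"
        if candidate in set(columns):
            return candidate
    # Assumption: signal "unresolved" with None rather than raising.
    return None
```

For a panel with columns date, ticker, target_5d and target_20d, horizon=20 resolves to target_20d, while horizon=60 resolves to nothing because target_60d is absent.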

fit(panel_df, y=None)

Fit on a long-format panel DataFrame.

predict_epistemic(panel_df)

Rank-residualized epistemic uncertainty per row.

calibrate(panel_df, *, alpha=0.1)

Conformal-calibrate on a held-out panel split (separate from fit).
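For readers unfamiliar with conformal calibration, the sketch below shows the standard split-conformal quantile computed on a held-out set. It is a generic illustration under stated assumptions, not this method's implementation: the helper name conformal_halfwidth is hypothetical, and using absolute residuals as the nonconformity score is an assumption (the library may normalize scores differently).

```python
import math
from typing import Sequence


def conformal_halfwidth(abs_residuals: Sequence[float], alpha: float = 0.1) -> float:
    """Split-conformal quantile of held-out absolute residuals.

    Hypothetical helper. Uses the finite-sample rank ceil((n + 1) * (1 - alpha))
    and returns infinity when the calibration set is too small for the
    requested coverage level.
    """
    n = len(abs_residuals)
    if n == 0:
        raise ValueError("calibration set is empty")
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    # rank is 1-based, so the rank-th smallest value sits at index rank - 1.
    return sorted(abs_residuals)[rank - 1]
```

The returned value q gives intervals of the form prediction ± q. Keeping the calibration split separate from the data used in fit is what the coverage guarantee relies on.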

health_report(panel_df)

Per-date composite context health (Finding 2 remedy).

rank_coupling_report(panel_df)

Diagnostic: rank-geometry coupling before/after residualization.
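The source does not spell out how residualization is done. One plausible reading, sketched below purely as an assumption, is a per-date least-squares fit of the uncertainty on how extreme each row's rank is, keeping the residuals. All three helper names are hypothetical, and the choice of regressor (distance of the percentile rank from 0.5) is a guess rather than the library's definition.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Sequence


def rank_extremeness(scores: Sequence[float]) -> List[float]:
    """Distance of each score's percentile rank from the middle of the cross-section."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # ties keep input order
    out = [0.0] * n
    for pos, i in enumerate(order):
        out[i] = abs((pos + 0.5) / n - 0.5)
    return out


def residualize(y: Sequence[float], x: Sequence[float]) -> List[float]:
    """Residuals of a one-variable least-squares fit of y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = 0.0 if sxx == 0 else sxy / sxx
    return [(b - my) - slope * (a - mx) for a, b in zip(x, y)]


def residualize_by_date(
    dates: Sequence[str], scores: Sequence[float], uncertainty: Sequence[float]
) -> List[float]:
    """Apply the rank residualization separately within each date (assumed scheme)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for i, d in enumerate(dates):
        groups[d].append(i)
    out = [0.0] * len(dates)
    for idx in groups.values():
        x = rank_extremeness([scores[i] for i in idx])
        res = residualize([uncertainty[i] for i in idx], x)
        for i, r in zip(idx, res):
            out[i] = r
    return out
```

Under this reading, a coupling diagnostic would compare the correlation between uncertainty and rank extremeness before and after the per-date fit; after residualization that within-date correlation is zero by construction.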

Add derived rank/regime columns expected by the finance g-feature preset.

Thesis-equivalent walk-forward g(x) on enriched residual panels.

Fits deup.core.error_estimator.ErrorEstimator with LightGBM on each expanding fold window; this is the direct migration of train_g_walk_forward.

Parameters:

enriched (Any, required): Panel with fold_id, horizon, g-features, and target_col.

fold_sort (FoldSort, default 'numeric'): "numeric" (recommended) sorts fold_02 < fold_10 < fold_100; "string" reproduces the legacy thesis lexicographic order for frozen parity.
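The two fold_sort modes only disagree when fold ids are not zero-padded to the same width, for example fold_100 versus fold_11. The sketch below illustrates that difference and how an expanding window follows from the chosen order. The helper names sort_folds and expanding_windows are hypothetical, and parsing the trailing digits of the id is an assumption about the id format.

```python
import re
from typing import Iterator, List, Sequence, Tuple


def sort_folds(fold_ids: Sequence[str], fold_sort: str = "numeric") -> List[str]:
    """Order unique fold ids numerically or lexicographically (hypothetical helper)."""
    unique = set(fold_ids)
    if fold_sort == "string":
        return sorted(unique)
    if fold_sort != "numeric":
        raise ValueError(f"unknown fold_sort: {fold_sort!r}")

    def key(fold_id: str) -> Tuple[int, str]:
        match = re.search(r"(\d+)$", fold_id)  # assumption: id ends in digits
        if match is None:
            raise ValueError(f"no trailing number in fold id: {fold_id!r}")
        return int(match.group(1)), fold_id  # id breaks ties deterministically

    return sorted(unique, key=key)


def expanding_windows(ordered: Sequence[str]) -> Iterator[Tuple[List[str], str]]:
    """Yield (training folds so far, next fold to predict) pairs."""
    for k in range(1, len(ordered)):
        yield list(ordered[:k]), ordered[k]


folds = ["fold_2", "fold_10", "fold_100", "fold_11"]
print(sort_folds(folds, "numeric"))  # ['fold_2', 'fold_10', 'fold_11', 'fold_100']
print(sort_folds(folds, "string"))   # ['fold_10', 'fold_100', 'fold_11', 'fold_2']
```

With string ordering, fold_2 would be treated as the latest fold, so an expanding window would train on later data before predicting earlier data. That is why the numeric order is the recommended one outside of parity checks.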

Subset of FINANCE_G_FEATURES that is present in the panel with a sufficient non-null rate.

Tabular

Ergonomic tabular preset — delegates to core DEUP estimators.

Parameters:

base_model (Any, default None): Primary predictor f. When None, chosen from backend.

backend (BackendKind, default 'sklearn'): "sklearn" (default), "lgbm", "xgb", or "catboost". Sets the default base and error models when those are None.

error_model (Any, default None): Secondary error predictor g. When None and backend is a GBM, uses the same family (e.g. LightGBM for backend="lgbm").

task (TaskKind, default 'regression'): "regression" (default) or "classification".

cv (Any, default 5): Forwarded / used as in the core estimator.

random_state (Any, default 5): Forwarded / used as in the core estimator.

include_raw (Any, default 5): Forwarded / used as in the core estimator.

backend property

Configured gradient-boosting / sklearn backend.

estimator property

Underlying core estimator (for advanced composition).

Default f model for a tabular backend.

Default g-features for i.i.d. tabular data: raw X + log-density.
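As a rough picture of "raw X + log-density", the sketch below appends a log-density column to each input row. The density estimator used by the library is not stated in this page, so the isotropic Gaussian kernel density estimate, the fixed bandwidth, and the helper names log_density and g_features are all assumptions chosen to keep the example dependency-free.

```python
import math
from typing import List, Sequence


def log_density(
    point: Sequence[float], reference: Sequence[Sequence[float]], bandwidth: float = 1.0
) -> float:
    """Log of an isotropic Gaussian kernel density estimate at `point`."""
    d = len(point)
    log_norm = -0.5 * d * math.log(2 * math.pi * bandwidth ** 2) - math.log(len(reference))
    exponents = [
        -0.5 * sum((p - r) ** 2 for p, r in zip(point, ref)) / bandwidth ** 2
        for ref in reference
    ]
    # Log-sum-exp keeps the result finite for points far from the reference set.
    m = max(exponents)
    return log_norm + m + math.log(sum(math.exp(e - m) for e in exponents))


def g_features(
    X: Sequence[Sequence[float]],
    reference: Sequence[Sequence[float]],
    bandwidth: float = 1.0,
) -> List[List[float]]:
    """Return each raw row of X with its log-density appended as a final column."""
    return [list(row) + [log_density(row, reference, bandwidth)] for row in X]
```

Rows that lie far from the reference (training) data receive a lower log-density, which gives the error model g a signal for out-of-distribution inputs.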

Vision

Classification preset: embedding → density + variance → g(x).

Parameters:

embedding (BaseEstimator | Callable[[ArrayLike], ArrayLike] | None, default None): Optional sklearn transformer or callable mapping raw inputs to embeddings. Defaults to IdentityEmbedding (flatten tensors). Inputs are embedded once at the API boundary so the base classifier always sees 2-D arrays.

cv (Any, default 5): KFold folds when cv is an int (i.i.d. vision batches).

Bases: BaseEstimator, TransformerMixin

Embed inputs, then append density + variance features (vision preset glue).

Bases: BaseEstimator, TransformerMixin

Flatten tensors to 2-D embeddings (test / baseline without a CNN).
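The flattening step can be pictured with the small stand-in below. It is not the library's IdentityEmbedding: the class name FlattenEmbedding is hypothetical, it only mimics the fit/transform shape of an sklearn transformer without subclassing BaseEstimator, and it assumes NumPy is available and that the first axis indexes samples.

```python
import numpy as np


class FlattenEmbedding:
    """Hypothetical stand-in that flattens each sample to a 1-D feature vector."""

    def fit(self, X, y=None):
        # Nothing to learn; returns self to allow fit(...).transform(...) chaining.
        return self

    def transform(self, X):
        arr = np.asarray(X)
        # Keep the sample axis and collapse every remaining axis into one.
        return arr.reshape(arr.shape[0], -1)


images = np.zeros((4, 3, 8, 8))  # 4 samples of shape (channels, height, width)
print(FlattenEmbedding().fit(images).transform(images).shape)  # (4, 192)
```

Flattening at the boundary means any downstream classifier that expects a 2-D array of shape (n_samples, n_features) can be used without knowing the original tensor layout.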