Implement and review base model training for tabular ML competitions: CatBoost, LightGBM, XGBoost, and Neural Networks. Use when: writing or reviewing trainer files and model entrypoints; implementing per-framework competition metric wrappers (CB/LGB/XGB/NN); aligning early-stopping metric with the competition objective; identifying correct prediction type for submission (probability vs class label vs value); debugging train OOF vs LB gaps caused by wrong metric or output format; handling auxiliary/prior data. NOT for hyperparameter tuning, ensemble, or pseudo-labeling.
This skill covers three tightly coupled concerns for base model training: competition_score as the single source of truth; per-framework metric wrappers; and OOF collection patterns.

The two most expensive silent bugs: the CatBoost logit-vs-prediction API difference, which silently destroys scores, and a prediction type that does not match sample_submission.csv.
Identify your task type first — it determines which parameters, objectives, and techniques apply.
| Task Type | Framework objectives | Binary-only params to remove |
|---|---|---|
| Binary classification | XGB: binary:logistic · LGB: binary · CB: CatBoostClassifier | — (all apply) |
| Regression | XGB: reg:squarederror · LGB: regression · CB: CatBoostRegressor | scale_pos_weight, is_unbalance, auto_class_weights: Balanced, threshold=0.5 |
| Multiclass | XGB: multi:softprob · LGB: multiclass · CB: MultiClass loss | scale_pos_weight, is_unbalance |
| Multi-label | N independent binary models OR single NN with N sigmoid heads | — (each target is binary) |
| Ranking | XGB: rank:pairwise · LGB: lambdarank · CB: YetiRank | all imbalance params |
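The "binary-only params to remove" column can be enforced mechanically. A minimal sketch, assuming a shared params dict per framework — the parameter names are real XGB/LGB/CB options from the table, but the params_for_task helper itself is illustrative:

```python
# Binary-only parameters from the table above; meaningless (or harmful)
# outside binary classification.
BINARY_ONLY = {"scale_pos_weight", "is_unbalance", "auto_class_weights"}

def params_for_task(params: dict, task: str) -> dict:
    # Keep everything for binary tasks; strip imbalance params otherwise.
    if task == "binary":
        return dict(params)
    return {k: v for k, v in params.items() if k not in BINARY_ONLY}

xgb_params = {"objective": "reg:squarederror", "scale_pos_weight": 3.0, "max_depth": 6}
reg_params = params_for_task(xgb_params, "regression")
# reg_params keeps objective and max_depth but drops scale_pos_weight
```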
| | Tree models (CB/LGB/XGB) | Neural Network |
|---|---|---|
| Training objective | Built-in BCE / squared-error (internal to framework) | Custom loss: FocalLoss, SmoothBCE, MSELoss, BCELoss |
| Eval metric | Custom competition metric wrapper | Competition score computed per epoch |
| Early stopping driven by | Eval metric | Competition score |
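The NN column's pattern — early stopping driven by the competition score, not the training loss — can be sketched framework-free. Here epoch_scores stands in for the per-epoch validation competition scores; the weights checkpointed at the returned epoch are what model.load_state_dict(best_state) restores:

```python
def early_stop(epoch_scores, patience=3):
    # Select the best epoch by competition score (higher is better, since
    # competition_score always maximizes); stop after `patience` epochs
    # without improvement.
    best_score, best_epoch, bad = float("-inf"), -1, 0
    for epoch, score in enumerate(epoch_scores):
        if score > best_score:
            best_score, best_epoch, bad = score, epoch, 0
        else:
            bad += 1
            if bad >= patience:
                break
    return best_epoch, best_score
```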
Getting the eval metric wrong wastes all training — early stopping fires at the wrong iteration.
| Framework | Metric injection | Input type in callback |
|---|---|---|
| CatBoost binary | eval_metric=CatBoostCompMetric() | Raw logits → apply sigmoid |
| CatBoost regression | eval_metric=CatBoostCompMetric() | Raw predictions → use directly |
| CatBoost multiclass | eval_metric=CatBoostCompMetric() | K logit arrays → apply softmax |
| LightGBM | "metric": "None" + feval=make_lgb_feval() | Already-transformed probs/values |
| XGBoost | "disable_default_eval_metric": 1 + custom_metric=make_xgb_eval() + maximize=True | Already-transformed probs/values |
| NN | Compute in epoch loop; model.load_state_dict(best_state) | You control activation in forward() |
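The LightGBM and XGBoost rows above can be sketched as thin factories around competition_score (passed in as an argument here so the snippet is self-contained; in a trainer it would be imported from base/metrics.py):

```python
def make_lgb_feval(competition_score):
    # LightGBM feval signature: (preds, eval_data) -> (name, value, is_higher_better).
    # preds are already probabilities/values, so no transform is needed.
    def feval(preds, eval_data):
        return "comp", competition_score(eval_data.get_label(), preds), True
    return feval

def make_xgb_eval(competition_score):
    # XGBoost native-API custom_metric signature: (preds, dtrain) -> (name, value).
    # Pair with maximize=True, since competition_score always maximizes.
    def xgb_eval(preds, dtrain):
        return "comp", competition_score(dtrain.get_label(), preds)
    return xgb_eval
```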
Single source of truth: define competition_score(y_true, y_pred) -> float once in base/metrics.py. competition_score always maximizes — negate RMSE/MAE/logloss when the leaderboard is "lower is better".
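A minimal base/metrics.py sketch, shown here with a "lower is better" leaderboard metric (RMSE) negated so the rest of the pipeline can always maximize — the actual body depends on the competition:

```python
import math

def competition_score(y_true, y_pred) -> float:
    # Example body for an RMSE leaderboard: negate so "higher is better"
    # holds everywhere (early stopping, model selection, logging).
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return -math.sqrt(mse)
```

Every trainer and wrapper then compares scores with a plain `>`, regardless of the underlying metric's direction.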
Critical CatBoost API difference:

- CatBoostClassifier: approxes[0] = raw logits → must apply sigmoid (binary) or softmax (multiclass)
- CatBoostRegressor: approxes[0] = raw prediction values → no activation needed

The metric determines the prediction type. The target column values in train.csv do NOT.
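A sketch of the binary CatBoostCompMetric named in the tables, following CatBoost's custom eval metric protocol (evaluate / is_max_optimal / get_final_error). The stand-in competition_score (simple accuracy) keeps the snippet self-contained; the sigmoid on approxes[0] is the classifier-specific step — a regressor variant would use approxes[0] directly:

```python
import math

def competition_score(y_true, y_pred) -> float:
    # Stand-in for base/metrics.py; replace with the real competition metric.
    return sum(int(p > 0.5) == int(t) for t, p in zip(y_true, y_pred)) / len(y_true)

class CatBoostCompMetric:
    def is_max_optimal(self):
        return True  # competition_score always maximizes

    def evaluate(self, approxes, target, weight):
        # CatBoostClassifier passes raw logits in approxes[0]: apply sigmoid.
        probs = [1.0 / (1.0 + math.exp(-a)) for a in approxes[0]]
        return competition_score(target, probs), 1.0

    def get_final_error(self, error, weight):
        return error
```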
| Metric | Prediction type | predict call |
|---|---|---|
| AUC-ROC, PR-AUC, Log Loss, Brier | float [0, 1] | model.predict_proba(X)[:, 1] |
| Accuracy, F1, Cohen's Kappa, QWK | class label | model.predict(X) |
| RMSE, MAE, RMSLE, MAPE | continuous value | model.predict(X) |
| Multi-class log loss | prob matrix (n, K) | model.predict_proba(X) |
| MAP@K, NDCG@K | ranked list | task-specific |
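OOF collection matching the table above, sketched for a binary task (predict_proba column 1). The point is the single full-length array written once per validation index — never fold-by-fold averages. collect_oof is an illustrative helper name:

```python
import numpy as np

def collect_oof(fold_models, folds, X):
    # One full-length array; each validation row is written exactly once,
    # by the model that did NOT train on it.
    oof = np.zeros(len(X))
    for model, (tr_idx, va_idx) in zip(fold_models, folds):
        oof[va_idx] = model.predict_proba(X[va_idx])[:, 1]
    return oof
```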
Submission and OOF rules:

- The prediction type follows sample_submission.csv, not the train.csv target dtype.
- Match the format of sample_submission.csv exactly.
- Collect OOF as oof[val] = model.predict_proba(X[val])[:, 1], never fold-by-fold averages.
- Unsure of the prediction type? Check sample_submission.csv, not train.csv.
- With auxiliary/prior rows appended to train, write OOF for the original rows only: va_idx[va_idx < n_train].

| File | What it covers |
|---|---|
| model-training.md | CB/LGB/XGB/NN params by task type, trainer architecture, training objective vs eval metric |
| competition-metrics.md | competition_score pattern, per-framework metric wrappers (CB/LGB/XGB/NN), training losses |
| output-format.md | Metric → prediction type table, submission format by task, OOF collection patterns, scout checklist |
| Skill | When to use it instead |
|---|---|
| ml-competition | Full pipeline overview, task type decision guide, first-principles checklist |
| ml-competition-setup | Project structure, RunConfig, process management |
| ml-competition-features | Feature engineering, validation strategy |
| ml-competition-tuning | Optuna hyperparameter tuning |
| ml-competition-advanced | Pseudo-labeling, ensemble, post-processing, experiment tracking |
| ml-competition-quality | Coding rules, common pitfalls |