Add a new Numerai model type to the agents training pipeline. Use when you need to register a model in `agents/code/modeling/utils/model_factory.py`, handle fit/predict quirks in `agents/code/modeling/utils/numerai_cv.py`, and update configs so the model can run via `python -m agents.code.modeling`.
Add a new model type so it can be selected in configs and trained/evaluated by the base pipeline.
Note: run commands from `numerai/` (so `agents` is importable), or from the repo root with `PYTHONPATH=numerai`.
Define the model API and output shape:

- The model must implement `fit(X, y, sample_weight=...)` and `predict(X)`.
- Put model-specific code under `agents/code/modeling/models/` so it stays isolated.
- Register the model constructor in `agents/code/modeling/utils/model_factory.py`.
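The `fit`/`predict` contract above can be illustrated with a minimal stand-in model (the `MeanBaselineModel` name and its feature-free logic are illustrative, not part of the repo):

```python
import numpy as np

class MeanBaselineModel:
    """Toy model satisfying the pipeline's fit/predict contract."""

    def fit(self, X, y, sample_weight=None):
        # Store the (optionally weighted) target mean; features are ignored.
        self.mean_ = float(np.average(y, weights=sample_weight))
        return self

    def predict(self, X):
        # Return one prediction per input row.
        return np.full(len(X), self.mean_)
```

Any class with this shape can be constructed by the factory and driven by the CV loop.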
```python
if model_type == "XGBRegressor":
    try:
        from xgboost import XGBRegressor
    except ImportError as exc:
        raise ImportError(
            "xgboost is required for XGBRegressor. "
            "Install with `.venv/bin/pip install xgboost`."
        ) from exc
    return XGBRegressor(**model_params)
```
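End to end, the factory turns the config's `type`/`params` pair into a model instance. A self-contained sketch of that dispatch (written registry-style here, whereas the real `model_factory.py` uses `if`/`elif` branches; `ConstantModel` and `build_model` are hypothetical names):

```python
class ConstantModel:
    """Hypothetical model used only to illustrate the dispatch."""

    def __init__(self, value=0.5):
        self.value = value

    def fit(self, X, y, sample_weight=None):
        return self

    def predict(self, X):
        return [self.value] * len(X)

MODEL_REGISTRY = {"ConstantModel": ConstantModel}

def build_model(model_type, model_params):
    # Unknown types should fail loudly rather than fall through silently.
    if model_type not in MODEL_REGISTRY:
        raise ValueError(f"Unknown model type: {model_type}")
    # Config "params" become constructor keyword arguments.
    return MODEL_REGISTRY[model_type](**model_params)

model = build_model("ConstantModel", {"value": 0.25})
```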
A config then selects the model by `type` and passes its constructor `params`:

```python
CONFIG = {
    "model": {"type": "XGBRegressor", "params": {"n_estimators": 500}},
    "training": {"cv": {"n_splits": 5}},
    "data": {"data_version": "v5.2", "feature_set": "small", "target_col": "target", "era_col": "era"},
    "output": {},
    "preprocessing": {},
}
```
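A new model should come with at least a smoke test of its fit/predict contract. A standalone sketch (`MeanModel` is a hypothetical stand-in; a real test would import the class from `agents/code/modeling/models/`):

```python
import unittest
import numpy as np

class MeanModel:
    """Stand-in for the new model class under test."""

    def fit(self, X, y, sample_weight=None):
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        return np.full(len(X), self.mean_)

class TestMeanModel(unittest.TestCase):
    def test_fit_predict_shape(self):
        X = np.ones((10, 3))
        y = np.linspace(0.0, 1.0, 10)
        preds = MeanModel().fit(X, y).predict(X)
        # One prediction per row, centered on the target mean.
        self.assertEqual(preds.shape, (10,))
        self.assertAlmostEqual(float(preds[0]), 0.5)
```

Run such tests with `.venv/bin/python -m unittest`.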
- If the model needs extra data columns, update `load_and_prepare_data` in `agents/code/modeling/utils/pipeline.py` to pass them into `load_full_data`.
- Train with `.venv/bin/python -m agents.code.modeling --config <config_path>`.
- Run the tests with `.venv/bin/python -m unittest`.

After validating the model implementation:
- Use the `numerai-experiment-design` skill to run multiple rounds of experiments (4–5 configs per round), then scale winners until you hit a plateau.
- Use the `numerai-model-upload` skill to create a pkl file only after you have a stable, scaled "best model" you intend to deploy (see the `numerai-model-upload` skill for the deployment workflow).