Machine Learning & Feature Engineering for Fantasy Football
Expert guidance on machine learning and feature engineering for fantasy football player projection models. Use this skill when building predictive models, engineering features from player statistics, selecting appropriate ML algorithms, or addressing sports-specific ML challenges. Covers feature engineering patterns, model selection frameworks, validation strategies, and interpretability techniques for fantasy football analytics.
zazu-220 starsNov 10, 2025
Skill Content
Overview
Provide expert guidance on building ML-based player projection models using research-backed feature engineering patterns, appropriate model selection, and sports-specific validation strategies. Apply domain expertise to help design features, choose models, avoid common pitfalls, and create interpretable predictions.
When to Use This Skill
Trigger this skill for queries involving:
Feature engineering: "What features should I include?" "How do I create age curve features?" "What are good opportunity metrics?"
Model selection: "Which ML model should I use?" "Random Forest or XGBoost?" "When to use regularized regression?"
Validation strategies: "How do I validate sports models?" "What's wrong with standard cross-validation?" "How to avoid data leakage?"
Sports-specific challenges: "How to handle small sample sizes?" "How to model position differences?" "Handling regime changes?"
Feature selection: "How to reduce 109 stats to key features?" "Lasso vs Ridge?" "How to handle multicollinearity?"
Model interpretability: "How to explain predictions?" "What features matter most?" "SHAP values for fantasy?"
Related Skills
Note: For dynasty strategy questions (player valuation, trade analysis, roster construction), use ff-dynasty-strategy. For statistical methods (regression types, simulations, GAMs), use ff-statistical-methods.
Core Capabilities
1. Feature Engineering
Core Principle: Feature engineering is more important than model selection for sports predictions.
Key Feature Categories:
Age Curves
Marcel system: 3-year weighted average + age adjustment + regression to mean
Position-specific peaks: RB 23-26, WR 26-28, QB 28-33, TE 26-29
Volume is king: opportunity metrics predict better than TDs
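The three Marcel steps listed above (weighted average, regression to the mean, age adjustment) can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `marcel_projection`, the 5/4/3 recency weights, the `regression_games` constant, and the linear `age_factor` are all assumptions chosen for the example and would need tuning per position.

```python
def marcel_projection(seasons, league_mean, age, peak_age,
                      weights=(5, 4, 3), regression_games=48.0,
                      age_factor=0.01):
    """Project next-season per-game output from up to three prior seasons.

    seasons: list of (per_game_value, games_played), most recent first.
    All default parameter values are illustrative assumptions.
    """
    # Step 1: recency- and playing-time-weighted sum of past seasons.
    weighted_sum = 0.0
    weighted_games = 0.0
    for (value, games), weight in zip(seasons, weights):
        weighted_sum += weight * games * value
        weighted_games += weight * games

    # Step 2: regress to the mean by mixing in `regression_games`
    # weighted games of league-average production.
    projection = (weighted_sum + regression_games * league_mean) / (
        weighted_games + regression_games
    )

    # Step 3: linear age adjustment around the position's peak age
    # (players younger than the peak are nudged up, older ones down).
    return projection * (1.0 + age_factor * (peak_age - age))


# Hypothetical player: 12, 10, and 8 points per game over three 16-game seasons.
example = marcel_projection(
    [(12.0, 16), (10.0, 16), (8.0, 16)],
    league_mean=8.0, age=27, peak_age=27,
)
```

A player with little history is pulled strongly toward `league_mean`, and a player with no history projects to exactly the league mean before the age adjustment. The `peak_age` argument could be taken from the position-specific ranges above, for example the midpoint of the range for the player's position.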
Efficiency Statistics
Yards per route run (YPRR), yards per carry (YPC)
Yards after contact (YAC), catch rate
Warning: Efficiency stats are noisy with small samples; use rolling averages rather than single-game values
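One way to tame small-sample noise in a rate stat such as YPC is to pool the numerator and denominator over a rolling window instead of averaging per-game rates, so a one-carry game cannot dominate. The sketch below assumes a hypothetical input layout of `(rush_yards, carries)` tuples in chronological order; the function name and window size are illustrative only.

```python
def rolling_pooled_ypc(games, window=5):
    """Pooled yards per carry over the previous `window` games.

    games: chronological list of (rush_yards, carries) tuples.
    Returns one value per game, computed only from earlier games,
    or None when there are no prior carries to pool.
    """
    result = []
    for i in range(len(games)):
        prior = games[max(0, i - window):i]  # excludes the current game
        yards = sum(y for y, _ in prior)
        carries = sum(c for _, c in prior)
        result.append(yards / carries if carries > 0 else None)
    return result


# Hypothetical log: a 20-carry game, a single 10-yard carry, then a 10-carry game.
ypc = rolling_pooled_ypc([(100, 20), (10, 1), (40, 10)], window=2)
```

For the third game the pooled value is 110 yards over 21 carries (about 5.2), whereas a plain mean of the two per-game rates (5.0 and 10.0) would give 7.5, heavily distorted by the single-carry game.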
Interaction Terms
QB quality × target share (receiver production context)
Opponent strength adjustments
Game script (teams that are leading tend to run more; teams that are trailing tend to pass more)
An estimated ~40% of team performance comes from synergy effects
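As a rough sketch of the first two interaction ideas above, the function below adds a QB quality × target share product and a simple opponent-strength adjustment to a feature dictionary. The key names (`qb_quality`, `target_share`, `yards`, `opp_yards_allowed`, `league_avg_yards_allowed`) and the ratio-based adjustment are assumptions made for illustration, not a prescribed schema or formula.

```python
def add_interaction_features(row):
    """Return a copy of `row` with two illustrative interaction features.

    Expected keys (all hypothetical names):
      qb_quality, target_share, yards,
      opp_yards_allowed, league_avg_yards_allowed
    """
    features = dict(row)

    # Receiver context: the same target share is worth more with a better QB.
    features["qb_x_target_share"] = row["qb_quality"] * row["target_share"]

    # Opponent adjustment: scale raw yards by how stingy the defense is
    # relative to league average (tough defense -> ratio above 1).
    features["opp_adjusted_yards"] = (
        row["yards"] * row["league_avg_yards_allowed"] / row["opp_yards_allowed"]
    )
    return features


sample = add_interaction_features({
    "qb_quality": 0.5,
    "target_share": 0.25,
    "yards": 80.0,
    "opp_yards_allowed": 200.0,
    "league_avg_yards_allowed": 250.0,
})
```

Tree-based models can discover some interactions on their own, but explicit product terms like these are usually needed for linear models to use them.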
Rolling Averages
Last 3 games, last 5 games, season-long
Trend features: recent form vs established baseline
Lag features: last game, same opponent last season
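The rolling, trend, and lag features above can be sketched in plain Python as follows. Each row is built only from games played before it, which is one way to keep the current game's outcome from leaking into its own features. The function and key names are hypothetical, and the "same opponent last season" lag is omitted because it needs schedule data.

```python
def rolling_features(points, short=3, long=5):
    """Build per-game features from a chronological list of fantasy points.

    Every feature for game i uses only games 0..i-1, so nothing from the
    game being predicted leaks into its own features.
    """
    def mean(values):
        return sum(values) / len(values) if values else None

    rows = []
    for i in range(len(points)):
        history = points[:i]
        last_short = mean(history[-short:])
        season_avg = mean(history)
        rows.append({
            "avg_last_3": last_short,
            "avg_last_5": mean(history[-long:]),
            "avg_season": season_avg,
            # Trend: recent form minus the established baseline.
            "trend": (last_short - season_avg) if history else None,
            # Lag: points scored in the previous game.
            "lag_1": history[-1] if history else None,
        })
    return rows


rows = rolling_features([10, 20, 30, 40, 50, 60])
```

With this steadily improving hypothetical player, the last row has a positive `trend` because the three-game average sits above the season-to-date average.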
Reference: references/feature_engineering.md for formulas, implementation patterns, and common mistakes.
2. Model Selection
Decision Framework:
Primary Goal?
├─ Interpretability → Linear/Ridge/Lasso Regression
└─ Performance
   ├─ Small (<1000) → Ridge/Lasso/Elastic Net
   ├─ Medium (1K-10K) → Random Forest or XGBoost
   └─ Large (>10K) → XGBoost/LightGBM or Ensemble
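The decision framework above can be expressed as a small lookup function. This is only a sketch: the function name is hypothetical, the sizes are assumed to be training-sample counts, and treating exactly 10,000 samples as "medium" is an assumption, since the tree does not say which side the boundary falls on.

```python
def suggest_models(goal, n_samples=None):
    """Map a primary goal and dataset size to candidate model families."""
    if goal == "interpretability":
        return ["Linear Regression", "Ridge", "Lasso"]
    if goal != "performance":
        raise ValueError("goal must be 'interpretability' or 'performance'")
    if n_samples is None:
        raise ValueError("n_samples is required when goal is 'performance'")

    if n_samples < 1000:          # small
        return ["Ridge", "Lasso", "Elastic Net"]
    if n_samples <= 10000:        # medium (upper boundary is an assumption)
        return ["Random Forest", "XGBoost"]
    return ["XGBoost", "LightGBM", "Ensemble"]  # large


candidates = suggest_models("performance", n_samples=2500)
```

The returned list is a starting shortlist; the candidates would still need to be compared with a validation scheme suited to sports data.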
Model Types:
Linear Regression: Baseline, interpretability, small samples