Deep-dive analysis of a single MLB player (hitter or pitcher) for the Yahoo Fantasy Baseball 2K25 league. Web-searches FanGraphs (ATC projections), Baseball Savant (xwOBA/xBA/xERA), MLB.com (lineups, probables), RotoWire (weather, injuries), and RotoBaller (closer depth) to produce the full set of structured player signals defined in the signal framework. Emits form_score, matchup_score, opportunity_score, daily_quality, regression_index, obp_contribution, sb_opportunity, role_certainty for hitters and qs_probability, k_ceiling, era_whip_risk, streamability_score, two_start_bonus, save_role_certainty for pitchers. Use when you need to analyze player, compute daily_quality, compute regression index, produce player signals, run a hitter analysis, run a pitcher analysis, or prep start/sit inputs for the lineup optimizer.
Scenario: Hitter analysis for Junior Caminero (TB 3B), today's opponent BOS, opp SP Brayan Bello (RHP), park Fenway, light wind.
Inputs assembled from web search:
Signal computation:
| Signal | Value | Quick read |
|---|---|---|
| form_score | 66 | rolling xwOBA 15% above season baseline |
| matchup_score | 58 | decent park, neutral SP, slight wind-aided |
| opportunity_score | 78 | #3 slot, ~4.6 expected PAs |
| daily_quality | 66 | START-tier (>=60) |
| regression_index | +15 | mildly unlucky, short of the +25 buy threshold |
| obp_contribution | 62 | projected .355 OBP x 4.6 PAs |
| sb_opportunity | 35 | Bello holds runners average, BOS catcher CS 28%, Caminero sprint 26.5 ft/s |
| role_certainty | 100 | confirmed lineup posted |
Recommendation to lineup-optimizer: daily_quality = 66 -> START. regression_index = +15 suggests no need to sit on any recent cold-streak noise.
Pitcher counter-example (pitcher start): Bowden Francis (TOR) at COL. daily_quality is replaced by streamability_score. Coors kills streamability_score regardless of raw stuff; the skill would emit qs_probability ~28, k_ceiling ~40, era_whip_risk ~82 -> streamability_score ~29 (0.40 * 28 + 0.30 * 40 + 0.30 * 18 = 28.6, below the 55 SIT cutoff -> SIT / DO NOT STREAM).
Copy this checklist and track progress:
MLB Player Analysis Progress:
- [ ] Step 1: Classify player (hitter vs pitcher; SP vs RP)
- [ ] Step 2: Collect season + 15-day performance (Savant, FanGraphs)
- [ ] Step 3: Collect today's context (opp SP/hitters, park, weather, lineup)
- [ ] Step 4: Compute normalized component scores
- [ ] Step 5: Compute composite signals (daily_quality or streamability_score)
- [ ] Step 6: Check regression_index and role_certainty
- [ ] Step 7: Validate against rubric and emit signal file
Step 1: Classify player
Determine role: hitter (any position player), SP (starter), RP (reliever, closer or setup). The signal set is different per role. See resources/methodology.md for role determination rules when a player has dual eligibility (two-way player, opener + bulk).
Step 2: Collect performance data
Web-search the primary sources. Every URL goes in the signal file's source_urls: list.
- If a source is unreachable, set confidence: 0.3, and note the gap in the red-team field
See resources/data-cheatsheet for exact URL patterns.
Step 3: Collect today's context
- Park (park_hitter_factor / park_pitcher_factor)
- Weather (weather_risk)
Step 4: Compute normalized component scores
All raw stats are converted to 0-100 (unipolar) or +/-100 (bipolar) per the signal framework. See resources/methodology.md for each formula.
Step 5: Compute composite signals
- daily_quality = 0.35 * form_score + 0.40 * matchup_score + 0.25 * opportunity_score
- streamability_score = 0.40 * qs_probability + 0.30 * k_ceiling + 0.30 * (100 - era_whip_risk)
- two_start_bonus (bool from FantasyPros two-start page)
Step 6: Check regression and role
- regression_index = clamp((xwOBA - wOBA) * 500, -100, +100). Positive = unlucky (buy). Negative = lucky (sell / fade).
- role_certainty (hitter): 100 = confirmed in today's lineup, 70 = probable per beat reporter, 40 = platoon uncertain, 0 = benched or injured
- save_role_certainty (RP): 100 = locked closer per RotoBaller, 50 = timeshare, 20 = 7th-inning guy
Step 7: Validate and emit
- confidence and at least one source_url
- mlb-signal-emitter (validation); on failure, log to tracker/decisions-log.md
Pattern 1: Hot Streak Hitter (Sell-the-News)
- form_score high (>=70), regression_index negative (e.g., -25)
Pattern 2: Cold Hitter with Loud Contact (Buy-Window)
- form_score depressed, regression_index positive (>=+20), barrel% still good
Pattern 3: Two-Start Pitcher in a Bad Park
- two_start_bonus = true, but one start has era_whip_risk >= 70
Pattern 4: Closer in Committee / Role Uncertainty
- save_role_certainty <= 50, k_ceiling decent, era_whip_risk low
Cite every fact. Every numeric input (xwOBA, projected PAs, park factor, CS%) must trace to a URL in source_urls:. Unsourced claims fail the rubric's Source Citation criterion.
OBP matters more than AVG for this league. Our batting cats are R/HR/RBI/SB/OBP (not AVG). When computing obp_contribution and when choosing which rate stat to weight in form_score, use OBP or wOBA (which is walk-inclusive), never AVG alone. Walk rate is a feature, not a footnote.
QS matters more than W for this league. For qs_probability, compute the probability of 6+ IP and <=3 ER, not the probability of a win. Ignore bullpen-game starters and openers -- they score zero QS points by definition.
Use ATC projections, not Steamer alone. FanGraphs ATC is the consensus ensemble and is the most accurate single source. Steamer and ZiPS can be consulted for triangulation but do not substitute ATC without noting it.
Degrade gracefully on search failure. If a source is unreachable, do not invent numbers. Set that component's confidence to 0.3 and record the gap in the red-team note field. The red-team pass will escalate if confidence < 0.4.
Do not re-derive matchup-analyzer signals. If signals/YYYY-MM-DD-matchup.md exists for today's game, consume opp_sp_quality, park_hitter_factor, park_pitcher_factor, weather_risk, bullpen_state directly. Re-deriving wastes runtime and risks inconsistency across agents.
Timestamp every signal. computed_at: YYYY-MM-DDTHH:MMZ. Morning-brief calls are fresh; afternoon re-checks (once lineups post) supersede the morning signal with higher role_certainty.
Range-check every number. 0-100 signals never exceed 100 or go negative. +/-100 signals (regression_index) are clamped. The mlb-signal-emitter validator rejects out-of-range values -- check before calling.
Plain-English body. The frontmatter is for machines; the body must be jargon-free or translate jargon inline for the end user. "xwOBA" -> "expected offensive output based on how hard and at what angle he hit the ball, regardless of whether balls found gloves."
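The range-check and timestamp guardrails above can be run locally before calling the validator. The sketch below is illustrative only: the function names are hypothetical, and it assumes regression_index is the only +/-100 signal and that every other numeric signal listed in this document is 0-100. The actual mlb-signal-emitter validator may check more than this.

```python
from datetime import datetime, timezone

# Assumption: every numeric signal named in this document is 0-100,
# except regression_index, which is +/-100.
UNIPOLAR_SIGNALS = {
    "form_score", "matchup_score", "opportunity_score", "daily_quality",
    "obp_contribution", "sb_opportunity", "role_certainty",
    "qs_probability", "k_ceiling", "era_whip_risk",
    "streamability_score", "save_role_certainty",
}
BIPOLAR_SIGNALS = {"regression_index"}


def range_errors(signals: dict[str, float]) -> list[str]:
    """Return one message per out-of-range signal (empty list means all pass)."""
    errors = []
    for name, value in signals.items():
        if name in UNIPOLAR_SIGNALS:
            low, high = 0, 100
        elif name in BIPOLAR_SIGNALS:
            low, high = -100, 100
        else:
            continue  # booleans and unknown keys are not range-checked here
        if not low <= value <= high:
            errors.append(f"{name}={value} outside [{low}, {high}]")
    return errors


def format_computed_at(moment: datetime) -> str:
    """Render a timezone-aware datetime as YYYY-MM-DDTHH:MMZ (UTC)."""
    if moment.tzinfo is None:
        raise ValueError("computed_at needs a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%MZ")
```

For example, `range_errors({"daily_quality": 66, "regression_index": 15})` returns an empty list, while a `daily_quality` of 101 produces one error message.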
Composite formulas (see resources/methodology.md for derivations):
daily_quality = 0.35 * form_score + 0.40 * matchup_score + 0.25 * opportunity_score
streamability_score = 0.40 * qs_probability + 0.30 * k_ceiling + 0.30 * (100 - era_whip_risk)
regression_index = clamp((xwOBA - wOBA) * 500, -100, +100)
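As a minimal sketch, the three formulas above translate directly into Python. The function names mirror the signal names; the `clamp` helper and the xwOBA/wOBA values in the usage note are illustrative assumptions, not figures from the signal framework.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def daily_quality(form_score: float, matchup_score: float,
                  opportunity_score: float) -> float:
    """Hitter composite: weighted blend of form, matchup, and opportunity."""
    return 0.35 * form_score + 0.40 * matchup_score + 0.25 * opportunity_score


def streamability_score(qs_probability: float, k_ceiling: float,
                        era_whip_risk: float) -> float:
    """SP composite: risk is inverted so that higher is always better."""
    return 0.40 * qs_probability + 0.30 * k_ceiling + 0.30 * (100 - era_whip_risk)


def regression_index(xwoba: float, woba: float) -> float:
    """Positive = unlucky (buy), negative = lucky (sell); clamped to +/-100."""
    return clamp((xwoba - woba) * 500, -100, 100)


# Worked check against the Caminero scenario: 23.1 + 23.2 + 19.5 = 65.8
print(round(daily_quality(66, 58, 78)))        # 66
# Worked check against the Francis-at-Coors example: 11.2 + 12.0 + 5.4 = 28.6
print(round(streamability_score(28, 40, 82)))  # 29
```

With hypothetical inputs of xwOBA .350 and wOBA .320, `regression_index` comes out at about +15; a gap of .200 or more in either direction hits the clamp.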
Action thresholds (feed to lineup-optimizer / streaming-strategist):
| Signal | START / STREAM | Neutral | SIT / FADE |
|---|---|---|---|
| daily_quality (hitter) | >= 60 | 45-59 | < 45 |
| streamability_score (SP) | >= 70 | 55-69 | < 55 |
| save_role_certainty (RP) | >= 70 | 40-69 | < 40 |
| regression_index | >= +25 (buy) | -24..+24 | <= -25 (sell) |
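The threshold table can be applied mechanically. The sketch below assumes fractional scores fall into the lower band (59.5 is Neutral, not START), which the integer ranges in the table do not spell out; the `classify` function and its return labels are hypothetical names chosen for illustration.

```python
# (start_cutoff, neutral_cutoff) per 0-100 signal, copied from the table above.
THRESHOLDS = {
    "daily_quality": (60, 45),
    "streamability_score": (70, 55),
    "save_role_certainty": (70, 40),
}


def classify(signal: str, value: float) -> str:
    """Map a signal value to its action band."""
    if signal == "regression_index":
        if value >= 25:
            return "BUY"
        if value <= -25:
            return "SELL"
        return "NEUTRAL"
    start_cutoff, neutral_cutoff = THRESHOLDS[signal]
    if value >= start_cutoff:
        return "START"
    if value >= neutral_cutoff:
        return "NEUTRAL"
    return "SIT"


print(classify("daily_quality", 66))        # START
print(classify("streamability_score", 29))  # SIT
print(classify("regression_index", 15))     # NEUTRAL
```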
Source priority (always try in this order):
| Need | Primary | Fallback |
|---|---|---|
| Projections | FanGraphs ATC | Steamer, ZiPS, FantasyPros |
| Statcast / xwOBA / xERA | Baseball Savant | -- (no substitute) |
| Lineup / probable SP | MLB.com | RotoWire, FanGraphs Roster Resource |
| Park factor | FanGraphs park factors | Baseball-Reference park factors |
| Weather | RotoWire weather forecast | Google weather + MLB.com game page |
| Closer depth | RotoBaller closer charts | Pitcher List, Closer Monkey |
| Two-start week | FantasyPros two-start planner | FanGraphs probables grid |
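One way to encode the priority order together with the degrade-gracefully rule is sketched below. The `fetch` callable is a hypothetical stand-in for whatever performs the web search; it is assumed to return `None` when a source is unreachable. Only three rows of the table are shown, and the result keys are illustrative.

```python
from typing import Any, Callable, Optional

# Ordered source lists, primary first (subset of the table above).
SOURCE_PRIORITY = {
    "projections": ["FanGraphs ATC", "Steamer", "ZiPS", "FantasyPros"],
    "statcast": ["Baseball Savant"],  # no substitute
    "lineup": ["MLB.com", "RotoWire", "FanGraphs Roster Resource"],
}


def first_available(need: str,
                    fetch: Callable[[str], Optional[Any]]) -> dict:
    """Try each source in priority order; never invent a value on failure."""
    for position, source in enumerate(SOURCE_PRIORITY[need]):
        value = fetch(source)
        if value is not None:
            # used_fallback records that the primary was substituted.
            return {"value": value, "source": source,
                    "used_fallback": position > 0}
    # Every source failed: low confidence plus a gap note for the red-team pass.
    return {"value": None, "source": None, "confidence": 0.3,
            "gap": f"no source reachable for {need}"}
```

The `used_fallback` flag is there so that a Steamer or ZiPS substitution for ATC is always noted rather than silent.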
Key resources:
Inputs required:
Outputs produced:
signals/YYYY-MM-DD-player-<lastname>-<firstinitial>.md (one file per player analyzed per day)
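A small helper can build that path consistently. This is a sketch: the function name is hypothetical, and lowercasing the name and hyphenating spaces in multi-word last names are assumptions, since the pattern above does not specify casing.

```python
from datetime import date


def signal_path(day: date, last_name: str, first_name: str) -> str:
    """Build signals/YYYY-MM-DD-player-<lastname>-<firstinitial>.md."""
    # Assumption: names are lowercased and spaces become hyphens.
    last = last_name.strip().lower().replace(" ", "-")
    initial = first_name.strip()[0].lower()
    return f"signals/{day.isoformat()}-player-{last}-{initial}.md"


# Illustrative date only; not taken from the scenario above.
print(signal_path(date(2025, 1, 2), "Caminero", "Junior"))
# signals/2025-01-02-player-caminero-j.md
```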