CAISO like-day / analogue-day morning briefing. Finds historical days matching today's fundamental signals (wind, solar, gas, RT-DA spread, imports, price shape) and estimates where NP15/SP15 DA on-peak will clear tomorrow. Use when: "like day", "analogue day", "similar day", "morning briefing", "what should DA clear at", "CAISO morning analysis", "like day analysis", "find me days like today", "what do we expect for tomorrow", "run the morning report", "daily CAISO brief", "comparable days", "price estimate for tomorrow", "NP15 SP15 forecast". This skill runs the full workflow: fundamentals snapshot, multi-signal historical search, two-day pair analysis, nodal congestion patterns, DA-RT opportunity identification, and tomorrow's price estimate.
Every morning, run a comprehensive like-day analysis for CAISO. Find historical analogue days matching today's fundamental regime — not just price shape, but wind/solar generation, gas prices, RT-DA spreads, regional import levels (NW/MidC and Desert Southwest paths), and OASIS 7-day forecasts. Then use two-day pairs where the D→D+1 transition matches today→tomorrow's forecasted conditions to estimate where NP15 and SP15 DA on-peak will clear tomorrow. Also identify nodal congestion patterns and DA-RT virtual bidding opportunities.
The key enhancement: use OASIS 7-day load/wind/solar forecasts and scheduled regional imports to characterize tomorrow's expected supply stack, then search history for two-day sequences where (D actuals ≈ today) AND (D+1 actuals ≈ tomorrow's forecast). This is more powerful than just looking for "D+1 with higher renewables."
Database: rfo_analytics (AWS Athena)
Query execution: AWS CLI (aws athena start-query-execution)
Output S3: s3://aws-athena-query-results-767397821604-us-east-1/claude-queries/
Execute these steps in order. Run independent queries in parallel where possible.
today = current date (user's local or system date)
Determine DST vs standard time:
Set TZ_OFFSET accordingly (use INTERVAL '7' HOUR for DST, INTERVAL '8' HOUR for standard).
HE1 PPT in UTC = today at 08:00 UTC (DST) or 09:00 UTC (standard).
Full flow day in UTC = today at HE1_UTC through tomorrow at HE1_UTC.
Run these queries in parallel:
SELECT
DATE(datetime - INTERVAL '7' HOUR) AS flow_date,
name, datatype,
CAST(EXTRACT(HOUR FROM (datetime - INTERVAL '7' HOUR)) + 1 AS INTEGER) AS he_ppt,
avgvalue
FROM rfo_analytics.rfo_prices_enriched
WHERE iso = 'CAISO'
AND name IN ('TH_NP15_GEN-APND', 'TH_SP15_GEN-APND')
AND datatype IN ('DALMP', 'RTLMP')
AND datetime BETWEEN TIMESTAMP '{today_utc_he1}' AND TIMESTAMP '{tomorrow_utc_he1}'
ORDER BY name, datatype, he_ppt
Compute from results:
SELECT DATE(datetime - INTERVAL '7' HOUR) AS flow_date,
AVG(avgvalue) AS wind_avg, MAX(avgvalue) AS wind_max
FROM rfo_analytics.rfo_generation_enriched
WHERE iso = 'CAISO' AND name = 'CAISO'
AND datatype = 'GENERATION_WIND_CURRENT'
AND datetime BETWEEN TIMESTAMP '{yesterday_utc_he1}' AND TIMESTAMP '{tomorrow_utc_he1}'
GROUP BY DATE(datetime - INTERVAL '7' HOUR)
ORDER BY 1
Same pattern as wind but datatype = 'GENERATION_SOLAR_CURRENT'.
SELECT DATE(datetime - INTERVAL '7' HOUR) AS flow_date, name, AVG(avgvalue) AS gas_price
FROM rfo_analytics.rfo_prices_enriched
WHERE iso = 'CAISO'
AND name IN ('Socal-Citygate', 'PG&E - Citygate')
AND datatype = 'NGI_GASPRICE_AVG'
AND datetime BETWEEN TIMESTAMP '{week_ago_utc}' AND TIMESTAMP '{tomorrow_utc_he1}'
GROUP BY DATE(datetime - INTERVAL '7' HOUR), name
ORDER BY name, flow_date
SELECT DATE(datetime - INTERVAL '7' HOUR) AS flow_date, datatype, AVG(avgvalue) AS avg_val
FROM rfo_analytics.rfo_flows_enriched
WHERE iso = 'CAISO' AND name = 'Caiso_Totals'
AND datetime BETWEEN TIMESTAMP '{week_ago_utc}' AND TIMESTAMP '{tomorrow_utc_he1}'
GROUP BY DATE(datetime - INTERVAL '7' HOUR), datatype
ORDER BY flow_date, datatype
SELECT name, CAST(EXTRACT(HOUR FROM (datetime - INTERVAL '7' HOUR)) + 1 AS INTEGER) AS he_ppt, avgvalue
FROM rfo_analytics.rfo_prices_enriched
WHERE iso = 'CAISO'
AND name IN ('TH_NP15_GEN-APND', 'TH_SP15_GEN-APND')
AND datatype = 'DALMP'
AND datetime BETWEEN TIMESTAMP '{tomorrow_utc_he1}' AND TIMESTAMP '{day_after_utc_he1}'
ORDER BY name, he_ppt
If tomorrow's DA is already published, report actual prices alongside the estimate.
Format as a table:
| Signal | Value |
|---|---|
| Wind avg | X MW |
| Solar avg | X MW |
| NP15 DA On-Peak | $X |
| SP15 DA On-Peak | $X |
| SP15 Midday Min | $X |
| NP15 DA Off-Peak | $X |
| NP15 RT-DA Off-Pk Spread | $X |
| SP15 RT-DA Off-Pk Spread | $X |
| PG&E Citygate Gas | $X |
| SoCal Citygate Gas | $X |
| Net Imports (IMP - EXP) | X MW |
Query all fundamental metrics for the trailing ~13 months (go back to previous year's spring). The lookback start should be approximately today - 400 days but at minimum cover the prior year's March.
-- Single large CTE query joining wind, solar, DA metrics, RT-DA spreads, gas
-- See the full query pattern in Step 2 but for the historical range
-- Filter: wind_avg > 2000 AND sp15_midday_min < -5
-- Return: flow_date, wind_avg, solar_avg, np_offpk, np_onpk, sp_offpk, sp_onpk,
-- sp_mid_min, sp_mid_avg, np_mid_min, np_mid_avg, np_flat, sp_flat,
-- np_rtda_offpk, sp_rtda_offpk, pge_gas, scg_gas
Use the same CTE-based query structure that joins wind_daily, solar_daily, da_metrics, rt_da_spread, and gas_daily. Filter for:
wind_avg > 2000sp_midday_min < -5HAVING COUNT(*) >= 20 in the DA metrics CTEFor each historical day, compute a multi-signal distance score:
| Signal | Weight | Normalization |
|---|---|---|
| Wind avg | 2 | / 1000 |
| SP15 on-peak | 3 | / 5 |
| NP15 on-peak | 2 | / 5 |
| SP15 midday min | 2 | / 10 |
| NP15 off-peak | 2 | / 10 |
| NP15 RT-DA off-peak | 2 | / 5 |
| SP15 RT-DA off-peak | 1.5 | / 5 |
| PG&E CG gas | 1.5 | / 1 |
| SoCal CG gas | 1 | / 1 |
Formula: raw_score = SUM(weight * |actual - today| / normalization)
Apply recency weighting — divide raw score by the recency multiplier:
| Days Ago | Multiplier |
|---|---|
| 0-35 | 5x |
| 36-100 | 3x |
| 101-190 | 2x |
| 191-340 | 1.5x |
| 341+ | 1x |
adj_score = raw_score / recency_multiplier
Present top 10 single-day matches in a table.
For each scored day D, check if D+1 exists in the dataset. Compute:
adj_score + (0 if renewables_up else 3)Sort by combined score. Take top 5-8 pairs.
For each pair, show:
For the D+1 dates from the top 5 pairs, query node-vs-hub LMP spreads during midday (HE10-16):
-- Join node LMP against SP15/NP15 hub LMP for interval_num 10-16
-- Target these specific nodes:
Premium nodes (consistently above hub during oversupply):
SLAP_PGCC-APND (Moss Landing / battery hub)MOSSLD_2_PSP1-APND, ELKHRN_1_EESX3-APND (Moss Landing units)SLUISP_2_UNITS-APND (San Luis Obispo area)SLAP_PGFG-APND (Geysers geothermal)SLAP_PGHB-APND (Humboldt Bay)DLAP_PGAE-APND (PG&E load)Discount nodes (depressed below hub — solar injection bottlenecks):
KETTLEMN_6_N001 (Kettleman Hills — Kern County solar belt)AVENAL_6_GN001 (Avenal — same area)POD_MNDOTA_1_SOLAR1-APND (Mendota solar — Fresno area)DC_ADLNTO-APND (Adelanto DC tie)DLAPs for load spread:
DLAP_PGAE-APND, DLAP_SCE-APND, DLAP_SDGE-APND, TH_ZP26_GEN-APNDPresent as a pivot table: nodes as rows, D+1 dates as columns, values = spread vs SP15 hub.
For the top 5 D+1 dates, pull full 24-hour DA and RT from rfo_prices_enriched:
SELECT flow_date, name, he_ppt,
MAX(CASE WHEN datatype='DALMP' THEN avgvalue END) AS da,
MAX(CASE WHEN datatype='RTLMP' THEN avgvalue END) AS rt,
MAX(CASE WHEN datatype='RTLMP' THEN avgvalue END) - MAX(CASE WHEN datatype='DALMP' THEN avgvalue END) AS dart
FROM ... GROUP BY flow_date, name, he_ppt
For each D+1 date, compute:
Summarize patterns:
Using the D+1 outcomes from the top 5 pairs, weighted by recency:
Present:
| Metric | Today | Tomorrow Estimate | Range | Confidence |
|---|---|---|---|---|
| NP15 On-Peak | $X | $Y | $A to $B | High/Med/Low |
| SP15 On-Peak | $X | $Y | $A to $B | |
| SP15 Midday Min | $X | $Y | $A to $B | |
| SP15 Midday Avg | $X | $Y | $A to $B | |
| NP15 Off-Peak | $X | $Y | $A to $B | |
| NP-SP Spread | $X | $Y | $A to $B |
Include a short rationale paragraph explaining:
Structure the output as:
Execute via AWS CLI:
aws athena start-query-execution \
--query-string "<SQL>" \
--query-execution-context Database=rfo_analytics \
--result-configuration OutputLocation=s3://aws-athena-query-results-767397821604-us-east-1/claude-queries/ \
--region us-east-1
Poll results:
aws athena get-query-results --query-execution-id <id> --region us-east-1
Run independent queries in parallel. The historical search (Step 4) is the heaviest query — submit it early.
These are exact column values — do not guess or abbreviate:
| Item | Exact Value |
|---|---|
| NP15 hub | name = 'TH_NP15_GEN-APND' |
| SP15 hub | name = 'TH_SP15_GEN-APND' |
| ZP26 hub | name = 'TH_ZP26_GEN-APND' |
| DA LMP datatype | datatype = 'DALMP' |
| RT LMP datatype | datatype = 'RTLMP' |
| Wind gen datatype | datatype = 'GENERATION_WIND_CURRENT' |
| Solar gen datatype | datatype = 'GENERATION_SOLAR_CURRENT' |
| Gas price datatype | datatype = 'NGI_GASPRICE_AVG' |
| PG&E Citygate | name = 'PG&E - Citygate' |
| SoCal Citygate | name = 'Socal-Citygate' |
| System gen name | name = 'CAISO' |
| Flow totals name | name = 'Caiso_Totals' |
| ISO filter | iso = 'CAISO' |
| On-peak hours | HE7-22 PPT |
| Off-peak hours | HE1-6, HE23-24 PPT |
| Midday hours | HE10-16 PPT |
All rfo_*_enriched timestamps are UTC. Convert to PPT using datetime - INTERVAL '7' HOUR (DST) or INTERVAL '8' HOUR (standard).
DA market in caiso_nodal_prices: interval_num 1-24 = HE1-24 PPT.
RTPD in caiso_nodal_prices: varies (may be sparse for historical dates).
These are the exact intertie name values for datatype = 'TRANS_USAGE_ENE_IMPORT_MW':
| Name | Description |
|---|---|
MALIN500_ISL | Malin 500kV — COB AC path (OR/CA border). Largest single NW path. |
NOB_ITC | North of Border — PDCI DC line terminus |
COTPISO_ITC | Captain Jack / COB tie-in |
NW Total = MALIN500_ISL + NOB_ITC + COTPISO_ITC
| Name | Description |
|---|---|
PALOVRDE_ITC | Palo Verde — primary AZ/CA interchange |
ELDORADO_ITC | Eldorado — NV/CA near Hoover Dam |
MEAD_ITC | Mead — near Hoover Dam |
NORTHGILA500_ITC | North Gila 500kV — AZ/CA (often export, subtract) |
PARKER_ITC | Parker Dam |
WSTWGMEAD_ITC | Westwing-Mead |
MEADMKTPC_ITC | Mead-Marketplace |
DSW Total = PALOVRDE + ELDORADO + MEAD + NORTHGILA + PARKER + WSTWGMEAD + MEADMKTPC
| Name | Description |
|---|---|
ADLANTOVICTVL-SP_ITC | Adelanto-Victorville (LADWP DC tie) |
IPPDCADLN_ITC | IPP DC to Adelanto (Utah import) |
TRANS_USAGE_ENE_IMPORT_MW values. Check for the next flow date to see what's scheduled.| Datatype | Description |
|---|---|
LOAD_FORECAST_7DAY | OASIS 7-day ahead load forecast. Available for ~8 days forward. |
LOAD_FORECAST | Current load forecast (similar range). |
LOAD_FORECAST_2DAY | 2-day ahead load forecast. |
DA_DEMAND_FORECAST | Day-ahead demand forecast. |
Use name = 'CAISO' for system-level.
| Datatype | Description |
|---|---|
ORIGINAL_WIND_FORECAST | OASIS original wind forecast. Available ~8 days forward. |
WIND_FORECAST | Current wind forecast. |
WIND_FORECAST_BIDCLOSE | Bid-close vintage wind forecast. |
ORIGINAL_SOLAR_FORECAST | OASIS original solar forecast. |
BIDCLOSE_SOLAR_FORECAST | Bid-close vintage solar forecast. |
Available names: CAISO (system), NP15, SP15, ZP26.
Instead of searching for D+1 with "higher renewables" generically:
The two-day pair scoring should now incorporate: