Ensure Alpaca API is used for quality data, not yfinance fallback. Trigger when: (1) crypto volume filter fails unexpectedly, (2) zero-volume bars in data, (3) API key configuration issues.
| Item | Details |
|---|---|
| Date | 2024-12-28 |
| Goal | Fix crypto selection failing due to poor data quality |
| Environment | alpaca_trading/data/fetcher.py, training notebook |
| Status | Success - v2.5.0 now fails early |
User reported ALL 18 crypto symbols failing selection filters:
- BTCUSD: failed price (max_price was $10k but BTC trades near $87k)
- ETHUSD: failed data_quality (51% zero-volume bars)
- Others: failed volume, trading_status
Root Cause: Alpaca API keys weren't configured, so DataFetcher fell back to yfinance. yfinance crypto data has roughly 50% zero-volume bars, which makes symbols fail the volume and data_quality filters.
OLD BEHAVIOR (v2.4.x): Warned about yfinance but continued anyway, wasting time debugging filter failures.
NEW BEHAVIOR (v2.5.0): Training notebook FAILS IMMEDIATELY if crypto is enabled without Alpaca API:
# In training notebook cell-16
alpaca_enabled = data_fetcher._use_alpaca_data
if not alpaca_enabled and selection_config.crypto.enabled:
    raise ValueError(
        'CRYPTO TRAINING REQUIRES ALPACA API. '
        'yfinance crypto data has ~50% missing volume. '
        'Set API keys or disable crypto training.'
    )
When crypto is enabled without Alpaca API:
======================================================================
ERROR: CRYPTO TRAINING REQUIRES ALPACA API
======================================================================
yfinance crypto data is unusable for training:
- ~50% zero-volume bars
- Causes all symbols to fail volume/data_quality filters
- Training on bad data produces bad models
FIX OPTIONS:
1. Set Alpaca API keys in Colab Secrets:
- APCA_API_KEY_ID = your_key
- APCA_API_SECRET_KEY = your_secret
2. Set API_KEYS_FILE in previous cell:
API_KEYS_FILE = "/content/Alpaca_trading/API_key_500Paper.txt"
3. Disable crypto and train only equities:
selection_config.crypto.enabled = False
selection_config.equities.enabled = True
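The fail-fast check can also be factored into a small helper so it is testable outside the notebook. This is a minimal sketch, not the project's actual code: `DataSourceError` and `require_alpaca_for_crypto` are hypothetical names, and the two booleans stand in for `data_fetcher._use_alpaca_data` and `selection_config.crypto.enabled`.

```python
class DataSourceError(ValueError):
    """Raised when a required data source is unavailable (hypothetical name)."""


def require_alpaca_for_crypto(use_alpaca: bool, crypto_enabled: bool) -> None:
    """Fail fast when crypto training is requested without Alpaca data.

    Does nothing when crypto is disabled or Alpaca data is available.
    """
    if crypto_enabled and not use_alpaca:
        raise DataSourceError(
            "CRYPTO TRAINING REQUIRES ALPACA API. "
            "yfinance crypto data has ~50% missing volume. "
            "Set API keys or disable crypto training."
        )
```

In the notebook this would be called as `require_alpaca_for_crypto(data_fetcher._use_alpaca_data, selection_config.crypto.enabled)` before symbol selection starts. Subclassing `ValueError` keeps it compatible with any caller that already catches the error raised in cell-16.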
| Attempt | Why it Failed | Lesson Learned |
|---|---|---|
| Lower volume consistency to 30% | Masks real problem, still fails | Fix data source, not thresholds |
| Skip data quality filter for crypto | Still fails price/trading_status | Garbage in = garbage out |
| Simplify all crypto filters | Works but produces bad models | Use quality data, not workarounds |
| Just warn about yfinance | User ignores warning, wastes time | FAIL FAST is better |
| Data Source | Volume Data | Zero-Volume Bars | Crypto Support |
|---|---|---|---|
| Alpaca API | Complete | <1% | Excellent |
| yfinance | 50% missing | ~50% | Unusable |
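To see which row of the table a given dataset falls into, the zero-volume share can be measured directly. The sketch below works on a plain list of per-bar volumes; the function names and the 5% threshold are assumptions for illustration, not values taken from the selection filters.

```python
from typing import Optional, Sequence


def zero_volume_fraction(volumes: Sequence[Optional[float]]) -> float:
    """Return the share of bars whose volume is zero, None, or NaN."""
    if not volumes:
        raise ValueError("no bars to inspect")
    # v != v is True only for NaN, so missing values count as bad bars too.
    bad = sum(1 for v in volumes if v is None or v != v or v == 0)
    return bad / len(volumes)


def looks_like_fallback_data(
    volumes: Sequence[Optional[float]],
    max_zero_fraction: float = 0.05,  # assumed threshold, tune as needed
) -> bool:
    """Heuristic: too many zero-volume bars suggests a poor data source."""
    return zero_volume_fraction(volumes) > max_zero_fraction
```

A result near 0.5 on crypto bars matches the yfinance row, while Alpaca data should stay well under the threshold.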
APCA_API_KEY_ID = your API key
APCA_API_SECRET_KEY = your secret key
# Cell 14: Option 1 - Environment variables (recommended)
# Keys are read from Colab Secrets automatically
# Cell 15: Option 2 - Keys file (after unzipping repo)
API_KEYS_FILE = '/content/Alpaca_trading/API_key_500Paper.txt'
# Add to ~/.bashrc or ~/.zshrc
export APCA_API_KEY_ID="your_key"
export APCA_API_SECRET_KEY="your_secret"
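Before constructing the fetcher, it can help to confirm that both variables are actually visible to the Python process, since a shell profile edit does not affect a kernel that is already running. A minimal sketch using only the standard library; `missing_alpaca_vars` is a hypothetical helper name.

```python
import os
from typing import List, Mapping

REQUIRED_VARS = ("APCA_API_KEY_ID", "APCA_API_SECRET_KEY")


def missing_alpaca_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


missing = missing_alpaca_vars()
if missing:
    print("Missing Alpaca variables:", ", ".join(missing))
else:
    print("Alpaca variables are set")
```

Only the variable names are printed, never the values, so the output is safe to paste into a bug report.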
from alpaca_trading.data.fetcher import DataFetcher
fetcher = DataFetcher(keys_file=API_KEYS_FILE)
# This is checked BEFORE selection in v2.5.0
if not fetcher._use_alpaca_data:
    print('Alpaca API: NOT AVAILABLE')
    # Notebook will fail here if crypto enabled