Domain knowledge for the Evolution Engine — LLM-powered autonomous strategy discovery from raw OHLCV data. Covers the generate-backtest-select-evolve loop, vectorized backtesting, out-of-sample validation, and strategy graduation. Use when discovering trading patterns, running backtests, evolving strategies, or reviewing evolution logs. Triggers on "evolve", "discover patterns", "backtest", "evolution", "strategy generation", "candidate strategy".
The Evolution Engine autonomously discovers trading strategies from raw price data. It uses LLM-powered pattern generation combined with vectorized backtesting to evolve, test, and graduate viable trading rules — without manual rule writing.
This is not parameter optimization on a known strategy. It's open-ended strategy discovery: the LLM proposes novel entry/exit logic, the engine validates it against real data, and natural selection eliminates the losers.
OHLCV Data → LLM Generation → Vectorized Backtest → Selection → Mutation → Repeat
↓
Out-of-Sample Validation
↓
Graduated Strategies
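
The loop in the diagram can be sketched roughly as follows. This is a minimal illustration, not the engine's implementation: `Candidate`, `evolve`, and the three `toy_*` callables are hypothetical names, the LLM generation step is replaced by a random stand-in, and the "backtest" is a toy fitness function rather than a run over OHLCV data.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    """Hypothetical container for one candidate strategy."""
    lookback: int        # illustrative parameter, not taken from the engine
    score: float = 0.0   # in-sample fitness assigned by the backtest step


def evolve(
    generate: Callable[[int], List[Candidate]],
    backtest: Callable[[Candidate], float],
    mutate: Callable[[Candidate], Candidate],
    population: int = 8,
    survivors: int = 2,
    generations: int = 3,
) -> List[Candidate]:
    """Generate -> backtest -> select -> mutate, repeated for N generations."""
    pool = generate(population)
    for gen in range(generations):
        # Backtest: score every candidate in the current pool.
        for cand in pool:
            cand.score = backtest(cand)
        # Selection: keep only the top performers.
        pool.sort(key=lambda c: c.score, reverse=True)
        elite = pool[:survivors]
        if gen == generations - 1:
            return elite  # these would go on to out-of-sample validation
        # Mutation: refill the pool with variations of the survivors.
        children = [mutate(elite[i % len(elite)])
                    for i in range(population - len(elite))]
        pool = elite + children
    return []


# Toy stand-ins so the sketch runs without an LLM or market data.
rng = random.Random(0)


def toy_generate(n: int) -> List[Candidate]:
    return [Candidate(lookback=rng.randint(5, 60)) for _ in range(n)]


def toy_backtest(cand: Candidate) -> float:
    # Toy fitness that peaks at lookback == 20.
    return -float(abs(cand.lookback - 20))


def toy_mutate(cand: Candidate) -> Candidate:
    return Candidate(lookback=max(2, cand.lookback + rng.randint(-3, 3)))


winners = evolve(toy_generate, toy_backtest, toy_mutate)
```

Because the survivors are carried over unchanged into the next generation, the best score in this sketch never gets worse from one generation to the next. The survivors returned at the end are scored in-sample only, so they still need out-of-sample validation before graduation.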
| Tool | Purpose |
|---|---|
| `evolution_fetch_market_data` | Fetch OHLCV data from Binance for a symbol/timeframe/period |
| `evolution_discover_patterns` | LLM-powered pattern discovery; generates N candidate strategies |
| `evolution_run_backtest` | Backtest a single candidate; returns Sharpe, win rate, drawdown |
| `evolution_evolve_strategy` | Full evolution loop: generate → backtest → select → mutate × N generations |
| `evolution_get_log` | History of evolution runs: graduated strategies, graveyard, metrics |
Every backtest produces:
| Metric | Minimum for Graduation |
|---|---|
| Sharpe Ratio | > 1.0 (OOS) |
| Win Rate | > 40% |
| Max Drawdown | < 25% |
| Number of Trades | > 30 (statistical significance) |
| Profit Factor | > 1.2 |
These thresholds are guidelines. Context matters — a Sharpe of 0.9 with 500 trades may be more reliable than 2.5 with 15 trades.
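
As a rough sketch, the guideline thresholds could be applied like this. The metric key names and the `failed_checks` helper are hypothetical, and win rate and drawdown are assumed to be expressed as fractions. The function returns the list of failed checks rather than a hard pass/fail, so a borderline result can still be judged in context.

```python
from typing import Dict, List, Tuple

# Guideline thresholds copied from the table above; key names are hypothetical.
THRESHOLDS: Dict[str, Tuple[str, float]] = {
    "oos_sharpe": (">", 1.0),
    "win_rate": (">", 0.40),
    "max_drawdown": ("<", 0.25),
    "num_trades": (">", 30),
    "profit_factor": (">", 1.2),
}


def failed_checks(metrics: Dict[str, float]) -> List[str]:
    """Return one human-readable entry per guideline the metrics miss."""
    failures = []
    for key, (op, limit) in THRESHOLDS.items():
        value = metrics[key]
        ok = value > limit if op == ">" else value < limit
        if not ok:
            failures.append(f"{key}={value} (needs {op} {limit})")
    return failures


# The two cases from the note above. Only Sharpe and trade count come from
# the text; the remaining values are made up for illustration.
steady = {"oos_sharpe": 0.9, "win_rate": 0.48, "max_drawdown": 0.12,
          "num_trades": 500, "profit_factor": 1.3}
lucky = {"oos_sharpe": 2.5, "win_rate": 0.60, "max_drawdown": 0.10,
         "num_trades": 15, "profit_factor": 2.0}

steady_failures = failed_checks(steady)
lucky_failures = failed_checks(lucky)
```

Here `steady` misses only the Sharpe guideline and `lucky` misses only the trade-count guideline, which is exactly the situation where a reviewer has to weigh sample size against headline performance.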
Use `evolution_get_log` to review what was tried, what failed, and why.

| Mistake | Why It's Bad | Fix |
|---|---|---|
| Too few data points | Strategies overfit to noise | Use 90+ days of data for 1h bars, 180+ days for 4h bars |
| Skipping OOS validation | In-sample Sharpe of 3.0 means nothing | Always validate on held-out data |
| Too many generations | Overfitting through excessive selection pressure | 3-5 generations is usually sufficient |
| Deploying immediately | No buffer for regime changes | Paper trade 2-4 weeks first |
| Ignoring the graveyard | Re-discovering dead strategies wastes compute | Review evolution_get_log before new runs |
| Using correlated symbols | BTCUSDT and ETHUSDT strategies overlap heavily | Test on uncorrelated markets |
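
Out-of-sample validation depends on holding back data the strategy never saw during evolution. One way to do that is a chronological split, sketched below. The `chronological_split` helper and the 30% hold-out fraction are assumptions for illustration; the source does not state how the engine partitions its data.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def chronological_split(
    bars: Sequence[T], oos_fraction: float = 0.3
) -> Tuple[List[T], List[T]]:
    """Split time-ordered bars into (in_sample, out_of_sample).

    The most recent bars are held out. The data is never shuffled, so no
    future information leaks into the in-sample period.
    """
    if not 0.0 < oos_fraction < 1.0:
        raise ValueError("oos_fraction must be strictly between 0 and 1")
    n_oos = round(len(bars) * oos_fraction)
    cut = len(bars) - n_oos
    return list(bars[:cut]), list(bars[cut:])


# 90 days of 1h bars is 90 * 24 = 2160 bars; integers stand in for real bars.
bars = list(range(90 * 24))
in_sample, out_of_sample = chronological_split(bars)
```

With the default fraction, the 2160 bars split into 1512 in-sample and 648 held-out bars, and every held-out bar is later in time than every in-sample bar.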
`ANTHROPIC_API_KEY`: Required for LLM-powered pattern discovery
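
A small pre-flight check can fail fast when the key is missing, before any evolution run starts. This is a sketch only: `require_api_key` is a hypothetical helper, not part of the engine.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return ANTHROPIC_API_KEY, or raise if it is unset or blank."""
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; "
            "LLM-powered pattern discovery requires it."
        )
    return key
```

Passing a mapping explicitly, as in `require_api_key({"ANTHROPIC_API_KEY": "..."})`, makes the check easy to exercise in tests without touching the real environment.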