Use when implementing X-Trend or attention-based trading models. Covers LSTM encoders, cross-attention, self-attention, sequence representations, entity embeddings, Variable Selection Networks, encoder-decoder patterns, Deep Momentum Networks, and interpretable predictions for trend-following strategies.
Complete guide to implementing the X-Trend (Cross Attentive Time-Series Trend Network) architecture, combining LSTMs, attention mechanisms, and few-shot learning for trend-following strategies.
Activate this skill when:
- Implementing X-Trend or other attention-based trading models
- Building LSTM encoder-decoder architectures for financial time series
- Adding cross-attention, entity embeddings, or few-shot context to trend-following strategies
```
Input: Target sequence x[t] + Context set C
        ↓
    [Encoder]
    - LSTM for sequences
    - Entity embeddings
    - Variable Selection Network (VSN)
    - Self-attention over context
    - Cross-attention: target ← context
        ↓
    [Decoder]
    - LSTM with encoder output
    - Dual heads: Forecast + Position
    - PTP (Predictive distribution To Position)
        ↓
Output: Trading position z[t] ∈ [-1, 1]
        + Forecast distribution (μ, σ) or quantiles
```
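The exact PTP mapping is not spelled out here, but one simple, illustrative choice (an assumption, not the paper's definition) scales the forecast mean by forecast volatility and squashes the result into [-1, 1] with tanh:

```python
import torch

def ptp_position(mu: torch.Tensor, sigma: torch.Tensor) -> torch.Tensor:
    """Map a Gaussian forecast (mu, sigma) to a position in [-1, 1].

    Illustrative mapping only: the signal-to-noise ratio mu/sigma is
    squashed with tanh, so confident forecasts take larger positions.
    """
    return torch.tanh(mu / (sigma + 1e-8))

# Positive expected return with low uncertainty -> near-full long position
pos = ptp_position(torch.tensor([0.5, -0.1]), torch.tensor([0.2, 0.4]))
```

Whatever mapping is used, bounding the output keeps positions directly interpretable as a fraction of maximum exposure.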
The model takes 8-dimensional feature vectors combining (following the Deep Momentum Networks feature set):
- Volatility-normalized returns over five lookback horizons (daily, monthly, quarterly, semi-annual, annual)
- MACD trend indicators at three short/long timescale pairs
Normalization formula for a return over horizon t':

```
r_hat[t-t', t] = r[t-t', t] / (σ[t] · sqrt(t'))
```

IMPORTANT: Use an EWMA (exponentially weighted moving average) for the volatility estimate σ[t]:

```python
volatility = prices.pct_change().ewm(span=60).std()
```
See IMPLEMENTATION.md for full code.
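Putting the formula and the EWMA estimate together, a minimal pandas sketch (function name and the 60-day span are illustrative defaults, not fixed by the text):

```python
import numpy as np
import pandas as pd

def normalized_returns(prices: pd.Series, horizon: int, span: int = 60) -> pd.Series:
    """Compute r[t-t', t] / (sigma[t] * sqrt(t')) with EWMA daily volatility."""
    daily = prices.pct_change()
    sigma = daily.ewm(span=span).std()        # EWMA volatility estimate sigma[t]
    r = prices.pct_change(periods=horizon)    # return over the horizon t'
    return r / (sigma * np.sqrt(horizon))

# Example on synthetic random-walk prices
rng = np.random.default_rng(0)
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 500))))
feat = normalized_returns(prices, horizon=21)
```

Dividing by sqrt(t') puts returns over different horizons on the same volatility scale, so all eight features are comparable in magnitude.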
Learns to weight different input features dynamically:
```
v[j,t]    = FFN_j(x[j,t])
w[t]      = softmax(FFN_weight(x[t]))
VSN(x[t]) = Σ_j w[j,t] · v[j,t]
```

Purpose: Automatically determines which features (returns vs. MACD, short-term vs. long-term) are most relevant at each time step.
See IMPLEMENTATION.md for PyTorch implementation.
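A minimal PyTorch sketch of the three equations above (layer sizes and the ELU nonlinearity are illustrative assumptions, not the exact X-Trend hyperparameters):

```python
import torch
import torch.nn as nn

class VariableSelectionNetwork(nn.Module):
    """Per-feature FFNs plus a softmax weighting network over all features."""

    def __init__(self, n_features: int, hidden: int):
        super().__init__()
        # One small FFN per scalar input feature: v[j,t] = FFN_j(x[j,t])
        self.feature_ffns = nn.ModuleList(
            nn.Sequential(nn.Linear(1, hidden), nn.ELU(), nn.Linear(hidden, hidden))
            for _ in range(n_features)
        )
        # Weight network over the full feature vector: w[t] = softmax(FFN_weight(x[t]))
        self.weight_ffn = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ELU(), nn.Linear(hidden, n_features)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_features)
        values = torch.stack(
            [ffn(x[:, j:j + 1]) for j, ffn in enumerate(self.feature_ffns)], dim=1
        )                                                     # (batch, n_features, hidden)
        weights = torch.softmax(self.weight_ffn(x), dim=-1)   # (batch, n_features)
        # Weighted sum over features: VSN(x[t]) = sum_j w[j,t] * v[j,t]
        return (weights.unsqueeze(-1) * values).sum(dim=1)    # (batch, hidden)

vsn = VariableSelectionNetwork(n_features=8, hidden=16)
out = vsn(torch.randn(4, 8))
```

The learned weights `w[j,t]` double as a per-timestep feature-importance signal, which supports the interpretability goal mentioned above.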
Learn asset-specific representations:
Important: Exclude entity embeddings for zero-shot learning (unseen assets).
See IMPLEMENTATION.md for code.
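A minimal sketch of the idea: one learned vector per asset, looked up by an integer id (sizes are illustrative):

```python
import torch
import torch.nn as nn

n_assets, emb_dim = 50, 8
entity_embedding = nn.Embedding(n_assets, emb_dim)   # one learned vector per asset

asset_ids = torch.tensor([0, 3, 7])                  # integer ids for three assets
e = entity_embedding(asset_ids)                      # (3, emb_dim) asset representations

# Zero-shot variant: for assets unseen in training, skip the lookup and
# use zeros (or drop the embedding pathway entirely)
e_zero_shot = torch.zeros(len(asset_ids), emb_dim)
```

Excluding the embedding at inference time is what lets the same trained network generalize to assets with no learned representation.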
LSTM-based encoder with skip connections:
Architecture Pattern: x → VSN → LSTM → (+skip) → LayerNorm → FFN(+entity) → (+skip) → LayerNorm
See IMPLEMENTATION.md for full implementation.
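The skip-connection portion of that pattern can be sketched as follows (the VSN and entity-embedding stages are omitted here, and dimensions are illustrative):

```python
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    """Sketch of: x -> LSTM -> (+skip) -> LayerNorm -> FFN -> (+skip) -> LayerNorm."""

    def __init__(self, d_model: int):
        super().__init__()
        self.lstm = nn.LSTM(d_model, d_model, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_model), nn.ELU(), nn.Linear(d_model, d_model)
        )
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, _ = self.lstm(x)                 # (batch, seq, d_model)
        h = self.norm1(x + h)               # skip connection around the LSTM
        return self.norm2(h + self.ffn(h))  # skip connection around the FFN

enc = EncoderBlock(d_model=16)
y = enc(torch.randn(2, 10, 16))
```

The skip connections let gradients bypass the LSTM and FFN, which stabilizes training of the deeper encoder stack.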
Target sequence attends to context sequences:
```
Attention(Q, K, V) = softmax(QK^T / √d) V
```
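In cross-attention, Q comes from the target sequence while K and V come from the encoded context set. A minimal sketch with PyTorch's built-in multi-head attention (dimensions are illustrative):

```python
import torch
import torch.nn as nn

d_model, n_heads = 16, 4
attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

target = torch.randn(2, 10, d_model)   # queries: target sequence (batch, len_t, d)
context = torch.randn(2, 30, d_model)  # keys/values: context sequences (batch, len_c, d)

out, weights = attn(query=target, key=context, value=context)
# out:     (2, 10, d_model) - target enriched with context information
# weights: (2, 10, 30)      - which context steps each target step attends to,
#                             which supports interpretable predictions
```

The attention weights can be inspected directly to see which historical context regimes the model drew on for a given trading decision.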