Apply statistical methods to financial data including descriptive statistics, covariance estimation, regression, hypothesis testing, and resampling. Use when the user asks about return distributions, correlation between assets, building a covariance matrix, running a CAPM regression, testing whether alpha is significant, checking if returns are normal, or estimating confidence intervals. Also trigger when users mention 'volatility', 'how correlated are these', 'fat tails', 'skewness', 'R-squared', 'beta of a fund', 'bootstrap a Sharpe ratio', 'shrinkage estimator', 'Ledoit-Wolf', or ask why their optimizer produces unstable weights.
This skill enables Claude to apply core statistical methods to financial data, including descriptive statistics, covariance estimation, linear regression, hypothesis testing, and resampling techniques. These methods form the quantitative backbone for portfolio construction, risk measurement, and factor modeling.
0 — Mathematical Foundations
Mean (Expected Value):
$$\mu = E[X] = \frac{1}{n} \sum_{i=1}^{n} x_i$$
The arithmetic average of observed values. For financial returns, this represents the central tendency of the return distribution.
Variance:
Population variance:
$$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2$$
Sample variance (Bessel's correction):
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
Standard Deviation:
$$\sigma = \sqrt{\sigma^2}$$
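In finance, the standard deviation of returns is commonly called volatility. Assuming returns are serially uncorrelated, volatility scales with the square root of time, so monthly volatility is annualized as:
$$\sigma_{annual} = \sigma_{monthly} \times \sqrt{12}$$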
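As a small illustration of the variance, standard deviation, and annualization formulas above, the sketch below uses only Python's standard library. The monthly return values are made up for the example and are not real data.

```python
import math
import statistics

# Hypothetical monthly returns (illustrative values only, not real data).
monthly_returns = [0.02, -0.01, 0.03, 0.00, -0.02, 0.04]

mean_ret = statistics.fmean(monthly_returns)

# Population variance divides by n; sample variance divides by n - 1 (Bessel).
pop_var = statistics.pvariance(monthly_returns)
sample_var = statistics.variance(monthly_returns)

# Volatility is the standard deviation of returns.
monthly_vol = math.sqrt(sample_var)

# Square-root-of-time scaling (assumes serially uncorrelated returns).
annual_vol = monthly_vol * math.sqrt(12)

print(f"mean={mean_ret:.4f}  sample_var={sample_var:.6f}  annual_vol={annual_vol:.4f}")
```

With only six observations the gap between the two variance estimates is noticeable (a factor of 6/5); it shrinks as n grows.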
Skewness:
$$\gamma = \frac{E[(X - \mu)^3]}{\sigma^3}$$
Measures asymmetry of the distribution. Negative skewness (a longer left tail) is common in equity returns: large losses occur more often than gains of the same magnitude.
Excess Kurtosis:
$$\kappa = \frac{E[(X - \mu)^4]}{\sigma^4} - 3$$
Measures tail thickness relative to the normal distribution (which has excess kurtosis of 0). Financial returns typically exhibit positive excess kurtosis (leptokurtosis), meaning fat tails and more frequent extreme events than a normal distribution would predict.
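The skewness and excess kurtosis formulas can be computed directly from central moments. The sketch below assumes NumPy is available and uses the population (divide-by-n) moments, matching the definitions above; the function names and the return series are hypothetical, chosen so that one large loss produces a visible left tail.

```python
import numpy as np

def skewness(x):
    """Third standardized moment, using population (1/n) moments."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    sigma = np.sqrt(np.mean(d ** 2))
    return np.mean(d ** 3) / sigma ** 3

def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    sigma = np.sqrt(np.mean(d ** 2))
    return np.mean(d ** 4) / sigma ** 4 - 3.0

# Hypothetical returns: mostly small gains plus one large loss.
returns = np.array([-0.08, -0.01, 0.00, 0.01, 0.01, 0.02, 0.02, 0.03])

print("skewness:", skewness(returns))
print("excess kurtosis:", excess_kurtosis(returns))
```

For this series the skewness is negative and the excess kurtosis is positive, the pattern the text describes for equity returns. With their default arguments, `scipy.stats.skew` and `scipy.stats.kurtosis` use the same divide-by-n definitions, so they can serve as a cross-check if SciPy is installed.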
Covariance:
$$\text{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$$
Sample covariance:
$$\hat{\text{Cov}}(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$
Covariance measures the linear co-movement between two variables. Positive covariance means they tend to move together; negative covariance means they tend to move in opposite directions.
Correlation (Pearson):
$$\rho(X, Y) = \frac{\text{Cov}(X, Y)}{\sigma_X \times \sigma_Y}$$
Correlation normalizes covariance to the range [-1, +1], making it unit-free and comparable across variable pairs.
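To make the link between the two formulas concrete, the sketch below computes sample covariance and Pearson correlation by hand and cross-checks them against NumPy's built-ins. It assumes NumPy is available; the two return series are hypothetical.

```python
import numpy as np

# Hypothetical returns for two assets over the same five periods.
x = np.array([0.01, 0.03, -0.02, 0.04, 0.00])
y = np.array([0.02, 0.02, -0.01, 0.05, -0.01])

# Sample covariance with the n - 1 denominator.
cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)

# Pearson correlation: covariance divided by the product of the
# sample standard deviations (ddof=1 keeps the denominators consistent).
corr_xy = cov_xy / (x.std(ddof=1) * y.std(ddof=1))

# Cross-check: np.cov uses n - 1 by default; both return 2 x 2 matrices.
cov_np = np.cov(x, y)[0, 1]
corr_np = np.corrcoef(x, y)[0, 1]

print(f"cov={cov_xy:.6f}  corr={corr_xy:.4f}")
```

Rescaling either series (for example, expressing returns in percent) changes the covariance but leaves the correlation unchanged, which is what makes correlation comparable across pairs.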
For a set of p assets with n return observations, the sample covariance matrix is:
$$\hat{\Sigma} = \frac{1}{n-1} (X - \bar{X})^T (X - \bar{X})$$
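where X is the n x p matrix of returns (one row per observation, one column per asset) and each asset's sample mean is subtracted from its column.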
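The curse of dimensionality: When p (number of assets) is large relative to n (number of observations), the sample covariance matrix becomes poorly conditioned, and once p >= n it is singular because its rank cannot exceed n - 1. An optimizer that inverts such a matrix amplifies estimation noise, which produces unstable portfolio weights.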
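The rank problem is easy to reproduce. The sketch below, assuming NumPy is available, builds the sample covariance matrix from the formula above for simulated data with more assets than observations; the helper name `sample_covariance` and the simulation parameters are hypothetical.

```python
import numpy as np

def sample_covariance(X):
    """Sample covariance of an n x p return matrix (rows = observations)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    centered = X - X.mean(axis=0)        # subtract each asset's mean
    return centered.T @ centered / (n - 1)

# Simulated data: 12 monthly observations for 20 assets (p > n).
rng = np.random.default_rng(0)
n_obs, n_assets = 12, 20
X = rng.normal(loc=0.0, scale=0.05, size=(n_obs, n_assets))

S = sample_covariance(X)
rank = np.linalg.matrix_rank(S)

print("shape:", S.shape, "rank:", rank)  # rank is at most n_obs - 1
```

Because the rank is below the number of assets, the matrix has no inverse, so a mean-variance optimizer that relies on inverting it cannot be applied without regularization.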
Shrinkage blends the sample covariance matrix with a structured target (e.g., the identity matrix scaled by average variance) to produce a more stable estimate:
$$\hat{\Sigma}_{shrunk} = \delta \cdot F + (1 - \delta) \cdot \hat{\Sigma}$$
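A minimal sketch of this blend, assuming NumPy is available, with F taken as the identity matrix scaled by the average sample variance. The shrinkage intensity delta is set by hand here purely for illustration; estimators such as Ledoit-Wolf choose it from the data (scikit-learn's `sklearn.covariance.LedoitWolf` is one implementation). The function name and the simulated inputs are hypothetical.

```python
import numpy as np

def shrink_to_scaled_identity(S, delta):
    """Return delta * F + (1 - delta) * S, with F = (average variance) * I."""
    S = np.asarray(S, dtype=float)
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    p = S.shape[0]
    avg_var = np.trace(S) / p            # mean of the diagonal entries
    F = avg_var * np.eye(p)              # structured target
    return delta * F + (1.0 - delta) * S

# Simulated data: 12 observations of 20 assets, so S is singular.
rng = np.random.default_rng(0)
X = rng.normal(loc=0.0, scale=0.05, size=(12, 20))
S = np.cov(X, rowvar=False)

# delta = 0.3 is an arbitrary illustrative choice, not an estimated value.
S_shrunk = shrink_to_scaled_identity(S, delta=0.3)

print("rank before:", np.linalg.matrix_rank(S))
print("rank after: ", np.linalg.matrix_rank(S_shrunk))
print("smallest eigenvalue after:", np.linalg.eigvalsh(S_shrunk).min())
```

Any delta above zero lifts every eigenvalue by delta times the average variance, so the shrunk matrix is positive definite and invertible even when the sample matrix is not. This target also preserves the trace, meaning total variance is unchanged.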