Specifies probabilistic models in PyMC, runs MCMC/VI, diagnoses convergence, interprets posteriors, and performs posterior predictive checks
Workflow for building, sampling, diagnosing, and reporting Bayesian probabilistic models using PyMC.
Write out the model mathematically before coding, using LaTeX dollar notation
($...$ for inline, $$...$$ for display equations):
Likelihood:
$$\mu_i = \alpha + \beta_1 x_{1i} + \beta_2 x_{2i}$$
$$y_i \sim \mathcal{N}(\mu_i,\; \sigma)$$
Priors:
$$\alpha \sim \mathcal{N}(0, 10), \qquad \beta_1, \beta_2 \sim \mathcal{N}(0, 2), \qquad \sigma \sim \text{HalfNormal}(5)$$
For each parameter, choose a prior by reasoning about plausible effect sizes, then run `pm.sample_prior_predictive()` to visualise the outcomes the priors imply:

```python
import pymc as pm
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data standing in for the real design matrix X and outcome y
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 2))
y = 1.0 + 0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(size=100)

with pm.Model() as model:
    alpha = pm.Normal("alpha", mu=0, sigma=10)
    beta = pm.Normal("beta", mu=0, sigma=2, shape=2)
    sigma = pm.HalfNormal("sigma", sigma=5)
    mu = alpha + beta[0] * X[:, 0] + beta[1] * X[:, 1]
    y_obs = pm.Normal("y_obs", mu=mu, sigma=sigma, observed=y)

    # Prior predictive check
    prior_pred = pm.sample_prior_predictive(draws=200)

fig, ax = plt.subplots()
ax.hist(prior_pred.prior_predictive["y_obs"].values.flatten(), bins=50)
ax.set_title("Prior predictive distribution")
ax.set_xlabel("y")
fig.savefig("figures/bayes_prior_predictive.png", dpi=150)
```
```python
with model:
    trace = pm.sample(
        draws=2000,
        tune=1000,
        chains=4,
        cores=4,
        random_seed=42,
        target_accept=0.9,
    )
```
```python
import arviz as az

# Numerical convergence diagnostics: R-hat < 1.01 and ESS_bulk > 400 required
summary = az.summary(trace, var_names=["alpha", "beta", "sigma"])
print(summary)

# Trace plots
az.plot_trace(trace, var_names=["alpha", "beta", "sigma"])
plt.savefig("figures/bayes_trace.png", dpi=150)

# Energy plot
az.plot_energy(trace)
plt.savefig("figures/bayes_energy.png", dpi=150)

# Rank plots (MCMC mixing diagnostic)
az.plot_rank(trace, var_names=["alpha", "beta"])
plt.savefig("figures/bayes_rank.png", dpi=150)
```
Convergence thresholds: $\hat{R} < 1.01$ and $\text{ESS}_\text{bulk} > 400$ for every parameter, and no divergent transitions.
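The `r_hat` column that `az.summary` reports can be sketched in plain NumPy using the split-$\hat{R}$ idea: split each chain in half so a trending chain counts as two disagreeing ones, then compare between-chain and within-chain variance. `split_rhat` below is an illustrative helper, not part of the ArviZ API (ArviZ additionally rank-normalizes the draws):

```python
import numpy as np

def split_rhat(draws):
    """Split R-hat for one parameter; `draws` has shape (n_chains, n_draws)."""
    n_chains, n_draws = draws.shape
    half = n_draws // 2
    # Split each chain in half so within-chain trends also inflate the statistic
    chains = np.concatenate([draws[:, :half], draws[:, half:2 * half]], axis=0)
    n = chains.shape[1]
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()   # mean within-chain variance
    B = n * chain_means.var(ddof=1)         # between-chain variance
    var_hat = (n - 1) / n * W + B / n       # pooled posterior-variance estimate
    return float(np.sqrt(var_hat / W))

rng = np.random.default_rng(0)
good = rng.normal(size=(4, 1000))           # four well-mixed chains
bad = good + 5.0 * np.arange(4)[:, None]    # chains stuck at different locations
```

When the chains agree, `split_rhat(good)` is close to 1; the offset chains in `bad` push it far above the 1.01 threshold.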
```python
# Posterior distribution plot
az.plot_posterior(trace, var_names=["beta"], hdi_prob=0.94)
plt.savefig("figures/bayes_posterior.png", dpi=150)
print(az.summary(trace, var_names=["beta"], hdi_prob=0.94))
```
Report the posterior mean and 94% HDI (Highest Density Interval) for each parameter of interest.
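The HDI that ArviZ reports is the narrowest interval containing the requested share of the posterior draws. A minimal NumPy sketch of what `az.hdi` computes for a unimodal posterior (`hdi` below is an illustrative helper, not the ArviZ function):

```python
import numpy as np

def hdi(samples, prob=0.94):
    """Narrowest interval containing `prob` of the samples."""
    s = np.sort(samples)
    n = len(s)
    k = int(np.floor(prob * n))      # number of draws the interval must cover
    widths = s[k:] - s[:n - k]       # width of every candidate window
    i = int(np.argmin(widths))       # left edge of the narrowest one
    return float(s[i]), float(s[i + k])

rng = np.random.default_rng(1)
draws = rng.normal(loc=0.43, scale=0.08, size=8000)  # stand-in posterior draws
lo, hi = hdi(draws)
```

For a symmetric posterior the HDI nearly coincides with the central (equal-tailed) interval; for skewed posteriors the two differ, which is why the HDI is preferred for reporting.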
```python
with model:
    pm.sample_posterior_predictive(trace, extend_inferencedata=True)

az.plot_ppc(trace)
plt.savefig("figures/bayes_ppc.png", dpi=150)
```
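Beyond the visual overlay, a posterior predictive check can be made numeric with a posterior-predictive p-value: compare a test statistic on the observed data against its distribution over replicated datasets. A NumPy sketch, assuming `y_rep` holds one replicated dataset per posterior draw (with PyMC these would come from `trace.posterior_predictive`); `ppc_pvalue` is an illustrative helper:

```python
import numpy as np

def ppc_pvalue(y_obs, y_rep, stat=np.std):
    """Fraction of replicated datasets whose statistic exceeds the observed one.

    Values near 0 or 1 flag an aspect of the data the model fails to capture.
    `y_rep` has shape (n_draws, n_obs): one replicated dataset per draw.
    """
    t_obs = stat(y_obs)
    t_rep = np.apply_along_axis(stat, 1, y_rep)
    return float(np.mean(t_rep >= t_obs))

rng = np.random.default_rng(2)
y_obs = rng.normal(size=200)
y_rep = rng.normal(size=(1000, 200))    # replicates from a well-specified model
p_ok = ppc_pvalue(y_obs, y_rep)         # p-value for the spread (std) of the data
p_bad = ppc_pvalue(3.0 * y_obs, y_rep)  # observed spread far exceeds replicates
```

Choosing `stat` to target the feature you care about (spread, skewness, maximum) makes the check sensitive to the specific way a model might misfit.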
We specified a Bayesian linear regression with weakly informative priors
($\alpha \sim \mathcal{N}(0, 10)$, $\beta \sim \mathcal{N}(0, 2)$,
$\sigma \sim \text{HalfNormal}(5)$).
Posterior sampling used NUTS (4 chains × 2,000 draws; 1,000 warm-up).
All parameters converged ($\hat{R} < 1.01$; $\text{ESS}_\text{bulk} > 800$).
$\beta_1$ had a posterior mean of 0.43 (94% HDI: [0.28, 0.58]), providing
strong evidence of a positive association with the outcome.
Math notation: Always render mathematical quantities in LaTeX dollar notation. Use $...$ for inline expressions (e.g., $\hat{R} < 1.01$, $\beta_1 = 0.43$) and $$...$$ for display equations (e.g., $$y_i \sim \mathcal{N}(\mu_i, \sigma)$$).