PyStan: A Second Intermediate Tutorial of Bayesian Analysis in Python

I promised a while ago that I’d give a more advanced tutorial of using PySTAN and Python to fit a Bayesian hierarchical model. Well, I’ve been waiting for a while because the paper was in review and then in print. Now, it’s out and I’m super excited! My first pure Python paper, using Python for all data manipulation, analysis, and plotting.

The question was whether temperature affects herbivory by insects in any predictable way. I gathered as many insect species as I could and fed them whatever they ate at multiple temperatures. Check the article for more detail, but the idea was to fit a curve to all 21 herbivore-plant pairs as well as to estimate the overall effect of temperature. We also suspected (incorrectly as it turns out) that plant nutritional quality might be a good predictor of the shape of these curves, so we included that as a group-level predictor.

Anyway, here’s the code, complete with STAN model, posterior manipulations, and some plotting. First, here’s the actual STAN model. NOTE: a lot of data manipulation and whatnot is missing. The point is not to show that but to describe how to fit a STAN model and work with the output. Anyone who wants theĀ full code and data to work with can find it on my website or in Dryad (see the article for a link).

stanMod = """
int<lower = 1> N;
int<lower = 1> J;
vector[N] Temp;
vector[N] y;
int<lower = 1> Curve[N];
vector[J] pctN;
vector[J] pctP;
vector[J] pctH20;
matrix[3, 3] R;

vector[3] beta[J];
real mu_a;
real mu_b;
real mu_c;
real g_a1;
real g_a2;
real g_a3;
real g_b1;
real g_b2;
real g_b3;
real g_c1;
real g_c2;
real g_c3;
real<lower = 0> sigma[J];
cov_matrix[3] Tau;

transformed parameters{
vector[N] y_hat;
vector[N] sd_y;
vector[3] beta_hat[J];
// First, get the predicted value as an exponential curve
// Also make a dummy variable for SD so it can be vectorized
for (n in 1:N){
y_hat[n] <- exp( beta[Curve[n], 1] + beta[Curve[n], 2]*Temp[n] + beta[Curve[n], 3]*pow(Temp[n], 2) );
sd_y[n] <- sigma[Curve[n]];
// Next, for each group-level coefficient, include the group-level predictors
for (j in 1:J){
beta_hat[j, 1] <- mu_a + g_a1*pctN[j] + g_a2*pctP[j] + g_a3*pctH20[j];
beta_hat[j, 2] <- mu_b + g_b1*pctN[j] + g_b2*pctP[j] + g_b3*pctH20[j];
beta_hat[j, 3] <- mu_c + g_c1*pctN[j] + g_c2*pctP[j] + g_c3*pctH20[j];

y ~ normal(y_hat, sd_y);
for (j in 1:J){
beta[j] ~ multi_normal_prec(beta_hat[j], Tau);
mu_a ~ normal(0, 1);
mu_b ~ normal(0, 1);
mu_c ~ normal(0, 1);
g_a1 ~ normal(0, 1);
g_a2 ~ normal(0, 1);
g_a3 ~ normal(0, 1);
g_b1 ~ normal(0, 1);
g_b2 ~ normal(0, 1);
g_b3 ~ normal(0, 1);
g_c1 ~ normal(0, 1);
g_c2 ~ normal(0, 1);
g_c3 ~ normal(0, 1);
sigma ~ uniform(0, 100);
Tau ~ wishart(4, R);

# fit the model!
fit = pystan.stan(model_code=stanMod, data=dat,
 iter=10000, chains=4, thin = 20)

Not so bad, was it? It’s actually pretty straightforward.

After the model has been run, we work with the output. We can check traceplots of various parameters:

fit.plot(['mu_a', 'mu_b', 'mu_c'])
fit.plot(['g_a1', 'g_a2', 'g_a3'])
fit.plot(['g_b1', 'g_b2', 'g_b3'])
fit.plot(['g_c1', 'g_c2', 'g_c3'])

As a brief example, we can extract the overall coefficients and plot them:

mus = fit.extract(['mu_a', 'mu_b', 'mu_c'])
mus = pd.DataFrame({'Intercept' : mus['mu_a'], 'Linear' : mus['mu_b'], 'Quadratic' : mus['mu_c']})

py.plot(mus.median(), range(3), 'ko', ms = 10)
py.hlines(range(3), mus.quantile(0.025), mus.quantile(0.975), 'k')
py.hlines(range(3), mus.quantile(0.1), mus.quantile(0.9), 'k', linewidth = 3)
py.axvline(0, linestyle = 'dashed', color = 'k')
py.xlabel('Median Coefficient Estimate (80 and 95% CI)')
py.yticks(range(3), ['Intercept', 'Exponential', 'Gaussian'])
py.ylim([-0.5, 2.5])
py.title('Overall Coefficients')

The resulting plot:


We can also make a prediction line with confidence intervals:

#first, define a prediction function
def predFunc(x, v = 1):
 yhat = np.exp( x[0] + x[1]*xPred + v*x[2]*xPred**2 )
 return pd.Series({'yhat' : yhat})

# next, define a function to return the quantiles at each predicted value
def quantGet(data , q):
 quant = []
 for i in range(len(xPred)):
 val = []
 for j in range(len(data)):
 val.append( data[j][i] )
 quant.append( np.percentile(val, q) )
 return quant

# make a vector of temperatures to predict (and convert to the real temperature scale)
xPred = np.linspace(feeding_Final['Temp_Scale'].min(), feeding_Final['Temp_Scale'].max(), 100)
realTemp = xPred * feeding_Final['Temperature'].std() + feeding_Final['Temperature'].mean()

# make predictions for every chain (in overall effects)
ovPred = mus.apply(predFunc, axis = 1)

# get lower and upper quantiles
ovLower = quantGet(ovPred['yhat'], 2.5)
ovLower80 = quantGet(ovPred['yhat'], 10)
ovUpper80 = quantGet(ovPred['yhat'], 90)
ovUpper = quantGet(ovPred['yhat'], 97.5)

# get median predictions
ovPred = predFunc(mus.median())

Then, just plot the median (ovPred) and the quantiles against temperature (realTemp). With just a little effort, you can wind up with something that looks pretty good:


I apologize for only posting part of the code, but the full script is really long. This should serve as a pretty good start for anyone looking to use Python as their Bayesian platform of choice. Anyone interested can get the data and full script from my article or website and give it a try! It’s all publicly available.