Posterior predictive plot in when using PyStan

Hello all,
I just need you assistance with using PyStan ( I am bit new to python)

While running a regression in stan , at the end I have a code :

generated quantities {

vector[T] y_predict;

for (t in 1:T) {
y_predict[t]  = student_t_rng(nu, mu[t]+X[t]*beta , sigma_y);

}

The idea here is to generate simulation of y_predict and plot its density along with the original y value. I think this is called posterior predictive check.

I can do this in R by :

y = unmet_data$Unmet_need_level_per_100
yrep1 ← extract(fit_model)[[“y_predict”]]
samp1000 ← sample(nrow(yrep1),1000)
ppc_dens_overlay(y, yrep1[samp1000,])

and I get a plot.

How can this be done in Python when I use PyStan?

Thanks for help in advance.

1 Like

You can use ArviZ

https://arviz-devs.github.io/arviz/examples/plot_ppc.html
https://arviz-devs.github.io/arviz/api/generated/arviz.plot_ppc.html

import arviz as az
idata = az.from_pystan(posterior=fit, posterior_predictive=["y_predict"], observed_data=["y"])
ax = az.plot_ppc(idata, data_pairs={"y": "y_predict"})
2 Likes

Hello Thank you. I am trying to run it but it is coming up with an error. Here is the full code:
model_code ="""

data {
int<lower=1> T; // Number of timesteps

vector[T] y; // observed value
int<lower=0> P;
matrix[T,P] X; // covariate

}

parameters {
real<lower=0> sigma_mu;
real<lower=0> sigma_mu_1;
real<lower=0> sigma_delta;
real<lower=0> sigma_delta_1;
real<lower=0> sigma_y;
vector[P] beta;
real<lower=0> nu;
vector[T] mu;
vector[T] delta;

}

transformed parameters {

}

model {
sigma_mu_1 ~ cauchy(0,5);
sigma_mu ~ cauchy(0,5);

sigma_delta_1 ~ cauchy(0,5);
sigma_delta ~ cauchy(0,5);

sigma_y ~ cauchy(0,5);
beta ~ normal(0,5);

mu[1] ~ normal(0, sigma_mu_1);
delta[1] ~ normal(0, sigma_delta_1);

for(t in 2:T){
mu[t] ~ normal(mu[t-1] + delta[t-1],sigma_mu);
delta[t] ~ normal(delta[t-1],sigma_delta);
}

nu ~ cauchy(0,5);

for(t in 1:T){
y[t] ~ student_t(nu,mu +X[t]*beta , sigma_y);
}
}

generated quantities {

vector[T] y_predict;
vector[T] log_lik;

for (t in 1:T) {
y_predict[t]  = student_t_rng(nu, mu[t]+X[t]*beta , sigma_y);
log_lik[t] = student_t_lpdf(y[t] |nu,mu[t]+X[t]*beta , sigma_y);

}
}

“”"

unmet_model = pystan.StanModel(model_code=model_code)

X = unmet_data[[“Children_in_1000_level”,“Unemploy_per_100_level”]]

data = dict(T=len(unmet_data), y=unmet_data.Unmet_need_level_per_100, X = X, P =(len(X.columns)))

fit_unmet_model = unmet_model.sampling(data=data,iter=1000, chains=4, control=dict(max_treedepth=12))
import arviz as az
idata = az.from_pystan(posterior=fit_unmet_model, posterior_predictive=[“y_predict”])
ax = az.plot_ppc(idata, data_pairs={“y”: “y_predict”})

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in
** 81 import arviz as az**
** 82 idata = az.from_pystan(posterior=fit_unmet_model, posterior_predictive=[“y_predict”])**
—> 83 ax = az.plot_ppc(idata, data_pairs={“y”: “y_predict”})

~\Anaconda3\lib\site-packages\arviz\plots\ppcplot.py in plot_ppc(data, kind, alpha, mean, observed, color, grid, figsize, textsize, data_pairs, var_names, filter_vars, coords, flatten, flatten_pp, num_pp_samples, random_seed, jitter, animated, animation_kwargs, legend, ax, backend, backend_kwargs, group, show)
** 196 for groups in ("{}_predictive".format(group), “observed_data”):**
** 197 if not hasattr(data, groups):**
→ 198 raise TypeError(
** 199 ’data argument must have the group “{group}” for ppcplot’.format(group=groups)**
** 200 )**

TypeError: data argument must have the group “observed_data” for ppcplot

1 Like

Oh yes, I had a missing part in the original line (observed_data)

idata = az.from_pystan(posterior=fit, posterior_predictive=["y_predict"], observed_data=["y"])

(I will also update the original answer)

1 Like

Thanks very much. appreciate it!