Comparing prior and posterior parameter densities extracted from PyStan with Arviz?

Hi All,

I found this page in the Arviz documentation describing the function arviz.plot_dist_comparison, which compares priors and posteriors, but I’m unfamiliar with how to get the prior out of PyStan to compare with the posterior. Having previously used RStan in conjunction with Bayesplot, I recall being able to call mcmc_areas from Bayesplot after an as.array(...) conversion of the stan_fit object, and I imagine a similar process works in PyStan/Arviz for the theta posteriors. I’m not sure what the best approach is for getting the priors out of PyStan, though, and am looking for examples. Do I need to sample from the priors with normal_rng in the generated quantities block of my Stan code? Thanks for the help!

Hi

Check this page: Stan User’s Guide

And to get them into InferenceData, check this:

https://arviz-devs.github.io/arviz/api/generated/arviz.from_pystan.html

So basically, create and fit one model for the prior and one for the posterior.
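As a rough sketch of what the resulting InferenceData could look like, the snippet below fakes the two sets of draws with NumPy instead of running Stan. The variable name theta and the distributions are made up for illustration, and it assumes an ArviZ 0.x release where az.from_dict and az.plot_dist_comparison are available. With real fits, arviz.from_pystan accepts posterior= and prior= arguments in the same spirit.

```python
import numpy as np
import arviz as az

rng = np.random.default_rng(0)

# Stand-ins for draws from two separate fits, shaped (chain, draw).
# With PyStan these would come from a prior-only fit and the full fit.
prior_draws = rng.normal(0.0, 1.0, size=(4, 500))
posterior_draws = rng.normal(0.8, 0.2, size=(4, 500))

# Put both sets of draws into one InferenceData object.
idata = az.from_dict(
    posterior={"theta": posterior_draws},
    prior={"theta": prior_draws},
)

# One row per variable; columns are prior, posterior, and both overlaid.
axes = az.plot_dist_comparison(idata, var_names=["theta"])
```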


Thank you!

Actually @ahartikainen, I did have one more question. So, in my case, I’m using data from a known data generating process where I know the actual parameter values. Is there a straightforward way in Arviz to plot the “known” data generating parameter values as vertical lines over the prior/posterior density functions?

The output from that function is a matplotlib axes / bokeh figure, so the easiest way is to add those lines manually.

E.g.

ax = az.plot_...
ax.axvline(x)

https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.axvline.html
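A slightly fuller version of the same idea, as a sketch only: it uses az.plot_dist on simulated draws with the matplotlib backend, and true_theta is a hypothetical known parameter value.

```python
import numpy as np
import arviz as az

rng = np.random.default_rng(1)

# Simulated posterior draws; in practice these come from the fit.
draws = rng.normal(0.8, 0.2, size=2000)
true_theta = 0.75  # hypothetical known value from the data generating process

# With the matplotlib backend, plot_dist returns a matplotlib axes.
ax = az.plot_dist(draws, label="posterior")

# Mark the known value with a dashed vertical line.
line = ax.axvline(true_theta, color="k", linestyle="--", label="true value")
ax.legend()
```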


These two examples from my probprog poster and from arviz docs on populating all groups can also be helpful. They still use pystan 2 though.

To add a bit more to @ahartikainen’s answer on axvline: if you have the true values as an xarray dataset (for example from arviz.from_dict, using the observed_data group so that there is no chain or draw dimension), you can use xarray_var_iter to iterate over tuples of var_name, selected_coords, selected_indexes, value. You should then be able to do

truths = az.from_dict(...)  # or raw xarray.Dataset might be better
axes = az.plot_dist_comparison(...)

for ax, (_, _, _, value) in zip(axes[:, -1], xarray_var_iter(truths)):
    ax.axvline(value)

Plotting functions use these helpers under the hood, so the order will be the same if the dataset variables are in the same order. However, to make sure the order is aligned, it is a good idea to pass var_names with the same variables in the same order to both plot_dist_comparison and xarray_var_iter.
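For scalar parameters, a simpler variant that avoids xarray_var_iter altogether is to keep the true values in a plain dict and reuse one var_names list for both the plot and the loop. This is only a sketch with simulated draws: mu, sigma and their true values are invented, and it assumes an ArviZ 0.x release with the matplotlib backend. It does not cover multidimensional variables, where xarray_var_iter is the better fit.

```python
import numpy as np
import arviz as az

rng = np.random.default_rng(2)

# A single ordered list drives both the plot rows and the truth lookup.
var_names = ["mu", "sigma"]
truths = {"mu": 1.0, "sigma": 0.5}  # hypothetical known parameter values

# Simulated draws shaped (chain, draw), standing in for real fits.
idata = az.from_dict(
    posterior={
        "mu": rng.normal(1.0, 0.1, size=(4, 500)),
        "sigma": np.abs(rng.normal(0.5, 0.05, size=(4, 500))),
    },
    prior={
        "mu": rng.normal(0.0, 2.0, size=(4, 500)),
        "sigma": np.abs(rng.normal(0.0, 1.0, size=(4, 500))),
    },
)

axes = az.plot_dist_comparison(idata, var_names=var_names)

# Draw each true value on the last column (the overlaid prior/posterior panel).
lines = [
    row[-1].axvline(truths[name], color="k", linestyle="--")
    for row, name in zip(axes, var_names)
]
```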
