How to extract, plot and predict from PyStan 3

Hi, a little premise: I’m a GIS/web developer, not a statistician, and I understand very little of Stan. Nevertheless, I have been tasked with converting an RStan script to Python with PyStan 3 for a research project. The colleague who wrote the R code left some time ago and won’t help much, so I was left to my own devices. Below are the Stan model code, my Python implementation of the Stan calls, and some demo data. I can run the model, but I don’t know how to plot the predicted and observed variables on an x-y chart, or how to extract a predicted value for a given x.
Thank you in advance.

import stan
import numpy as np


model_code_2s = """
        data {
          int<lower=0> N; // number of measurements
        
          vector[N] h;
          vector[N] Q_meas;
          vector<lower=0>[N] Q_error;
          
          real mu_a1;
          real sigma_a1;
          real mu_b1;
          real sigma_b1;
          real mu_c1;
          real sigma_c1;
          real mu_k1;
          real sigma_k1;
          real mu_a2;
          real sigma_a2;
          real mu_c2;
          real sigma_c2;
          
          int<lower=0> N_ext;
          vector[N_ext] h_ext;
        }
        
        parameters {
          real<lower=0> a1;
          real<upper=min(h)> b1;
          real<lower=0> c1;
          
          real<lower=0> a2;
          real<lower=0> c2;
          
          real<lower=b1> k1;
          
          real<lower=0> sigma;
        }
        
        transformed parameters {
        
          real b2 = k1 - pow(a1/a2,1/c2)*pow(k1-b1,c1/c2);
          
          vector[N] log_mu;
          
          for (i in 1:N) {
            if (h[i] <= k1) {
              log_mu[i] = log(a1) + c1*log(h[i]-b1);
            } else {
              log_mu[i] = log(a2) + c2*log(h[i]-b2);
            }
          }
          
          vector[N] mu = exp(log_mu);
          vector[N] sigma_mu = mu*sigma;
          
          vector[N_ext] mu_ext;
          
          for (i in 1:N_ext) {
            if (h_ext[i] <= k1) {
              mu_ext[i] = a1*(h_ext[i] - b1)^c1;
            } else {
              mu_ext[i] = a2*(h_ext[i] - b2)^c2;
            }
          }
        
        }
        
        model {
        
          // priors
          a1 ~ normal(mu_a1, sigma_a1);
          b1 ~ normal(mu_b1, sigma_b1);
          c1 ~ normal(mu_c1, sigma_c1);
          
          k1 ~ normal(mu_k1, sigma_k1);
          
          a2 ~ normal(mu_a2, sigma_a2);
          c2 ~ normal(mu_c2, sigma_c2);
          
          sigma ~ uniform(0,10000);
          
          // likelihood
          Q_meas ~ normal(mu, ((sigma_mu)^2 + Q_error^2)^0.5);
        
        }
        
        generated quantities {
          
          vector[N] Q_res; // residuals Q_res = Q_meas - Q_mu (accounting for uncertainty in Q_meas)
        
          for (i in 1:N) {
            Q_res[i] = normal_rng(Q_meas[i],Q_error[i]) - normal_rng(mu[i],sigma_mu[i]);
          }
          
          vector[N_ext] Q_ext;
          
          for (i in 1:N_ext) {
            if (h_ext[i] <= k1) {
              Q_ext[i] = normal_rng(a1*(h_ext[i]-b1)^c1,mu_ext[i]*sigma);
            } else {
              Q_ext[i] = normal_rng(a2*(h_ext[i]-b2)^c2,mu_ext[i]*sigma);
            }
          }
          
          
        }
        """

h = [3600.0, 2000.0, 1400.0, 1300.0]
Q_meas = [1110.2, 194.4, 90.6, 69.1]
Q_error = [61.3, 8.7, 4.8, 3.7]

N = len(h)
h_ext = np.arange(1300.0, 3600.0, 10) # from min(h) to max(h) in steps of 10
N_ext = len(h_ext)

data_to_fit = {
        'N': N,
        'h': h,
        'Q_meas': Q_meas,
        'Q_error': Q_error,
        'mu_a1': 20, 
        'sigma_a1': 10,
        'mu_b1': 0,
        'sigma_b1': 5,
        'mu_c1': 2,
        'sigma_c1': 0.5,
        'mu_k1': 4,
        'sigma_k1': 2,
        'mu_a2': 20,
        'sigma_a2': 10,
        'mu_c2': 2,
        'sigma_c2': 0.5,      
        'N_ext': N_ext,
        'h_ext': h_ext
        }
posterior = stan.build(model_code_2s, data = data_to_fit, random_seed = 42)
fit = posterior.sample(num_chains = 4, num_samples = 5000, num_warmup = 1000)
  • Operating System: Ubuntu 20.04
  • Python Version: 3.8
  • PyStan Version: 3

You can access the parameters from the fit object as NumPy arrays, and after that it is a straightforward exercise to plot two variables on a scatter plot:

    import matplotlib.pyplot as plt

    q_ext = fit["Q_ext"]
    # In PyStan 3 the extracted array has the parameter dimensions first
    # and the draws last, so q_ext.shape is (N_ext, n_draws)
    n_draws = q_ext.shape[1]

    # plot a random subset of draws to keep the figure readable
    for i in np.random.choice(n_draws, size=min(100, n_draws), replace=False):
        plt.plot(h_ext, q_ext[:, i], c="k", alpha=0.1)

    plt.plot(h, Q_meas, lw=2, marker="o", markersize=8, color="r")
    plt.show()
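For the second part of the question (extracting a prediction for a given x), you can summarise the posterior predictive draws at the grid point nearest to the query value. A minimal NumPy-only sketch (the helper name `predict_at` is mine, and it assumes `q_ext` has shape `(N_ext, n_draws)`, as `fit["Q_ext"]` returns in PyStan 3):

```python
import numpy as np

def predict_at(h_query, h_ext, q_ext):
    """Posterior predictive summary at the grid point nearest h_query.

    q_ext is assumed to have shape (N_ext, n_draws): the grid
    dimension first, the MCMC draws last."""
    i = int(np.argmin(np.abs(np.asarray(h_ext) - h_query)))
    draws = np.asarray(q_ext)[i]          # all draws at that grid point
    lo, med, hi = np.percentile(draws, [2.5, 50.0, 97.5])
    return {"h": h_ext[i], "median": med, "ci95": (lo, hi)}
```

Calling e.g. `predict_at(2000.0, h_ext, fit["Q_ext"])` then gives the posterior median discharge and a 95% credible interval at the grid point closest to h = 2000.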

You might also want to check ArviZ for diagnostics and MCMC plots:

https://python.arviz.org
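If you go the ArviZ route, `az.from_pystan(posterior=fit)` converts a PyStan 3 fit into an `InferenceData` object that all the ArviZ plotting and summary functions accept. A short sketch of the summary workflow (synthetic draws built with `az.from_dict` stand in for the real fit here so the snippet runs on its own):

```python
import numpy as np
import arviz as az

# With the real fit you would write: idata = az.from_pystan(posterior=fit)
# Here, synthetic draws for a single parameter stand in (4 chains x 500 draws).
rng = np.random.default_rng(0)
idata = az.from_dict(posterior={"a1": rng.normal(20.0, 1.0, size=(4, 500))})

summ = az.summary(idata, var_names=["a1"])
print(summ)  # mean, sd, HDI bounds, ESS and r_hat per parameter
# az.plot_trace(idata) draws the usual trace/density panels
```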