Associate posterior probability checks with covariates - using cmdstanR

kgoldfeld · July 26, 2021, 7:31pm

I am using cmdstanR to estimate a somewhat complicated model, and am trying to do a visual posterior probability check. I know this is a very simple/basic question, but I can’t seem to find the solution. I can generate the predicted values easily in Stan, but I can’t figure out how to match the predictions with the covariates used to generate them. There must be an easy way.

I have a much simpler example using a regression model. Here is the data generation process:

library(simstudy)
library(ggplot2)

d1 <- defData(varname = "x", formula = "0;10", dist="uniformInt")
d1 <- defData(d1, varname = "y", formula = "5 + 6*x - .3*x^2", variance = 2, dist = "normal")

set.seed(1)
dd <- genData(101, d1)

ggplot(data = dd, aes(x=x, y=y)) +
  geom_jitter(height = 0, width = .1, size = 1)

Here is the Stan code:

data {
  int<lower=0> N;
  vector[N] x;
  vector[N] y;
}

parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}


model {
  y ~ normal(alpha + beta*x, sigma);
}

generated quantities {
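  // array[N] real replaces the older real y_rep[N] declaration, which recent Stan versions reject
  array[N] real y_rep = normal_rng(alpha + beta * x, sigma);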
}

And here is the R code, using cmdstanr:

library(cmdstanr)
library(posterior)  # provides as_draws_array()

mod <- cmdstan_model("simple_regression.stan")

fit <- mod$sample(
  data = list(N = nrow(dd), x = dd$x, y = dd$y),
  refresh = 0,
  chains = 2L,
  parallel_chains = 2L,
  iter_warmup = 500,
  iter_sampling = 2500
)

fit$summary()
posterior <- as_draws_array(fit$draws())

The question is: is there an obvious way to attach the predicted values y_rep[1], y_rep[2], etc., to each of the observed values for i = 1, 2, …? I am sure I can write code to do this, but if there is a function that will facilitate this, that would be great. I’d like to generate an interval to overlay on the observed data. (Clearly, in this case, my model should not be a great fit, since the data are quadratic in x and the model is linear.)

mike-lawrence · August 3, 2021, 9:54pm

Might the new rvars format make things easier for you?

kgoldfeld · August 6, 2021, 4:42pm

Thanks so much for that excellent suggestion - it took me a little while to understand what the rvars format is and what it is doing. But as I start to get the hang of it, it is clearly very powerful and super easy to work with. It is exactly what I was looking for. Thanks.

mike-lawrence · August 6, 2021, 4:48pm

If you have time, post the code you end up with; I’m sure others (inc. me!) would find it useful 🙂

kgoldfeld · August 6, 2021, 4:53pm

I am going to do a short blog post on it, and when it is up, I will link to it from here.

kgoldfeld · August 10, 2021, 2:45am

As promised - I have some code (and probably too much description) up on my blog.
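
For readers who want the gist without leaving the thread, here is a minimal sketch of how the rvars format can line predictions up with covariates. It is not necessarily the code from the blog post. The helper name add_ppc_interval and the column names lower, median and upper are made up for illustration. The sketch assumes the draws contain a variable y_rep whose i-th element was generated from row i of the data, as in the Stan program above.

```r
library(posterior)

# Hypothetical helper (not part of cmdstanr or posterior): summarise the
# posterior predictive draws and bind the summaries to the observed rows.
add_ppc_interval <- function(data, draws, variable = "y_rep",
                             probs = c(0.025, 0.5, 0.975)) {
  # as_draws_rvars() collapses y_rep[1], ..., y_rep[N] into one rvar vector
  # of length N; element i lines up with row i of the data.
  y_rep <- as_draws_rvars(draws)[[variable]]
  stopifnot(length(y_rep) == nrow(data))

  # draws_of() returns a (number of draws) x N matrix; take quantiles by column.
  q <- apply(draws_of(y_rep), 2, quantile, probs = probs)

  out <- as.data.frame(data)
  out$lower  <- q[1, ]
  out$median <- q[2, ]
  out$upper  <- q[3, ]
  out
}
```

With the objects from the original post, the call would be pp <- add_ppc_interval(dd, fit$draws("y_rep")). The result keeps x and y next to the interval for each observation, so it can be plotted with, for example, geom_linerange(aes(ymin = lower, ymax = upper)) underneath geom_point().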
