Posterior predictive checking for multinomial logit

Greetings all. I’m trying to figure you how to get posterior predictive checks for mutinomial regression. Here is the code.


title: “MultiNomialReg.stan”
date: “10/16/2020”
output: html_document

knitr::opts_chunk$set(echo = TRUE)
library(rstan)
library(bayesplot)
options(mc.cores = 3)

Read in data

male: (1=male)

ASBG04: # of books in home 1) 0-10, 2) 11-25, 3) 26-100, 4) 101-200 5) More than 200

ASBGSMR: Students motivated to read (4pt, strongly agree to strongly disagree)

ASBR07A: I do well in reading

Canadareg <-read.csv("~/desktop/Canada.csv",header=TRUE) #browse to select data "Canada.csv"
Canadareg <- Canadareg[1:5000,]  # Just to cut down the sample size.
Canadareg[Canadareg==999999]=NA
Canadareg <- na.omit(Canadareg)
male <- ifelse(Canadareg$ITSEX==2,0,1)
Canadareg1 <- subset(Canadareg,select=c(ASBR07A, male, ASBG04,ASBGSMR))
Canadareg1 <- data.frame(Canadareg1)
N <- nrow(Canadareg1)
f <- as.formula("ASBR07A ~ male")
M <- model.matrix(f,Canadareg1)
data.list <- list(N=nrow(M), 
                  K=length(unique(Canadareg1[,1])),
                  D=ncol(M), 
                  x=M, 
                  ASBR07A=as.numeric(Canadareg1[,1]))

modelString <- "

data {
  int <lower=2> K;              // This is 4, the number of outcomes categories
  int <lower=0> N;            
  int <lower=1> D;              // This is the number of columns in the design                                    matrix - 2
  int <lower = 1, upper = K> ASBR07A[N];
  matrix[N, D] x;              // This will be N by 2 matrix of data
}

parameters {
  matrix[D, K] beta;     // This is a 2 x 4 matrix of betas
}


model {
  matrix[N, K] x_beta = x * beta;   //  N x 2  * 2  x 4 
  
  to_vector(beta) ~ normal(0, 5);

  for (i in 1:N)
    ASBR07A[i] ~ categorical_logit(x_beta[i]');
}

generated quantities {
 int<lower=1,upper=K> ASBR07A_rep[N];
  for (i in 1:N){
   ASBR07A_rep[i] = categorical_logit_rng(matrix beta);
 }
}

"

Start estimation

nChains = 3
nIter= 10000
thinSteps = 10
burnInSteps = floor(nIter/2)

ASBR07A = data.list$ASBR07A

MultiNomRegFit = stan(data=data.list,model_code=modelString,chains=nChains,
              iter=nIter,warmup=burnInSteps,thin=thinSteps)

Stan plots

stan_plot(MultiNomRegFit, pars=c("beta"))
stan_trace(MultiNomRegFit,inc_warmup=T, pars=c("beta"))
stan_dens(MultiNomRegFit, pars=c("beta"))
stan_ac(MultiNomRegFit, pars=c("beta"))

These plots show good evidence of convergence for all parameters in the model.

Stan output

print(MultiNomRegFit,pars=c("beta"))

Posterior predictive checks

ASBR07A_rep = extract(MultiNomRegFit,"ASBR07A_rep")$ASBR07A_rep
ppc_bars(y=ASBR07A, yrep= ASBR07A_rep)

The error message I am receiving is

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
Variable “matrix” does not exist.
error in ‘model93e3d1c8db5_9c080fd17354e62514567a23c3da5154’ at line 28, column 48

26:  int<lower=1,upper=K> ASBR07A_rep[N];
27:   for (i in 1:N){
28:    ASBR07A_rep[i] = categorical_logit_rng(matrix beta);
                                                   ^
29:  }

Error in stanc(file = file, model_code = model_code, model_name = model_name, :
failed to parse Stan model ‘9c080fd17354e62514567a23c3da5154’ due to the above error.

I have tried other values for categorical_logit_rng(matrix beta), including (beta) and (vector beta) to no avail - same error message.

Thanks in advance for your suggestions.

David

If you’re trying to generate data according to your likelihood:

for (i in 1:N)
    ASBR07A[i] ~ categorical_logit(x_beta[i]');

then use:

int<lower=1,upper=K> ASBR07A_rep[N];
for (i in 1:N){
  ASBR07A_rep[i] = categorical_logit_rng(x_beta[i]');
}

The docs for this one are over here (and some of the functions can be pretty picky so always worth checking): https://mc-stan.org/docs/2_25/functions-reference/categorical-distribution.html

That work?

Thanks, but I’m afraid it didn’t work.

Here is the error message

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
Variable "x_beta" does not exist.
 error in 'model254b51e596d_5340eb0229dbdee84ac2f7d20844c9a3' at line 27, column 47
  -------------------------------------------------
    25: int<lower=1,upper=K> ASBR07A_rep[N];
    26: for (i in 1:N){
    27:   ASBR07A_rep[i] = categorical_logit_rng(x_beta[i]');
                                                      ^
    28: }
  -------------------------------------------------

Error in stanc(file = file, model_code = model_code, model_name = model_name,  : 
  failed to parse Stan model '5340eb0229dbdee84ac2f7d20844c9a3' due to the above error.

Here is the code,

modelString <- "

data {
  int <lower=2> K;              // This is 4, the number of outcomes categories
  int <lower=0> N;            
  int <lower=1> D;              // This is the number of columns in the design                                    matrix - 2
  int <lower = 1, upper = K> ASBR07A[N];
  matrix[N, D] x;              // This will be N by 2 matrix of data
}

parameters {
  matrix[D, K] beta;     // This is a 2 x 4 matrix of betas
}


model {
  matrix[N, K] x_beta = x * beta;   //  N x 2  * 2  x 4 
  
  to_vector(beta) ~ normal(0, 5);

  for (i in 1:N)
    ASBR07A[i] ~ categorical_logit(x_beta[i]');
}
generated quantities {
int<lower=1,upper=K> ASBR07A_rep[N];
for (i in 1:N){
  ASBR07A_rep[i] = categorical_logit_rng(x_beta[i]');
}
}
"

Any variables declared in the model block can only used within the model block. If you want to be able to use transformations in both the model and generated quantities block, then you need to use transformed parameters:

modelString <- "

data {
  int <lower=2> K;              // This is 4, the number of outcomes categories
  int <lower=0> N;            
  int <lower=1> D;              // This is the number of columns in the design                                    matrix - 2
  int <lower = 1, upper = K> ASBR07A[N];
  matrix[N, D] x;              // This will be N by 2 matrix of data
}

parameters {
  matrix[D, K] beta;     // This is a 2 x 4 matrix of betas
}
transformed parameters {
  matrix[N, K] x_beta = x * beta;   //  N x 2  * 2  x 4 
}

model {
  to_vector(beta) ~ normal(0, 5);

  for (i in 1:N)
    ASBR07A[i] ~ categorical_logit(x_beta[i]');
}
generated quantities {
int<lower=1,upper=K> ASBR07A_rep[N];
  for (i in 1:N){
    ASBR07A_rep[i] = categorical_logit_rng(x_beta[i]');
  }
}
"
1 Like