Greetings all. I’m trying to figure you how to get posterior predictive checks for mutinomial regression. Here is the code.
title: “MultiNomialReg.stan”
date: “10/16/2020”
output: html_document
knitr::opts_chunk$set(echo = TRUE)
library(rstan)
library(bayesplot)
options(mc.cores = 3)
Read in data
male: (1=male)
ASBG04: # of books in home 1) 0-10, 2) 11-25, 3) 26-100, 4) 101-200 5) More than 200
ASBGSMR: Students motivated to read (4pt, strongly agree to strongly disagree)
ASBR07A: I do well in reading
Canadareg <-read.csv("~/desktop/Canada.csv",header=TRUE) #browse to select data "Canada.csv"
Canadareg <- Canadareg[1:5000,] # Just to cut down the sample size.
Canadareg[Canadareg==999999]=NA
Canadareg <- na.omit(Canadareg)
male <- ifelse(Canadareg$ITSEX==2,0,1)
Canadareg1 <- subset(Canadareg,select=c(ASBR07A, male, ASBG04,ASBGSMR))
Canadareg1 <- data.frame(Canadareg1)
N <- nrow(Canadareg1)
f <- as.formula("ASBR07A ~ male")
M <- model.matrix(f,Canadareg1)
data.list <- list(N=nrow(M),
K=length(unique(Canadareg1[,1])),
D=ncol(M),
x=M,
ASBR07A=as.numeric(Canadareg1[,1]))
modelString <- "
data {
int <lower=2> K; // This is 4, the number of outcomes categories
int <lower=0> N;
int <lower=1> D; // This is the number of columns in the design matrix - 2
int <lower = 1, upper = K> ASBR07A[N];
matrix[N, D] x; // This will be N by 2 matrix of data
}
parameters {
matrix[D, K] beta; // This is a 2 x 4 matrix of betas
}
model {
matrix[N, K] x_beta = x * beta; // N x 2 * 2 x 4
to_vector(beta) ~ normal(0, 5);
for (i in 1:N)
ASBR07A[i] ~ categorical_logit(x_beta[i]');
}
generated quantities {
int<lower=1,upper=K> ASBR07A_rep[N];
for (i in 1:N){
ASBR07A_rep[i] = categorical_logit_rng(matrix beta);
}
}
"
Start estimation
nChains = 3
nIter= 10000
thinSteps = 10
burnInSteps = floor(nIter/2)
ASBR07A = data.list$ASBR07A
MultiNomRegFit = stan(data=data.list,model_code=modelString,chains=nChains,
iter=nIter,warmup=burnInSteps,thin=thinSteps)
Stan plots
stan_plot(MultiNomRegFit, pars=c("beta"))
stan_trace(MultiNomRegFit,inc_warmup=T, pars=c("beta"))
stan_dens(MultiNomRegFit, pars=c("beta"))
stan_ac(MultiNomRegFit, pars=c("beta"))
These plots show good evidence of convergence for all parameters in the model.
Stan output
print(MultiNomRegFit,pars=c("beta"))
Posterior predictive checks
ASBR07A_rep = extract(MultiNomRegFit,"ASBR07A_rep")$ASBR07A_rep
ppc_bars(y=ASBR07A, yrep= ASBR07A_rep)
The error message I am receiving is
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
Variable “matrix” does not exist.
error in ‘model93e3d1c8db5_9c080fd17354e62514567a23c3da5154’ at line 28, column 48
26: int<lower=1,upper=K> ASBR07A_rep[N];
27: for (i in 1:N){
28: ASBR07A_rep[i] = categorical_logit_rng(matrix beta);
^
29: }
Error in stanc(file = file, model_code = model_code, model_name = model_name, :
failed to parse Stan model ‘9c080fd17354e62514567a23c3da5154’ due to the above error.
I have tried other values for categorical_logit_rng(matrix beta), including (beta) and (vector beta) to no avail - same error message.
Thanks in advance for your suggestions.
David