Operating System: OSX 10.11.6
Interface Version: latest
Compiler/Toolkit: I don’t know, installed Stan with Xcode
Hello,
I am trying to understand how posterior_linpred
works with ordinal models.
I am mostly familiar with brms
and understand the output of the method fitted
, which is the equivalent function of posterior_linpred
for brms
.
For example,
library(brms)
library(rethinking) # for data
data(Trolley)
d <- Trolley
rstan_options (auto_write=TRUE)
options (mc.cores=parallel::detectCores ()) # Run on multiple cores
# takes a few minutes to fit
mod.male.contact <- brm(response ~ male*contact, data=d,
family=cumulative("logit"), threshold="flexible",
prior=set_prior("normal(0,10)", class='b'),
warmup = 1000, iter = 2000, chains = 2)
newdata_to_predict <- data.frame(
male = c(0,0,1,1),
contact = c(0,1,0,1)
)
# equivalent to posterior_linpred in brms
fitted(mod.male.contact, newdata_to_predict, scale='response')
# Sample output:
# Gives probabilities for each ordinal response.
#
# , , P(Y = 1)
#
# Estimate Est.Error 2.5%ile 97.5%ile
# 1 0.14739169 0.004612668 0.13865579 0.15633546
# 2 0.21814262 0.010689809 0.19683522 0.23901573
# 3 0.08718669 0.003023406 0.08156363 0.09314114
# 4 0.16005079 0.008045255 0.14549309 0.17686909
#
# , , P(Y = 2)
#
# Estimate Est.Error 2.5%ile 97.5%ile
# 1 0.10261588 0.003504869 0.09565658 0.10934455
# 2 0.13158680 0.005212364 0.12163682 0.14195957
# 3 0.06834966 0.002600132 0.06340807 0.07339157
# 4 0.10860616 0.004753855 0.09939101 0.11784792
# ... continues P(Y=y) for y = 1, 2, ... , 7
Here, I understand the output. For each set of predictor values, I get the mean probability and percentiles for each possible response 1-7.
However in rstanarm
, I am uncertain how I should interpret the output of posterior_linpred
:
library(rstanarm)
d$response <- as.factor(d$response)
# not sure if right syntax or model. Still new to rstanarm
mod.arm <- stan_polr(response ~ male*contact, data=d,
prior = R2(0.25), prior_counts = dirichlet(1))
posterior_linpred(mod.arm, transform=TRUE, newdata = newdata_to_predict)
# Sample output:
#
# iterations 1 2 3 4
# [1,] 0.5 0.3892393 0.6464041 0.4612909
# [2,] 0.5 0.3603153 0.6361793 0.4886859
# [3,] 0.5 0.3733287 0.6429214 0.4761588
# [4,] 0.5 0.3792128 0.6512925 0.4619251
# [5,] 0.5 0.3711230 0.6322236 0.4860978
# [6,] 0.5 0.3931835 0.6462488 0.4654676
# ..... more rows
I see a column for each set of predictors, but I’m not certain how to interpret the probabilities. With an ordinal model I should have a unique set of probabilities for each response. How do I recover that from the output of posterior_linpred
?
So I see two potential problems.
-
I’m not fitting the model right. The
rstanarm
model is something slightly different than thebrms
model. Although they both have identical summaries. -
I’m not using the right function. Maybe I need something different than
posterior_linpred
?
Thanks for help! The code should be reproducible. Let me know if there are any troubles.