I’m using RStan to fit a two-parameter IRT model (similar to the one in the manual) to ordinal data, so I’m using an ordered logistic likelihood.
I only need a MAP estimate, so I’m using optimizing() for speed. However, the same problems happen with sampling(); it just takes longer.
The optimizer converges well on my laptop. But I’m also working on an Ubuntu 18.04 cloud instance, and it worked well there until I had to reinstall rstan, due to a change to the Ubuntu package repository. I now install it from source.
The same model, with the same data, fails to converge on the cloud instance, and fails to mix when I use sampling(). To reiterate: it runs fine on my laptop. When I use a normal likelihood on the cloud instance, the fit converges.
This is sessionInfo() for my laptop:
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] gsubfn_0.7 proto_1.0.0 stringdist_0.9.5.5 stringr_1.4.0 rstan_2.19.3
[6] StanHeaders_2.21.0-1 cowplot_1.0.0 ggplot2_3.3.0 data.table_1.12.8
loaded via a namespace (and not attached):
[1] Rcpp_1.0.4.6 pillar_1.4.3 compiler_3.6.0 prettyunits_1.1.1 tools_3.6.0
[6] digest_0.6.25 packrat_0.5.0 pkgbuild_1.0.6 lifecycle_0.2.0 tibble_3.0.0
[11] gtable_0.3.0 pkgconfig_2.0.3 rlang_0.4.5 cli_2.0.2 rstudioapi_0.11
[16] parallel_3.6.0 yaml_2.2.1 xfun_0.13 loo_2.2.0 gridExtra_2.3
[21] withr_2.1.2 dplyr_0.8.5 knitr_1.28 vctrs_0.2.4 stats4_3.6.0
[26] grid_3.6.0 tidyselect_1.0.0 glue_1.4.0 inline_0.3.15 R6_2.4.1
[31] processx_3.4.2 fansi_0.4.1 farver_2.0.3 purrr_0.3.3 callr_3.4.3
[36] magrittr_1.5 codetools_0.2-16 matrixStats_0.56.0 ps_1.3.2 scales_1.1.0
[41] ellipsis_0.3.0 assertthat_0.2.1 colorspace_1.4-1 labeling_0.3 stringi_1.4.6
[46] munsell_0.5.0 crayon_1.3.4
This is sessionInfo() for the cloud instance:
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] gsubfn_0.7 proto_1.0.0 stringdist_0.9.5.5 stringr_1.4.0 rstan_2.19.3 StanHeaders_2.19.2 ggplot2_3.3.0 data.table_1.12.8
loaded via a namespace (and not attached):
[1] Rcpp_1.0.4.6 pillar_1.4.3 compiler_3.6.0 prettyunits_1.1.1 tools_3.6.0 packrat_0.5.0 pkgbuild_1.0.6 lifecycle_0.2.0 tibble_3.0.1
[10] gtable_0.3.0 pkgconfig_2.0.3 rlang_0.4.5 cli_2.0.2 rstudioapi_0.11 parallel_3.6.0 loo_2.2.0 gridExtra_2.3 withr_2.2.0
[19] dplyr_0.8.5 vctrs_0.2.4 stats4_3.6.0 grid_3.6.0 tidyselect_1.0.0 glue_1.4.0 inline_0.3.15 R6_2.4.1 processx_3.4.2
[28] fansi_0.4.1 tcltk_3.6.0 purrr_0.3.4 callr_3.4.3 magrittr_1.5 codetools_0.2-16 matrixStats_0.56.0 scales_1.1.0 ps_1.3.2
[37] ellipsis_0.3.0 assertthat_0.2.1 colorspace_1.4-1 stringi_1.4.6 munsell_0.5.0 crayon_1.3.4
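The two listings are long, so a small base-R sketch like the one below may help spot which package versions differ between the machines. The helper name `diff_versions` is made up for illustration, and the two vectors copy only a handful of entries by hand from the listings above; whether any of the differences it reports is related to the convergence problem is an open question.

```r
# Hypothetical helper: report packages whose versions differ between two machines.
diff_versions <- function(a, b) {
  shared <- intersect(names(a), names(b))
  changed <- shared[a[shared] != b[shared]]
  data.frame(
    package = changed,
    laptop = unname(a[changed]),
    cloud = unname(b[changed]),
    stringsAsFactors = FALSE
  )
}

# A few entries copied by hand from the two sessionInfo() listings above.
laptop <- c(rstan = "2.19.3", StanHeaders = "2.21.0-1", Rcpp = "1.0.4.6",
            tibble = "3.0.0", withr = "2.1.2", purrr = "0.3.3")
cloud <- c(rstan = "2.19.3", StanHeaders = "2.19.2", Rcpp = "1.0.4.6",
           tibble = "3.0.1", withr = "2.2.0", purrr = "0.3.4")

version_diff <- diff_versions(laptop, cloud)
print(version_diff)
```

On a live session, the same comparison could be fed from `packageVersion()` calls on each machine instead of hand-copied strings.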
And this is the model:
data {
  int<lower=1> J; // number of participants
  int<lower=1> K; // number of questions
  int<lower=1> N; // number of observations
  int<lower=1,upper=J> jj[N]; // participant for observation n
  int<lower=1,upper=K> kk[N]; // question for observation n
  int y1[N]; // rating 1 for observation n
  int y2[N]; // rating 2 for observation n
}
parameters {
  real mu; // grand average
  vector[J] alpha; // non-scaled intercept of participants
  vector[K] beta; // relative usefulness level of questions
  vector<lower=0>[J] gamma; // sensitivity of participant
  real mu_gamma; // intercept of log sensitivities
  real<lower=0> sigma_alpha; // scale of participant intercepts
  real<lower=0> sigma_gamma; // scale of log sensitivity
  ordered[6] c1; // cutpoints for the ordinal regression on rating 1
  ordered[6] c2; // cutpoints for the ordinal regression on rating 2
}
transformed parameters {
  vector[N] pred;
  pred = sigma_alpha * alpha[jj] + gamma[jj] .* beta[kk] + mu;
}
model {
  alpha ~ std_normal();
  beta ~ std_normal();
  gamma ~ lognormal(mu_gamma, sigma_gamma);
  mu ~ std_normal();
  mu_gamma ~ std_normal();
  sigma_alpha ~ std_normal();
  sigma_gamma ~ std_normal();
  // Likelihood
  y1 ~ ordered_logistic(pred, c1);
  y2 ~ ordered_logistic(pred, c2);
}
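For anyone who wants to check the likelihood outside Stan, here is a rough base-R sketch of the category probabilities that an ordered logistic likelihood assigns for one observation, given a linear predictor and six ordered cutpoints (so seven rating categories, as in the model above). The function name `ordered_logistic_probs` and all numeric values are made up for illustration; they are not taken from my data or fit.

```r
# Hypothetical helper: category probabilities under an ordered logistic model
# with linear predictor eta and strictly increasing cutpoints cuts.
ordered_logistic_probs <- function(eta, cuts) {
  stopifnot(!is.unsorted(cuts, strictly = TRUE))
  # P(y > k) = plogis(eta - cuts[k]); pad with P(y > 0) = 1 and P(y > K) = 0.
  exceed <- c(1, plogis(eta - cuts), 0)
  # Differences of adjacent exceedance probabilities give P(y = k).
  -diff(exceed)
}

# Made-up values for illustration only (six cutpoints give seven categories).
cuts <- c(-2.5, -1.5, -0.5, 0.5, 1.5, 2.5)
probs <- ordered_logistic_probs(eta = 0.3, cuts = cuts)
print(round(probs, 3))

# Log-likelihood contribution of a single observed rating y = 4.
log_lik_one <- log(probs[4])
```

Running the same calculation on both machines with parameter values taken from a fit would be one way to see whether the likelihood itself evaluates identically.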
Any help much appreciated!