Stan_mvmer error when using 4 outcomes, ok for 3 or fewer


#1

I have four outcomes with four formulas to be modeled in a multivariate way. Running stan_mvmer with all four formulas results in the error:
Error in rep(list(0L), pad_length - length(ret)) :
invalid ‘times’ argument
If I comment out any one or two of the four formulas (and the corresponding data frames) then it runs fine. Is there a three-outcome limit to stan_mvmer?


#2

Not deliberately


#3

Ok, in that case, here is a simple script that reproduces the error:

library(rstanarm)
df = mtcars
# 3-outcome works:
fit = stan_mvmer(formula=list(mpg  ~ (1|cyl), disp ~ (1|cyl), qsec ~ (1|cyl)),
                 data=df, iter=1000, seed=1, chains=1, cores=1, open_progress=FALSE)
# 4-outcome doesn't work:
fit = stan_mvmer(formula=list(mpg  ~ (1|cyl), disp ~ (1|cyl), hp ~ (1|cyl), qsec ~ (1|cyl)),
                 data=df, iter=1000, seed=1, chains=1, cores=1, open_progress=FALSE)

#4

The stan_jm.fit function that is called by stan_mvmer has some hard coded 3s. See for instance

  # Dimensions
  standata$has_aux <- 
    fetch_array(y_mod, "has_aux", pad_length = 3)
  standata$resp_type <- 
    fetch_array(y_mod, "y", "resp_type", pad_length = 3)
  standata$intercept_type <- 
    fetch_array(y_mod, "intercept_type", "number", pad_length = 3)
  standata$yNobs <- 
    fetch_array(y_mod, "x", "N", pad_length = 3)
  standata$yNeta <- 
    fetch_array(y_mod, "x", "N", pad_length = 3) # same as Nobs for stan_mvmer
  standata$yK <- 
    fetch_array(y_mod, "x", "K", pad_length = 3)

fetch_array is a wrapper for fetch which has this bit of code.

  if (!is.null(pad_length)) {
    padding <- rep(list(0L), pad_length - length(ret))
    ret <- c(ret, padding)

And that looks a lot like the error message. My guess is that pad_length = 3 and length(ret) = 4 with the 4-outcome model and rep is not happy with times = -1


#5

Actually, yeah there is a hard limit on the number of outcomes.

stan_mvmer (and stan_jm) are limited to a maximum of 3 outcomes, and maximum of 2 grouping factors. It appears I included a stop() for the latter (see here) but not for the former.

Hard coding these massively sped up the code. Without it, the design matrices for the separate outcomes needed to be diagonally bound and then either split up later when constructing the linear predictor for the model (which may be reasonable), or the linear predictor needed to be constructed using the single block diagonal design matrix (which is slow). If the sparse matrix multiplication in Stan is sped up in the future then it might be reasonable to relax this restriction, so that the user can specify any number of outcomes & grouping factors, but for now I thought the speed advantages seemed too great (particularly for stan_jm, which can be slow in any case).