Function to specify initial values in brms model has dimension mismatch

#1

Hi,

I am trying to fit a diffusion model to reaction time / response data using brms with the wiener family from the rwiener package. I’ve been mostly following the excellent series of blog posts by Hendrik Singmann to get started.

I encounter a lot of problems getting the model to sample, which seem to largely be because the ndt parameter limits the possible values of all other parameters, leading to some nasty posterior geometries. Mostly, chains fail to initialise after the default number of attempts, but if sampling does start then it’s very slow, with lots of rejected steps.

To solve the initialisation problems I’m trying to follow the strategy used by Hendrik, by providing valid initial values to all parameters in all chains, but I encounter an error after compilation. Briefly, after specifying the model and priors, Hendrik uses the make_standata function from brms to generate a valid dataset for the model, then uses this dataset to recover the dimensions of the parameter vectors:

tmp_dat <- make_standata(model_formula,
                         family = wiener(link_bs = "log",
                                         link_ndt = "log",
                                         link_bias = "logit"),
                         data = dat, prior = priors)
# str(tmp_dat, 1, give.attr = FALSE)
# List of 34
# $ N          : int 9492
# $ Y          : num [1:9492(1d)] 3.66 5.85 5.04 5.26 3.52 ...
# $ dec        : num [1:9492(1d)] 1 0 1 1 1 0 0 0 0 0 ...
# $ K          : int 8
# $ X          : num [1:9492, 1:8] 1 1 1 1 1 1 1 1 1 1 ...
# $ Z_1_1      : num [1:9492(1d)] 1 1 1 1 1 1 1 1 1 1 ...
# $ Z_2_1      : num [1:9492(1d)] 1 1 1 1 1 1 1 1 1 1 ...
# $ Z_2_2      : num [1:9492(1d)] -1 1 -1 -1 -1 -1 1 1 1 1 ...
# $ Z_2_3      : num [1:9492(1d)] 1 1 1 1 1 1 1 1 -1 -1 ...
# $ Z_2_4      : num [1:9492(1d)] 1 1 1 1 1 1 1 1 1 1 ...
# $ Z_2_5      : num [1:9492(1d)] -1 1 -1 -1 -1 -1 1 1 -1 -1 ...
# $ Z_2_6      : num [1:9492(1d)] -1 1 -1 -1 -1 -1 1 1 1 1 ...
# $ Z_2_7      : num [1:9492(1d)] 1 1 1 1 1 1 1 1 -1 -1 ...
# $ Z_2_8      : num [1:9492(1d)] -1 1 -1 -1 -1 -1 1 1 -1 -1 ...
# $ K_bs       : int 2
# $ X_bs       : num [1:9492, 1:2] 1 1 1 1 1 1 1 1 1 1 ...
# $ Z_2_bs_9   : num [1:9492(1d)] 1 1 1 1 1 1 1 1 1 1 ...
# $ Z_2_bs_10  : num [1:9492(1d)] 1 1 1 1 1 1 1 1 1 1 ...
# $ K_ndt      : int 1
# $ X_ndt      : num [1:9492, 1] 1 1 1 1 1 1 1 1 1 1 ...
# $ Z_2_ndt_11 : num [1:9492(1d)] 1 1 1 1 1 1 1 1 1 1 ...
# $ K_bias     : int 2
# $ X_bias     : num [1:9492, 1:2] 1 1 1 1 1 1 1 1 1 1 ...
# $ Z_2_bias_12: num [1:9492(1d)] 1 1 1 1 1 1 1 1 1 1 ...
# $ Z_2_bias_13: num [1:9492(1d)] 1 1 1 1 1 1 1 1 1 1 ...
# $ J_1        : int [1:9492(1d)] 31 13 53 36 44 39 24 17 63 90 ...
# $ N_1        : int 255
# $ M_1        : int 1
# $ NC_1       : num 0
# $ J_2        : int [1:9492(1d)] 1 1 1 1 1 1 1 1 1 1 ...
# $ N_2        : int 20
# $ M_2        : int 13
# $ NC_2       : num 78
# $ prior_only : int 0

initfun <- function(chain_id = 1) {
  list(
    b = rnorm(tmp_dat$K),  
    b_bs = runif(tmp_dat$K_bs, -1, -0.5),  # log scale
    b_ndt = runif(tmp_dat$K_ndt, -1.5, -0.5),  # log scale
    b_bias = rnorm(tmp_dat$K_bias, -0.05, 0.05),  # logit scale
    sd_1 = runif(tmp_dat$M_1, 0.01, 0.5), # im filename
    sd_2 = runif(tmp_dat$M_2, 0.01, 0.5), # subj init
    z_1 = matrix(rnorm(tmp_dat$M_1*tmp_dat$N_1, 0, 0.01),
                 tmp_dat$M_1, tmp_dat$N_1),
    z_2 = matrix(rnorm(tmp_dat$M_2*tmp_dat$N_2, 0, 0.01),
                 tmp_dat$M_2, tmp_dat$N_2),
    L_1 = diag(tmp_dat$M_1),
    L_2 = diag(tmp_dat$M_2)
  )
}

Then initfun is passed to the inits argument of brm.

I receive the following error:

Compiling the C++ model
Start sampling
starting worker pid=373 on localhost:11387 at 10:48:23.123
starting worker pid=384 on localhost:11387 at 10:48:23.457

SAMPLING FOR MODEL 'bd6f1efcaf83b4a757628a38dc94dd61' NOW (CHAIN 1).
Chain 1: Unrecoverable error evaluating the log probability at the initial value.
Chain 1: mismatch in dimension declared and found in context; processing stage=initialization; variable name=b; position=0; dims declared=(7); dims found=(8)
[1] "Error in sampler$call_sampler(args_list[[i]]) : "                                                                                                      
[2] "  mismatch in dimension declared and found in context; processing stage=initialization; variable name=b; position=0; dims declared=(7); dims found=(8)"
error occurred during calling the sampler; sampling not done

I can run Hendrik’s code (see blog linked above) with no problem. I think probably what is happening is that Hendrik uses a no-intercept parameterisation of his model, whereas my model formula uses an intercept. I know that brms treats the intercept slightly differently, so I’m wondering whether this is the source of the problem. I’m implementing a no-intercept version of my model now, but any suggestions as to how to get around this in the meantime are appreciated.

  • Operating System:
    [Docker container]
    R version 3.5.2 (2018-12-20)
    Platform: x86_64-pc-linux-gnu (64-bit)
    Running under: Debian GNU/Linux 9 (stretch)

  • brms Version: 2.7.1

#2

In models with an intercept on ndt (log-scale), I specify initial values only on this intercept:

chains <- 4
inits_drift <- list(temp_ndt_Intercept = -3)
inits_drift <- replicate(chains, inits_drift, simplify = FALSE)

Combined with init_r = 0.05 this was sufficient for all cases I have tried so far. For non-intercept models, you may use a different approach but the goal is to have some small initial values for the ndt.

#3

Thanks for the reply Paul. Passing inits_drift to brm(inits = inits_drift, ...) seems to allow sampling to start!

Could you add a sentence more to help me understand what you’re doing there? I don’t know what the init_r argument is (doesn’t seem to be in brm or stan). Is temp_ndt_Intercept a special parameter name recognised by brm?

Aside: I tried specifying my model with no intercept as in Hendrik’s case and I get the same error, so whatever is causing the dimension mismatch is something in my model, not the use of an intercept per se.

#4

Temporary intercepts are intercepts if all predictors are centered around zero. This is explained somewhere in ?bf and ?set_prior.

temp_ndt_Intercept is just the temporary intercept of the non-decision time.

#5

Ok, thanks Paul.