Trial-by-Trial Boundaries in a BRMS Mixture Model with Truncation


#1

Hey Everyone!

I am trying to fit a mixture model combining a custom uniform distribution with a Gaussian distribution. I can do that just fine, but the dependent measure is unusual in that it is bounded, with the precise upper and lower limits varying from trial to trial. The reason for this is that the measure itself can be conceptualized as the difference between a hidden reference point and some other point selected by a research participant on a line of a set length. For the sake of argument, let’s say the line ranges from 0 to 100. On any given trial, it is assumed that the response will be sampled from either a uniform distribution (participants are guessing where the reference point is) or Gaussian distribution (participants know roughly where the point is but there is some variability in their response). Importantly, if the reference point is 10 for a given trial the difference between reference and the point selected by the participant can range from -10 to 90; if the reference point is 90 the difference between these points can range from -90 to 10.

My question is whether there is a simple way to incorporate this information into my model without dropping into base Stan code. I could do it in Stan code, but it would be better if I did not need to do so. I am using a custom distribution to implement the uniform component of the mixture model and imagine any solution would involve generating a custom Gaussian as well (as in Getting around truncation limitation for mixture models in brms).

Operating System: MacOS 10.14
brms Version: 2.7.0


#2

Getting truncation to work for custom families requires you to also provide the _lcdf and _lccdf functions of your custom family. Once that’s done, truncation should work I guess. But it is very well possible that I did not entirely understand your problem.


#3

Thanks for the tip, Paul!
My last post was vague because I was still unsure how to approach the problem, but I have made some progress since. I had been reading somewhere that standard truncation notation was not implemented for mixture models, but that is either outdated or untrue. As you say, once I implemented the _lcdf, the standard notation seems to work. Strangely, I did not need to implement the _lccdf to get it to run, but I have done so now. That code reads as follows and is based on formulae I got from wikipedia:

uniform_tones <- custom_family(
  "uniform_tones",
  type = "real",
  vars = c('lb[n]', 'ub[n]')
)

stan_funs <- "
  real uniform_tones_lpdf(real y, real mu, real lb, real ub) {
    return -log(ub - lb);
  }
  
  real uniform_tones_lcdf(real y, real mu, real lb, real ub) {
    if(y < lb)
    {
       return(-7); // Returning log(.001) to avoid log(0)
    }
    if(y >= ub)
    {
       return(0); // Returning log(1)
    }
    
    return log((y - lb) / (ub - lb));
  }
  
  real uniform_tones_lccdf(real y, real mu, real lb, real ub) {
    if(y > ub)
    {
       return(-7); // Returning log(.001) to avoid log(0)
    }
    if(y <= lb)
    {
       return(0); // Returning log(1)
    }
    
    return log(1 - ((y - lb) / (ub - lb)));
  }
"

Appended below is a csv file containing sample data and an R file containing some code depicting my current dilemma. This code fits a mixture model with a normal and uniform component. I attempt to model both theta (proportion of normal trials) and sigma (standard deviation of the normal component) as a function of several predictors (x1 through x4) – including one interaction (x1 * x2) and a random intercept (subject_number). The code is here:

full_model = bf(diff | trunc(lb=lb, ub=ub) ~ 1, 
                sigma1 ~ x1 * x2 + x3 + x4 + (1 | P | subject_number), 
                theta1 ~ x1 * x2 + x3 + x4 + (1 | P | subject_number),
                mu1 ~ 0,
                mu2 ~ 0)

m1c <- brm( full_model, data = dat, chains = 4, 
            iter = 100, warmup = 50, cores=4,
            family=mix, stanvars = stanvars,
            prior = m1.prior)

As indicated in my earlier post, the variable being modelled (diff) is a difference between a point on a target line and a point selected by the participant on that trial. Because we are using a line, the range of values a participant could select on a given trial depends on where the target point occurs (i.e., if it is near the lefthand side of the line, participants can’t undershoot by as much since they can’t go much farther left). In terms of modelling assumptions, it is assumed that the mean of the normal distribution is 0 (meaning that – on average – they select the correct point, with some unknown variance). I set the mean of the uniform to 0 also, but it is not used anywhere so I don’t think that matters. I am trying to use truncation to deal with the upper and lower boundaries but plan on playing with censoring as an alternative later on.

So, the problem I am running into is twofold. If I remove truncation, the model fits just fine. However, once I add in truncation – it runs into immediate problems finding starting values, throwing errors like:

Rejecting initial value:
Gradient evaluated at the initial value is not finite.
Stan can't start sampling from this initial value.

Sometimes, it does find starting values that work for a given chain – sometimes even for all chains – but it then returns a number of divergent transitions equal to the number of post warm-up samples even with adapt_delta = .99. The problem emerges as I add more predictors and becomes particularly intractable once the random intercept is added. I must be messing up the implementation in some way that I can’t quite put my finger on how.

Any advice would be most appreciated!

dat.csv (383.5 KB)

sample_code.R (3.0 KB)

Operating System: MacOS 10.14
brms Version: 2.7.0


#4

I implemented this functionality a few months ago. Glad that it works now! Indeed, you just need the _lcdf to get truncation working if you have both upper and lower bounds.