Limit range of response variable

Hi!

I really enjoy all the features of the brms package!

Is there a way to constrain models fitted with brms so that the response can only take values within the interval [0, 100] (the responses are probability ratings given by the participants)? Ideally the constraint would apply when the parameters are estimated, or failing that, when generating posterior predictions from the fitted models.

In modelling probability ratings of the participants, I am fitting a cluster of brms models.
One might look like this:

q1 <- brm(value ~ Relevance * DV * Exp
          + (Relevance * DV | lfdn) + (Relevance * DV | le_nr),
          data = dw_DV_exp,
          prior = c(set_prior("normal(0,10)", class = "b")),
          cores = 4, iter = 20000,
          save_all_pars = TRUE,
          sample_prior = TRUE,
          control = list(adapt_delta = 0.99, stepsize = 0.01, max_treedepth = 15))

Afterwards, I compare the models etc. and calculate posterior predictions with:

newdata <- data.frame(DV = c("DVa", "DVa",
                             "DVa", "DVa",
                             "DVb", "DVb",
                             "DVb", "DVb",
                             "DVc", "DVc",
                             "DVc", "DVc"),
                      Relevance = c("PO", "PO",
                                    "IR", "IR",
                                    "PO", "PO",
                                    "IR", "IR",
                                    "PO", "PO",
                                    "IR", "IR"),
                      Exp = c("Exp1", "Exp2",
                              "Exp1", "Exp2",
                              "Exp1", "Exp2",
                              "Exp1", "Exp2",
                              "Exp1", "Exp2",
                              "Exp1", "Exp2"),
                      lfdn = rep(146, 12))

y_rep_1 <- posterior_predict(q1, newdata, allow_new_levels = TRUE)

y_rep_1 is now a matrix containing 40000 rows with posterior predictions for the 12 conditions.
The problem is that y_rep_1 contains values within the interval [-130, 230] whereas the responses
that I am modelling are constrained to only take values within the interval [0,100].

Thanks!
Niels

Ps. this is a repost from GitHub; I can delete the thread there

  • Operating System: Windows 10

I would recommend dividing the outcomes by 100 and using a beta likelihood. If some values are exactly 0 or 100, then the beta distribution is not very appropriate, but neither is the truncated normal you are suggesting.
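
A minimal sketch of that rescaling step, using made-up ratings (the vector `value` below is hypothetical, not the data from this thread). It divides by 100 and counts the boundary values, since the beta density is only defined on the open interval (0, 1):

```r
# Hypothetical ratings on the 0-100 scale (not the original data)
value <- c(0, 12.5, 50, 87, 100)

# Rescale to the unit interval
value01 <- value / 100

# The beta likelihood needs 0 < y < 1, so count the offending observations
n_zero <- sum(value01 == 0)
n_one  <- sum(value01 == 1)
cat("zeros:", n_zero, " ones:", n_one, "\n")

# If both counts are 0, the rescaled response could be passed to brm()
# with family = Beta(); otherwise the boundary values need handling first.
```

Posterior predictions from a model fitted on the rescaled response would then be on the 0 to 1 scale, so multiplying them by 100 should bring them back to the rating scale.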

Thanks for the reply!

Is this something like what you had in mind?

q1b <- brm(value ~ Relevance * DV * Exp
           + (Relevance * DV | lfdn) + (Relevance * DV | le_nr),
           family = Beta(), inits = 0,
           data = dw_DV_exp,
           prior = c(set_prior("normal(0,10)", class = "b")),
           cores = 4, iter = 20000,
           save_all_pars = TRUE,
           sample_prior = TRUE,
           control = list(adapt_delta = 0.99, stepsize = 0.01, max_treedepth = 15))

(perhaps the priors on the beta coefficients should be changed as well then?)

Since I do have zero values, I get the following error message:
“Error: Family ‘beta’ requires response greater than 0.”

I suppose adding a small constant to the zeros would be inappropriate?

What is the recommended approach?

Thanks!

There isn’t a great solution. You could use this R function to push all the values into the interior of the [0,1] interval (so apply it after dividing the ratings by 100):

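# Squeeze values from the closed interval [0, 1] into the open interval (0, 1).
# n is the number of values in y; inverse = TRUE undoes the transformation.
transform <-
function (y, inverse = FALSE) 
{
    n <- length(y)
    if (inverse) 
        (y * n - 0.5)/(n - 1)
    else (y * (n - 1) + 0.5)/n
}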
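
As a rough illustration of how the function above behaves, here is a small round trip on made-up ratings. The name `squeeze01` and the vector `rating` are hypothetical; the formula is the same as above, only renamed so that it does not mask base R's `transform()`:

```r
# Same formula as the function above, under a hypothetical name
squeeze01 <- function(y, inverse = FALSE) {
  n <- length(y)
  if (inverse) {
    (y * n - 0.5) / (n - 1)
  } else {
    (y * (n - 1) + 0.5) / n
  }
}

rating <- c(0, 25, 50, 100)   # hypothetical ratings on the 0-100 scale
y      <- rating / 100        # rescale to [0, 1] first
y_sq   <- squeeze01(y)        # now strictly inside (0, 1)
print(y_sq)                   # values: 0.125 0.3125 0.5 0.875

# Undo the squeeze and return to the 0-100 scale
back <- squeeze01(y_sq, inverse = TRUE) * 100
```

Note that `n` is taken from the length of whatever is passed in. Back-transforming an object of a different length, such as a matrix of posterior predictions, would therefore use a different `n` than the forward step; in that case it is probably safer to pass the original sample size explicitly.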

You can also try out the zero_inflated_beta family. See ?brmsfamily for details.
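
One thing to check before switching families: as far as I know, zero_inflated_beta accepts exact zeros but not exact ones, while brms also provides zero_one_inflated_beta for responses that hit both boundaries. A small sketch of that decision, where `choose_family` is a hypothetical helper (not part of brms) and the input is assumed to be already rescaled to [0, 1]:

```r
# Hypothetical helper: pick a brms family name from the boundary values
# present in a response that has already been rescaled to [0, 1]
choose_family <- function(y01) {
  stopifnot(all(y01 >= 0 & y01 <= 1))
  has_zero <- any(y01 == 0)
  has_one  <- any(y01 == 1)
  if (has_one) {
    "zero_one_inflated_beta"   # exact ones (ratings of 100) are present
  } else if (has_zero) {
    "zero_inflated_beta"       # only exact zeros are present
  } else {
    "beta"                     # everything is strictly inside (0, 1)
  }
}

choose_family(c(0, 0.3, 0.8))   # zeros only
choose_family(c(0, 0.3, 1))     # zeros and ones
```

The returned string could then be supplied as the `family` argument of `brm()`, with the response divided by 100 beforehand.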
