Limit range of response variable


#1

Hi!

I really enjoy all the features of the brms package!

Is there a way to constrain models fitted with brms so that the response can only take values within the interval [0,100] (the responses are probability ratings given by the participants)? The constraint could apply either when the parameters are estimated or when posterior predictions are generated from the fitted models.

To model the participants' probability ratings, I am fitting a set of brms models.
One of them looks like this:

q1 <- brm(value ~ Relevance * DV * Exp
          + (Relevance * DV | lfdn) + (Relevance * DV | le_nr),
          data = dw_DV_exp,
          prior = c(set_prior("normal(0,10)", class = "b")),
          cores = 4, iter = 20000,
          save_all_pars = TRUE,
          sample_prior = TRUE,
          control = list(adapt_delta = 0.99, stepsize = 0.01, max_treedepth = 15))

Afterwards, I compare the models and compute posterior predictions with:

newdata <- data.frame(DV = c("DVa", "DVa",
                             "DVa", "DVa",
                             "DVb", "DVb",
                             "DVb", "DVb",
                             "DVc", "DVc",
                             "DVc", "DVc"),
                      Relevance = c("PO", "PO",
                                    "IR", "IR",
                                    "PO", "PO",
                                    "IR", "IR",
                                    "PO", "PO",
                                    "IR", "IR"),
                      Exp = c("Exp1", "Exp2",
                              "Exp1", "Exp2",
                              "Exp1", "Exp2",
                              "Exp1", "Exp2",
                              "Exp1", "Exp2",
                              "Exp1", "Exp2"),
                      lfdn = rep(146, 12))

y_rep_1 <- posterior_predict(q1, newdata, allow_new_levels = TRUE)

y_rep_1 is now a matrix with 40000 rows of posterior predictions, one column for each of the 12 conditions.
The problem is that y_rep_1 contains values spanning the interval [-130, 230], whereas the responses
that I am modelling can only take values within the interval [0,100].
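
To illustrate where the out-of-range values come from: the model above uses the default gaussian family, and a normal predictive distribution is unbounded. A minimal base-R sketch, with made-up values for the mean and residual SD (not estimates from q1), shows the effect:

```r
# Illustrative only: mu and sigma are invented, not taken from the fitted model.
set.seed(1)
mu    <- 85   # hypothetical predicted mean rating for one condition
sigma <- 30   # hypothetical residual standard deviation
draws <- rnorm(40000, mean = mu, sd = sigma)

# A normal predictive distribution ignores the [0, 100] bounds of the scale.
range(draws)
mean(draws < 0 | draws > 100)  # share of draws outside the rating scale
```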

Thanks!
Niels

P.S. This is a repost from GitHub; I can delete the thread there.

  • Operating System: Windows 10

#2

I would recommend dividing the outcomes by 100 and using a beta likelihood. If some values are exactly 0 or 100, then the beta distribution is not very appropriate, but neither is the truncated normal you are suggesting.
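
A minimal sketch of the idea in base R, assuming the mean/precision form of the beta distribution (shape1 = mu * phi, shape2 = (1 - mu) * phi). The ratings and parameter values are made up for illustration:

```r
set.seed(1)
value <- c(0, 12, 50, 87.5, 100)   # hypothetical ratings on the 0-100 scale
y <- value / 100                   # rescaled to [0, 1]

# The beta distribution has no mass at exactly 0 or 1,
# so boundary values have to be dealt with before fitting.
on_boundary <- y <= 0 | y >= 1
y[on_boundary]

# Draws from a beta distribution with mean mu and precision phi,
# mapped back to the rating scale, stay inside [0, 100].
mu   <- 0.85   # hypothetical mean on the [0, 1] scale
phi  <- 10     # hypothetical precision
pred <- 100 * rbeta(40000, shape1 = mu * phi, shape2 = (1 - mu) * phi)
range(pred)
```

Unlike normal draws, these predictions cannot leave the rating scale, whatever values mu and phi take.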


#3

Thanks for the reply!

Is this something like what you had in mind?

q1b <- brm(value ~ Relevance * DV * Exp
           + (Relevance * DV | lfdn) + (Relevance * DV | le_nr),
           family = Beta(), inits = 0,
           data = dw_DV_exp,
           prior = c(set_prior("normal(0,10)", class = "b")),
           cores = 4, iter = 20000,
           save_all_pars = TRUE,
           sample_prior = TRUE,
           control = list(adapt_delta = 0.99, stepsize = 0.01, max_treedepth = 15))
    

(Perhaps the priors on the beta coefficients should then be changed as well?)

Since I do have zero values, I get the following error message:
“Error: Family ‘beta’ requires response greater than 0.”

I suppose adding a small constant to the zeros would be inappropriate?

What is the recommended approach?

Thanks!


#4

There isn’t a great solution. Once the values are rescaled to [0,1], you could use this R function to push all of them strictly into the interior of the interval:

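transform <-
function (y, inverse = FALSE) 
{
    # n is the number of observations; y is assumed to lie in [0, 1]
    n <- length(y)
    if (inverse) 
        (y * n - 0.5)/(n - 1)   # map squeezed values back to [0, 1]
    else (y * (n - 1) + 0.5)/n  # squeeze [0, 1] into [0.5/n, 1 - 0.5/n]
}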
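
A short usage sketch, with made-up ratings. The same formula is repeated here under a hypothetical name, squeeze01, so that the example runs on its own and does not mask base::transform():

```r
# Same formula as the function above, under a hypothetical name.
squeeze01 <- function(y, inverse = FALSE) {
  n <- length(y)
  if (inverse) (y * n - 0.5) / (n - 1) else (y * (n - 1) + 0.5) / n
}

value <- c(0, 25, 50, 100)   # made-up ratings on the 0-100 scale
y  <- value / 100            # rescale to [0, 1] first
ys <- squeeze01(y)           # 0 becomes 0.5 / n, 1 becomes (n - 0.5) / n
ys

# The inverse uses n = length(y) too, so it only recovers the original
# values when applied to a vector of the same length.
back <- 100 * squeeze01(ys, inverse = TRUE)
back
```

Because n is the length of the vector, the amount of squeezing depends on the sample size: the more observations, the closer the transformed 0s and 1s stay to the boundary.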

#5

You can also try out the zero_inflated_beta family. See ?brmsfamily for details.
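
To make the idea concrete, here is a base-R sketch of what a zero-inflated beta distribution generates: with probability zi the response is exactly 0, otherwise it is a beta draw. The helper name r_zi_beta and all parameter values are hypothetical, chosen only for illustration. Note that this covers exact zeros but not exact ones (ratings of 100 after rescaling); brms also lists a zero_one_inflated_beta family for that case, which is worth checking in ?brmsfamily.

```r
set.seed(1)

# Hypothetical helper: simulate from a zero-inflated beta distribution
# with mean mu and precision phi for the beta part.
r_zi_beta <- function(n, mu, phi, zi) {
  is_zero <- rbinom(n, size = 1, prob = zi) == 1
  y <- rbeta(n, shape1 = mu * phi, shape2 = (1 - mu) * phi)
  y[is_zero] <- 0   # structural zeros replace the beta draws
  y
}

y <- r_zi_beta(40000, mu = 0.6, phi = 10, zi = 0.1)
mean(y == 0)      # close to zi
range(100 * y)    # back on the rating scale, within [0, 100]
```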