How can I write a hurdle ordinal model custom family in brms?

PhDemetri · December 2, 2022, 5:03pm

I’m interested in writing a hurdle ordinal regression model as a brms custom family. I’ve successfully implemented something in Stan using the following model:

data{
  int n;
  int n_condition;
  int condition[n];
  int y[n];
  int n_y;
  int weight[n];
}
transformed data{
  int num_cuts = n_y - 2;
  int K = n_y-1;
}
parameters{
  real<lower=0, upper=1> theta[n_condition];
  ordered[num_cuts] cuts[n_condition];
}
model{
  theta ~ beta(2, 2);
  
  for(nn in 1:n){

  if(y[nn] == max(y)){
    target += weight[nn] * log1m(theta[condition[nn]]);
  }
  else{
    target += weight[nn] *  (log(theta[condition[nn]]) + ordered_logistic_lpmf(y[nn] |  0, cuts[condition[nn]])) ;  
  }
  }
  
}
generated quantities{
  
  array[n_condition, K] real proba;
  array[n_condition, K] real logit_proba;


  
  for(i in 1:n_condition){
    for(k in 1:K){
      if(k==1){
        proba[i, k] = inv_logit(cuts[i, 1]);
      }
      else if(k==K){
        proba[i, k] = 1 - inv_logit(cuts[i, K-1]);
      }
      else{
        proba[i, k] = inv_logit(cuts[i, k]) - inv_logit(cuts[i, k-1]);
      }
      
    }
  }
  
  logit_proba = logit(proba);
}

In short, I’ve coded outcomes which fail to pass the hurdle as the largest category (this is arbitrary and could be recoded so that they are 0). For these outcomes, I add \log(1-\theta) to the target density. Here, \theta is the probability of passing the hurdle).

For the reamining outcomes, I use an ordered logistic likelihood, where the cuts vary by condition (essentially a group identifier). The data are counts of outcomes, and so the contributions to the target density are weighted by the counts.

I’m attempting to write this model in brms. Following some of the documentaiton, I’ve written a custom family as…

hurdle_ordinal <- custom_family(
  'hurdle_ordinal', dpars = c('theta', 'mu', 'Intercept'),
  links = c('logit', 'identity', 'identity'),
  type = 'int'
)

stan_density<-'
real hurdle_ordinal_lmpf(int y, real theta, real mu, ordered Intercept){
    if(y==0}
    return log1m(theta);
  }
  else{
    return log(theta) + ordered_logistic_lpmf(y |  mu, Intercept) ;  
  }
}'

stanvars <- stanvar(scode = stan_density, block='functions')


f<- brm(y|weights(weight) ~ condition, 
    data = brm_model_data,
    family = hurdle_ordinal,
    stanvars = stanvars)

But when I attempt to run the model I’m returned the following error:Error: Currently 'dirichlet' is the only valid prior for simplex parameters. See help(set_prior) for more details.

Clearly, there is a problem with the priors. Here is what is returned from get_prior

f<- bf(y|weights(weight) ~ condition, 
    family = hurdle_ordinal)

get_prior(f, data=brm_model_data)

  prior     class               coef group resp dpar nlpar   lb   ub       source
               (flat)         b                                                         default
               (flat)         b    conditionAugust                                 (vectorized)
               (flat)         b  conditionFebruary                                 (vectorized)
               (flat)         b   conditionJanuary                                 (vectorized)
               (flat)         b      conditionJuly                                 (vectorized)
               (flat)         b proba                                 (vectorized)
               (flat)         b     conditionMarch                                 (vectorized)
               (flat)         b       conditionMay                                 (vectorized)
               (flat)         b   conditionOctober                                 (vectorized)
               (flat)         b conditionSeptember                                 (vectorized)
 student_t(3, 2, 2.5) Intercept                                                         default
       logistic(0, 1)     theta                                          <NA> <NA>      default

I believe the Intercept class should have a student t prior, so I think that’s the problem.

Questions

Have I set up the custom family for my model correctly?
If not, how can I correctly set it up? Where is my error?

jsocolar · December 3, 2022, 3:26am

Not too familiar with ordered logistic regression, but the doc for ordered_logistic_lpmf seems to say that the argument c in ordered_logistic_lpmf(k | eta, c) should be an ordered vector of length at least 2. Your custom family is using Intercept as this argument, but for any row in the data Intercept will be a single real, not an ordered vector of length at least 2.

Solomon · December 8, 2022, 6:32pm

This doesn’t quite answer your question, but if you haven’t seen it, this GitHub issue is on topic: Feature request: adding a hurdle cumulative family · Issue #1429 · paul-buerkner/brms · GitHub

Stephen_Wild · January 18, 2023, 3:13pm

To follow up on this, a hurdle_cumulative() family is now in brms. You can use it by installing the dev version of brms from github. The hurdle value must be 0 or the lowest level in an ordered factor.

See an example here, though you’ll need to load brms normally rather than from local (loading locally was done to make sure it worked before merging forks).

Let me know what is broken.

Topic		Replies	Views
Ordinal Mixture Model w/ Set Outcome for One Component brms	28	2467	February 17, 2023
Custom gaussian hurdle family not quite working in brms brms	5	1634	March 31, 2022
Hierarchical ordered logit model in stan and brms Modeling rstan , brms	3	452	May 3, 2023
How do I reproduce the fit from conditional_effects function of brms package when using the Stan fit myself? brms	13	834	October 11, 2023
Choosing priors for a brms hurdle model brms specification , prior-choice , priors	2	1743	July 19, 2020

How can I write a hurdle ordinal model custom family in brms?

Questions

Related topics