Handling Missing Step Parameters in brms (acat family) Without Dummy Data

Hi everyone,

I’m working on a generalized partial credit model in brms using the acat family (logit link), specified as follows:

gpcm_formula <- bf(
  R2 | thres(gr = Item) ~ 1 + (1 | Person),  # Thresholds and latent trait
  disc ~ 0 + Item                            # Varying discrimination
)

gpcm_model_real <- brm(
  formula = gpcm_formula,
  data = dat,
  family = brmsfamily("acat", "logit"),
  prior = my_priors_df,
  cores = 2,
  chains = 2
)

I generated my custom prior list using get_prior based on a full dataset that includes all response categories:

prior_list <- get_prior(
  formula = gpcm_formula,
  data = dat_long_full_response,
  family = brmsfamily("acat", "logit")
)

However, when I fit the model using only a small sample of the full data, some response categories are unobserved for certain items. My intention is that, with strong priors in place, the posterior for those missing step parameters would essentially fall back on the prior.

Unfortunately, I encounter the following error related to the step parameters with missing categories:

Error: The following priors do not correspond to any model parameter: 
Intercept_F_125__2 ~ normal(3.10762643814087,1)
Intercept_F_126__2 ~ normal(3.86018347740173,1)
Intercept_F_1262__2 ~ normal(5.34091329574585,1)
Intercept_F_209__4 ~ normal(3.16134572029114,1)
Intercept_F_219__2 ~ normal(3.22295665740967,1)
Intercept_F_215__2 ~ normal(3.20417022705078,1)
Function 'default_prior' might be helpful to you.

It appears that when a response category is unobserved in the data, brms does not generate the corresponding step (model) parameter. Consequently, the custom prior for that parameter becomes “orphaned” and brms raises the error above.
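
To check which items are affected before fitting, the subsample can be compared with the full data. This is only a rough sketch on made-up toy data: the column names mirror the post, the data frames `dat_full` and `dat_small` are hypothetical, and it assumes brms infers the number of thresholds from the highest observed category per item.

```r
# Toy long-format data; column names mirror the post, values are made up.
dat_full <- data.frame(
  Item = rep(c("A", "B"), each = 6),
  R2   = c(1, 2, 3, 1, 2, 3,  1, 2, 3, 4, 1, 2)
)

# A small subsample in which item B never reaches category 4.
dat_small <- dat_full[c(1, 2, 3, 7, 8, 9), ]

# Highest observed category per item in each data set.
max_full  <- tapply(dat_full$R2,  dat_full$Item,  max)
max_small <- tapply(dat_small$R2, dat_small$Item, max)

# Number of thresholds that would go missing per item
# (assumption: thresholds = highest observed category - 1).
lost <- max_full - max_small[names(max_full)]

# Items with at least one lost threshold.
lost[lost > 0]
```

Any item listed at the end would have priors in the full prior table that no longer match a parameter in the subsample model.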

I understand that one workaround is to add dummy data (or dummy responses) for those missing response categories to force the estimation of all step parameters. However, I am interested in whether there’s a more seamless solution within brms or Stan to handle this situation without having to manually add dummy observations.

Specifically, I would like to know:

  • Can brms be configured to “force” the model to generate (model) parameters even if a particular response category is unobserved?
  • Are there alternative parameterizations or modeling strategies (e.g., “prior-only” estimation for missing categories) that would allow the prior to dominate when no data are available for a given step?

Any insights, alternative approaches, or recommendations on handling this issue would be greatly appreciated.

Thanks in advance for your help!

Best regards,

Sue

Yes, the required number of thresholds can be specified with the thres() addition term. From the brmsformula docs:

For all ordinal families, aterms may contain a term thres(number) to specify the number thresholds (e.g, thres(6)), which should be equal to the total number of response categories - 1. If not given, the number of thresholds is calculated from the data.
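
Applied to the formula from the question, a fixed count might look like the sketch below. It assumes, purely for illustration, that every item has 5 response categories (hence 4 thresholds); the name `f_fixed` is hypothetical, and the brms call is guarded so the snippet still runs where brms is not installed.

```r
# Plain R formula; thres(4, gr = Item) asks for 4 thresholds per item
# instead of letting brms count them from the data.
# Assumption: every item has 5 response categories.
f_fixed <- R2 | thres(4, gr = Item) ~ 1 + (1 | Person)

if (requireNamespace("brms", quietly = TRUE)) {
  # Same structure as gpcm_formula in the question, with a fixed threshold count.
  gpcm_formula_fixed <- brms::bf(f_fixed, disc ~ 0 + Item)
}
```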

Hello @martinmodrak. Thank you for sharing your valuable insights and suggestions! I really appreciate it!

In my case, items have a varying number of thresholds, so I had to rely on get_prior to get the list of parameters to estimate. I therefore believe that fixing the number of thresholds that way would not work for me.

This should still be possible - the rest of the paragraph from the docs (emphasis mine):

If different threshold vectors should be used for different subsets of the data, the gr argument can be used to provide the grouping variable (e.g, thres(6, gr = item), if item is the grouping variable). In this case, the number of thresholds can also be a variable in the data with different values per group.
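
A minimal sketch of that per-item approach, on made-up toy data: the threshold count is taken from the full data set and attached to the subsample as a column. The names `dat_full`, `dat_small`, `nthres_by_item`, and `n_thres` are hypothetical, and the brms call is guarded so the snippet still runs where brms is not installed.

```r
# Toy full data set; column names mirror the post, values are made up.
dat_full <- data.frame(
  Person = rep(1:6, times = 2),
  Item   = rep(c("A", "B"), each = 6),
  R2     = c(1, 2, 3, 1, 2, 3,  1, 2, 3, 4, 1, 2)
)

# Thresholds per item = highest category in the FULL data - 1.
nthres_by_item <- tapply(dat_full$R2, dat_full$Item, max) - 1

# Subsample in which item B never reaches category 4.
dat_small <- dat_full[c(1, 2, 3, 7, 8, 9), ]

# Attach the intended threshold count to every row of the subsample.
dat_small$n_thres <- as.integer(nthres_by_item[as.character(dat_small$Item)])

# thres() now reads the count from the data, one value per item.
f_var <- R2 | thres(n_thres, gr = Item) ~ 1 + (1 | Person)

if (requireNamespace("brms", quietly = TRUE)) {
  gpcm_formula_var <- brms::bf(f_var, disc ~ 0 + Item)
}
```

With this formula, calling get_prior on the subsample should list the same threshold parameters as the full data, so the strong priors for unobserved categories would have a parameter to attach to.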

@martinmodrak This worked great. Thank you so so much!!!