Projpred: latent projection for hierarchical ordinal models

Hi everyone,

I am back with another question on projection predictive variable selection for hierarchical models.

The developers kindly covered different aspects of categorical models in detail here. In the linked post, @fweber144 and @avehtari confirmed that the latent projection is not supported for hierarchical categorical models.

I am now looking at hierarchical ordinal models, and I wanted to check a few things.

When I create a hierarchical ordinal model using brms, perform appropriate posterior checks, create a reference model using refm_obj <- get_refmodel(ordinal.fit), and then run a preliminary cv_varsel(), projpred recommends using the latent projection. There is also the recommendation, for hierarchical models, to set nterms_max to the number of predictors minus one, which I did.
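
For context, here is a minimal sketch of the kind of setup I mean (the response variable y_ord, the data frame dat, and the family/link are placeholders; the predictors are taken from the submodel formula shown in the warnings below):

library(brms)
library(projpred)

# Hypothetical reference model fit; response name (y_ord), data frame name
# (dat), and family/link are placeholders for illustration:
ordinal.fit <- brm(
  y_ord ~ trust_factor + voice_factor + transparency_factor +
    interpersonal_treatment_factor + (1 | respondent_id) + (1 | location),
  data = dat,
  family = cumulative("logit")
)
refm_obj <- get_refmodel(ordinal.fit)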

When I instead use refm_obj <- get_refmodel(ordinal.fit, latent = TRUE), running the preliminary cv_varsel() with the code below:

cvvs_fast_ordinal_multilevel <- cv_varsel(refm_obj, validate_search = FALSE, method = "forward", refit_prj = FALSE, nterms_max = 7)

produces a long series of warnings:

Warning messages:
1: In warn_prj_drawwise(mssgs_warns_capts, throw_warn = throw_warn_sdivmin) :
The following messages and/or warnings have been thrown by the current submodel fitter (i.e., the current draw-wise divergence minimizer):

Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.00212214 (tol = 0.002, component 1)

2: In check_conv(outdmin, lengths(mssgs_warns_capts), do_check = do_check_conv) :
1 out of 20 submodel fits (there is one submodel fit per projected draw) might not have converged (appropriately). It is recommended to inspect this in detail and (if necessary) to adjust tuning parameters via ... (the ellipsis of the employed top-level function such as project(), varsel(), or cv_varsel()) or via a custom div_minimizer function (an argument of init_refmodel()). In the present case, the submodel fits are of class(es) c("lmerMod"). Documentation for corresponding tuning parameters is linked in section “Draw-wise divergence minimizers” of ?`projpred-package`. Current submodel formula (right-hand side): ~trust_factor + voice_factor + transparency_factor + interpersonal_treatment_factor + (1 | respondent_id) + (1 | location)
3: In loo_varsel(refmodel = refmodel, method = method, nterms_max = nterms_max, :

4: In warn_pareto(n07 = sum(pareto_k > 0.7), n05 = sum(0.7 >= pareto_k & :
In the calculation of the reference model’s PSIS-LOO CV weights, 2 (out of 1739) Pareto k-values are > 0.7 and 53 (out of 1739) Pareto k-values are in the interval (0.5, 0.7]. Moment matching (see the loo package), mixture importance sampling (see the loo package), and reloo-ing (see the brms package) are not supported by projpred. If these techniques (run outside of projpred, i.e., for the reference model only; note that reloo-ing may be computationally costly) result in a markedly different reference model ELPD estimate than ordinary PSIS-LOO CV does, we recommend to use K-fold CV within projpred.
5: In warn_pareto(n07 = sum(y_lat_E$pareto_k > 0.7), n05 = sum(0.7 >= :
In the recalculation of the latent response values, 41 (out of 1739) expectation-specific Pareto k-values are > 0.7 and 109 (out of 1739) expectation-specific Pareto k-values are in the interval (0.5, 0.7]. In general, we recommend K-fold CV in this case.
6: In loo_varsel(refmodel = refmodel, method = method, nterms_max = nterms_max, :
The projected draws used for the performance evaluation have different (i.e., nonconstant) weights, so using standard importance sampling (SIS) instead of Pareto-smoothed importance sampling (PSIS). In general, PSIS is recommended over SIS.

So we have multiple issues: some submodel fits may not have converged, and several Pareto k-values are high. Does anyone have experience with this? I can run this projection without the hierarchical structure with no problems, but the data are hierarchical, so it would be better to perform the variable selection on the hierarchical model.
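
For reference, my understanding is that the K-fold CV suggested in warnings 4 and 5 would look something like this (K = 5 is an arbitrary choice for illustration):

# K-fold CV within projpred instead of PSIS-LOO CV, as the Pareto-k
# warnings suggest (K = 5 is an arbitrary choice here):
cvvs_kfold_ordinal_multilevel <- cv_varsel(
  refm_obj,
  cv_method = "kfold",
  K = 5,
  method = "forward",
  nterms_max = 7
)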

I would also like to ask whether anyone knows if it's possible to adapt the code provided here by @fweber144 in the same post I linked above. That code shows how to obtain the observation-wise contributions to the GMPD for categorical models; something similar would be very helpful as a post-projection posterior check for ordinal models.

Thank you so much for any assistance!

Hi Melissa,

Thank you for the detailed description of your problem.

  • Could you provide a two-way contingency table for respondent_id and location (i.e., for the combination of these two variables)?
  • Could you try adding control = lme4::lmerControl(optimizer = "Nelder_Mead") to the cv_varsel() call (see the sketch after this list)?
  • Could you switch to the most recent GitHub version of projpred? Compared to v2.8.0, there have been some improvements with regard to the Pareto-k warnings and also with regard to the use of PSIS / SIS.
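
In case it helps, here is roughly what I have in mind (with dat as a placeholder for your data frame):

# Two-way contingency table for the grouping variables ("dat" is a
# placeholder for your data frame):
table(dat$respondent_id, dat$location)

# Pass the lme4 control settings through cv_varsel()'s ellipsis so they
# reach the draw-wise divergence minimizer:
cvvs_fast_ordinal_multilevel <- cv_varsel(
  refm_obj,
  validate_search = FALSE,
  method = "forward",
  refit_prj = FALSE,
  nterms_max = 7,
  control = lme4::lmerControl(optimizer = "Nelder_Mead")
)

# Installing the GitHub version of projpred:
remotes::install_github("stan-dev/projpred")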

I’ll try to write some code for this asap.