(Edited after I found the correct citation)
In my younger days, we learned from Bill Venables’ Exegeses on Linear Models (still available on the Internet) never to use interactions without their marginal terms, so
`weight_loss ~ bmi + bmi:sex` was definitely not what the teacher expected.
I tried projpred today, and it happily returned such lonely interactions.
From Venables & Ripley, MASS, page 184, third edition:
A two-factor interaction a:b is marginal to any higher-order interaction that contains a and b. Fitting a model such as a + a:b leads to a model matrix where the columns corresponding to a:b are extended to compensate for the absent marginal term, b, and the fitted values are the same as if it were present. Fitting models with marginal terms removed, such as with a*b - b, generates a model with no readily understood statistical meaning… In other words, removing marginal factor terms from a fitted model is either statistically meaningless or futile in the sense that the model simply changes its parametrization to something equivalent.
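Venables’ point is easy to verify in base R. A minimal sketch (illustrative factor names, simulated data): fitting the “lonely” model `a + a:b` reproduces exactly the fitted values of the full model `a*b`, because R extends the interaction coding to compensate for the missing marginal term.

```r
# Two crossed factors, 5 observations per cell
set.seed(1)
d <- data.frame(a = gl(2, 10), b = gl(2, 5, 20), y = rnorm(20))

fit_full   <- lm(y ~ a * b,   data = d)  # expands to a + b + a:b
fit_lonely <- lm(y ~ a + a:b, data = d)  # marginal term b dropped, but...

# ...the two design matrices span the same column space,
# so the fitted values are identical:
all.equal(fitted(fit_full), fitted(fit_lonely))  # TRUE
```

Compare `model.matrix(fit_lonely)` with `model.matrix(fit_full)` to see the extended coding of the interaction columns.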
The current projpred doesn’t know about this structure. The new version, coming soon, supports formula syntax and knows about interactions, “random effects”, and smooths.
I’m keen to use projpred on a brm model containing random effects, interactions and multi-level factors. Can you please tell me if these are available in projpred-1.1.6? I can’t tell from the package NEWS (https://cran.rstudio.com/web/packages/projpred/news/news.html). If not, will the GitHub master contain these additions?
Thanks very much for a great package; and an impressive amount of help via these forums.
Thank you, everyone of the Stan developer community, for all the work! Moreover, the projpred package is really boosting my everyday work.
Hi @andd, the new features are still only in the `develop` branch. They required major refactoring of the code, so we need to be extra careful before making a new release. @AlejandroCatalina can provide a better estimate of when that will be merged to master, or point to an example of how to use the features in the `develop` branch.
As @avehtari mentions, we have to be very careful when merging the `develop` branch into master, as it involves a very thorough rewrite of many core sections. That said, all the functionality has been complete for a few weeks already, and we are planning to merge this into master soon (my personal estimate is before August). For a vignette showcasing the new functionality you can go to `vignettes/quickstart_glmm.html`. Do not hesitate to open an issue to discuss whatever you don’t find clear enough for your purposes. Keep in mind that you will need to install the `develop` branch from GitHub.
Brilliant: thanks @avehtari and @AlejandroCatalina for the very quick response. I’ll try the `develop` branch.
Trying again some time later, with projpred 2.0.2. The model is

```r
stan_glm(reflux_fu ~ (reflux_preop + weight_preop + z_age_op)*op_group, ...)
```

and this is `summary(varsel)`:
```
size solution_terms         elpd  elpd.se
   0 <NA>                   -104      2.2
   1 reflux_preop:op_group   -92      4.8
   2 op_group                -92      4.8
   3 z_age_op:op_group       -87      5.2
   4 weight_preop            -86      5.3
   5 weight_preop:op_group   -84      5.4
   6 reflux_preop            -84      5.4
   7 z_age_op                -84      5.4
```
So quite a few interaction terms without their marginals. Bill Venables again: “Fitting models with marginal terms removed such as with a*b - b generates a model with no readily understood statistical meaning.”
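A quick way to spot these violations in a selection path is a small helper that walks the ordered terms and flags any interaction that enters before both of its marginal main effects. This is a hypothetical convenience function, not part of projpred; it only inspects the term strings returned by `summary(varsel)`:

```r
# Hypothetical helper: flag interaction terms that appear in a selection
# path before all of their marginal main effects have entered.
check_marginality <- function(terms) {
  seen <- character(0)
  bad  <- character(0)
  for (tm in terms) {
    parts <- strsplit(tm, ":", fixed = TRUE)[[1]]
    if (length(parts) > 1 && !all(parts %in% seen))
      bad <- c(bad, tm)          # an interaction arrived too early
    seen <- c(seen, tm)
  }
  bad
}

path <- c("reflux_preop:op_group", "op_group", "z_age_op:op_group",
          "weight_preop", "weight_preop:op_group", "reflux_preop",
          "z_age_op")
check_marginality(path)
# "reflux_preop:op_group" "z_age_op:op_group"
```

Applied to the path above, it confirms that two of the three interactions enter before one of their main effects.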
Please help me to understand where I got lost. As an example of my old understanding, the help page of `dropterm` (MASS) says: “Try fitting all models that differ from the current model by dropping a single term, maintaining marginality.”
I read everything you write at least twice before I put my hands on the keyboard :-) Almost true.
Everything is as expected with stan_glmer. I had not tried it, because I wanted to understand simple cases first, and got stuck understanding the logic behind glm variable selection. Shouldn’t the marginality rules apply to glm too?
I guess we didn’t assume someone would use _glm for this. Would you like to create an issue in the projpred repo and ask that _glm also first include the main effects? Note that the _glmer projection also uses a different algorithm, as the projection is not otherwise well identified. Tagging also @AlejandroCatalina.
I will create an issue, but let me first better understand what is going on. I have one observation per subject, so I never thought of using `(1|subject_id)`. When I try this with glmer/lme4, Doug yells at me:
```
In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
  Model failed to converge with max|grad| = 0.0167366 (tol = 0.002, component 1)
```
and I do fully understand his anger, since there simply is no variance material to hammer upon.
stan_glmer works, but I am not surprised that the `(1|subject_id)` term ends up last in `varsel`; the results otherwise look ok. So it is unclear to me why you did not think someone would use _glm for this. Is using glmer with an orphaned `(1|subject_id)` the better choice?
The example shown in the original posting looks like this with stan_glmer, which is much more reasonable.
```
size solution_terms           elpd  elpd.se
   0 <NA>                     -104      2.1
   1 op_group                  -95      4.4
   2 reflux_preop              -92      4.8
   3 z_weight_preop            -91      5.0
   4 op_group:z_weight_preop   -88      5.3
   5 z_age_op                  -88      5.4
   6 op_group:z_age_op         -84      5.6
   7 op_group:reflux_preop     -84      5.6
   8 (1 | subject_id)          -84      6.4
```
Well, we didn’t think of this, as conceptually what you want (if I understood it correctly) is just to fit a multilevel model. The problem seems to be that you don’t have enough data to actually learn any group variances, but that is, I would say, a secondary problem. In that light I understand projpred’s output for this model, and it makes sense.
It’s true that we should handle this somehow if the user fits a stan_glm object.