Splines in matched-data analysis with rstanarm

Dear all,

I’m preparing a case-control study with a binary outcome in which I’d like to model the numerical covariates with splines; I’ll have 3-5 controls per case (we haven’t made a final decision yet). The model should account for the matched nature of the data, so normally I’d go for stan_clogit and use its strata argument. However, because I’d like to use splines, I was planning on using stan_gamm4, which doesn’t seem to support conditioning on stratum out of the box.

Thus, I was wondering if others have done this before and might be able to give some advice on the best approach. I suppose it would be possible to simply pre-compute the data matrix for the smoothers and use it in stan_clogit, but it would be neat to be able to use the machinery of stan_gamm4, e.g. plot_nonlinear. Including stratum as a random-effects intercept might be another approach, but it would change the nature of the estimates.
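
For concreteness, here is a rough sketch of what I mean by pre-computing the spline columns. It assumes a natural cubic spline basis from splines::ns with df = 3 (an arbitrary choice, not a recommendation) and uses the infert data that ships with R as a stand-in for my own data. fit_clogit_ns is just a hypothetical helper name, and I haven’t checked the resulting fit.

```r
library(splines)  # ships with R; provides ns() for natural cubic splines

# stan_clogit expects the rows to be ordered by stratum
dat <- infert[order(infert$stratum), ]

# Pre-compute a natural cubic spline basis for age
# (df = 3 is an arbitrary illustrative choice)
age_basis <- ns(dat$age, df = 3)
colnames(age_basis) <- paste0("age_ns", seq_len(ncol(age_basis)))

# Append the basis columns to the data as ordinary numeric covariates
dat <- cbind(dat, as.data.frame(age_basis))

# Hypothetical helper: conditional logit on the pre-computed columns.
# Defined but not called here, since sampling takes a while.
fit_clogit_ns <- function(data, ...) {
  rstanarm::stan_clogit(
    case ~ age_ns1 + age_ns2 + age_ns3 + spontaneous + induced,
    strata = stratum, data = data, QR = TRUE, ...
  )
}
```

Calling fit_clogit_ns(dat) would then run the model. The downside is that plots and predictions at new ages have to be built by hand, e.g. by evaluating the same basis with predict(age_basis, newx) so that the knots stay fixed.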

Might it be possible to actually condition on strata with stan_gamm4 (perhaps in a future release)?


  • Operating System: Mac OS X Catalina
  • rstanarm Version: 2.19.2

I can do a stan_clogit with a restricted cubic spline like

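library(rstanarm)
dat <- infert[order(infert$stratum), ] # stan_clogit needs the data ordered by strata
# restricted cubic spline for age via rms::rcs(), conditioning on stratum
post <- stan_clogit(case ~ rms::rcs(age) + spontaneous + induced +
                      (1 | education), strata = stratum,
                    data = dat, subset = parity <= 2, QR = TRUE)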

but using mgcv::s() does not seem to work with stan_clogit. Perhaps that could be fixed. I realize that you would rather call stan_gamm4 with a clogit likelihood but that seems more difficult since it internally goes through mgcv::jagam these days.


Thanks, @bgoodri! Excellent solution (even better than pre-computing the splines), especially because I wanted to use restricted cubic splines but hadn’t thought of rms; I’m still quite new to splines. Not using stan_gamm4 isn’t a big problem, but it would be cool if stan_clogit were to support mgcv::s(). Should I open an issue on GitHub?

Go ahead