Conditional effects from stan_clogit fit

benskov · August 21, 2020, 8:03am

Dear all,

I’ve used rstanarm::stan_clogit to estimate the effect of a certain intervention on mortality. In short, our design is a matched risk-set case-control study with 1 case and 5 controls per risk set/stratum. I used rms::rcs as suggested previously but I’m now struggling a bit with getting conditional effects of the splines. brms (which has the nice conditional_effects function) doesn’t support the conditional logistic likelihood, as far as I know, which is why I’m sticking with rstanarm. It is possible to re-cast the conditional logistic regression as a Cox regression, but that requires defining the strata, and I can’t seem to find out how to do that in brms.

I feel like I’m missing something obvious, but since I’m quite stuck, I wanted to reach out to the community and hopefully get some pointers on moving forwards with this.

So, to continue with the example from my other post:

library(dplyr)
library(rstanarm)
library(rms)
dat <- arrange(infert, stratum) %>% 
	filter(parity <= 2) # yields 1 case and 2 controls per stratum
post <- stan_clogit(case ~ rcs(age) + spontaneous + induced, 
                    strata = stratum, data = dat, QR = TRUE)

As per the posterior_predict documentation,

Drawing from the posterior predictive distribution at interesting values of the predictors also lets us visualize how a manipulation of a predictor affects (a function of) the outcome(s).

All good, but the Details section specifies,

For models estimated with stan_clogit , the number of successes per stratum is ostensibly fixed by the research design. Thus, when doing posterior prediction with new data, the data.frame passed to the newdata argument must contain an outcome variable and a stratifying factor, both with the same name as in the original data.frame. Then, the posterior predictions will condition on this outcome in the new data.

My question is, then, if I wanted to see how age affects the outcome (e.g. over a grid in [18:45]) for women with spontaneous = 0 and induced = 0, how would I go about it? That is, what newdata should I give the posterior_predict? Or am I completely off track and need to do something else? I’m running rstanarm v2.18.2.

Cheers,
Ben

bgoodri · August 22, 2020, 5:04pm

I would use the latest rstanarm to get the posterior_epred() function, but it is equivalent to posterior_linpred(*, transform = TRUE) in earlier versions of rstanarm. Anyway, you could do something like

nd <- data.frame(age = 18:45, spontaneous = 0, induced = 0, stratum = "1", 
                 case = c(1, rep(0, 27))))
mu <- posterior_epred(post, newdata = nd)
summary(rowSums(mu)) # all 1.0

benskov · August 27, 2020, 4:21pm

Good stuff, thanks! The stuff about the case column threw me a bit off, but if I understand correctly it doesn’t matter which of the newdata-patients is given the status as case, because it’s solely needed to make the conditional likelihood behave a give correct predictions. Or am I wrong about that?

In general, @bgoodri, thanks for all your help with this! If it’s okay with you, I’d like to acknowledge your help in our manuscript if you don’t mind. If you would prefer for me to not do that, that’s of course very fine – I’ll only do it if I get explicit consent :-)

bgoodri · August 28, 2020, 2:52pm

I compared the above mu with another one that puts the case at the end

nd2 <- data.frame(age = 18:45, spontaneous = 0, induced = 0, stratum = "1", 
                  case = c(rep(0, 27), 1))
mu2 <- posterior_epred(post, newdata = nd2)

and they were identical. So, I think you are good. I just wouldn’t try to do anything afterward that assumed a different number of successes in a stratum than what the model conditioned on.

If you want to put something in a footnote, go ahead. Stan needs the credit more than I do.

benskov · August 28, 2020, 6:36pm

Yeah, my way to the conclusion was same. And I definitely won’t start mingling with other numbers of cases per stratum.

Re acknowledge: great! I already cite Stan in as many ways I can think of – it definitely needs all the publicity it can get.

benskov · September 7, 2020, 8:44am

I’ve bumped into two problems, and maybe they’re more appropriate for the Github repo, but just wanted to see if I’m completely on the wrong track first.

Prep work

library(rstanarm)
library(dplyr)

dat <- arrange(infert, stratum) %>% 
	filter(parity <= 2) %>% 
	mutate(test_covar = rnorm(n(), 10, 3))

# Shared newdata object
nd <- data.frame(age = 18:45, spontaneous = 0, induced = 0, stratum = "1", 
				 education = "0-5yrs", case = c(1, rep(0, 27)), test_covar = 8)

WORKS (equivalent to above)

post <- stan_clogit(case ~ rms::rcs(age) + spontaneous + induced, strata = stratum,
					data = dat, QR = TRUE)
mu <- posterior_epred(post, newdata = nd)

FAILS

posterior_epred doesn’t like factors when newdata has only one level. If I manually code the contrast it of course works, because the coefficient in essence becomes numerical

post2 <- stan_clogit(case ~ rms::rcs(age) + spontaneous + induced + education, strata = stratum,
					 data = dat, QR = TRUE)
mu2 <- posterior_epred(post2, newdata = nd)

Output:

Error in stanmat[, beta_sel, drop = FALSE] : subscript out of bounds

FAILS

posterior_epred, it seems, tries to create splines based on the nd argument (with only one value, so that’s of course impossible) instead of simply evaluating the single values based on the learnt spline coefficients.

post3 <- stan_clogit(case ~ rms::rcs(age) + spontaneous + induced + rms::rcs(test_covar), strata = stratum,
					 data = dat, QR = TRUE)
mu3 <- posterior_epred(post3, newdata = nd)

Output:

fewer than 3 unique knots.  Frequency table of variable:
x
 8 
28 
Error in rcspline.eval(x, nk = nknots, inclx = TRUE, pc = pc, fractied = fractied) : 
  
In addition: Warning message:
In rcspline.eval(x, nk = nknots, inclx = TRUE, pc = pc, fractied = fractied) :
  5 knots requested with 1 unique values of x.  knots set to -1 interior values.

Basically, should I open an issue or two at Github? Or am I on a wrong track?

Cheers,
Ben

bgoodri · September 9, 2020, 6:46pm

Yeah. Thanks.

benskov · September 11, 2020, 11:36am

Done, and thank you (everybody) for all your work!

Topic		Replies	Views
Splines in matched-data analysis with rstanarm rstanarm techniques , specification	3	822	April 21, 2020
Robust conditional logistic regression with rstanarm::stan_clogit? rstanarm techniques	2	875	May 4, 2020
Another problem with "stan_clogit" and "posterior predict" RStan	1	305	March 29, 2023
Does rstanarm::plot_nonlinear produce marginal or conditional plots? rstanarm	2	538	April 6, 2020
Brms implementation of clogit? General	1	580	November 9, 2018

Conditional effects from stan_clogit fit

Prep work

WORKS (equivalent to above)

FAILS

FAILS

Related topics