Gaussian process distributional parameter in brms?

I am working on implementing a relatively simple hazard model in brms in which there needs to be a smoothly varying baseline hazard, i.e. implemented using a Gaussian process or spline. Ideally, I would like this to be a separate dpar of length T, where T is the length of the observation period, and I could then integrate this with the individual hazard ratios in mu in a custom_family likelihood function. But I have not yet been able to figure out how to specify a vector-valued parameter that doesn’t correspond to any specific observations. One thought I had was to use ‘subset’ and create a set of dummy rows that the outcome will be computed against (i.e. a bunch of zeroes for the outcome, and the time as the input for the GP/spline), and then just use this in the true likelihood. But I didn’t know if there might be a better way of doing this that didn’t involve this kind of hacky workaround.

Thanks!

-Jon Zelner

  • Operating System: Debian
  • brms Version: 2.7

I am not sure there is a “nice” implementation of hazard models with functional baseline hazard in brms right now, but it is at least one my to do list for a native family (https://github.com/paul-buerkner/brms/issues/230).

If you can convince s() to take a matrix input and evaluate the a spline per observation then you could use the same data per row to have a spline constant across observations. The mgcv package, which brms uses to set up the splines, has a lot of functionality so there may be a chance that there is an option for this.

Hi Paul,

Thanks for the quick reply! So, the idea would be to give it a matrix w/rows 1,2,3,4,…,T repeating and then have it fit the same spline to each, and use that as an input to the LL for each observation? I can take a look at that; thanks!

I wonder if this problem might be made easier by modeling the cumulative hazard using a monotone function, and then passing the cumulative values over each increment to a zero and one outcome corresponding to each event? This would be analogous to a cox model without time-varying covariates.

best,

Jon

I am not an expert for these kinds of models but a very general approach is to represent the cox model via an expanded poisson likelihood. I believe there are some examples in the brms issue tracker and more generally on the internet abot this.

Thanks - the problem w/the Poisson approach is that it doesn’t work particularly well for rare outcomes for discrete individuals if you want to stick to a pretty low time-resolution. More generally, is there a way to define a latent vector-valued GP parameter without tying it to specific observations, but instead using it analogously to sigma for a Gaussian outcome?

-Jon

Not that I am aware of, but this may be a good feature request (https://github.com/paul-buerkner/brms) if we can write down a proper specification for the desired model class.

Will do! Going to try a custom family version of this; seems doable by including my own GP code in stanfuns and then bringing that into the LL?

Feel free to try it out. Would be happy to hear how it worked out.