I want a lowess replacement in Stan (or rstanarm or brms)

andrewgelman · May 28, 2020, 5:58pm

People sometimes ask how they can help with Stan.

If you can program in C++, it’s my impression that there’s lots to do.

Below is an example of a project that would be useful that only requires programming in Stan (and some R, I guess).

We can use lowess to fit smoothed curves through data (it will do better than the sort of curve fitting favored by the U.S. government: https://statmodeling.stat.columbia.edu/2020/05/14/so-much-of-academia-is-about-connections-and-reputation-laundering/). But lowess has problems too: first, it’s not Bayesian, so it’s difficult to include prior information, get uncertainty estimates, etc.; second, it’s more of a procedure than a model, so it’s hard to use it as a component in larger models that include, for example, measurement error or varying coefficients.

This came up in a recent example with polling data where we wanted to fit a smooth curve through some survey estimates over time, and the smooth curve fit by the computer did all sorts of weird things.

For these reasons it would be good to have a “lowess equivalent” in Stan (maybe using splines or Gaussian processes, but not with the scaling problems of Gaussian processes as usually implemented; maybe building off existing options in rstanarm or brms) that would do the following three things:

1. It would run out of the box using its default settings and produce a smooth curve (a posterior summary such as a pointwise median, along with many posterior draws of the curve) that could in one line be plotted along with the data.
2. Like lowess, it would have one primary tuning parameter that could be set by the user. (Unlike lowess, this function in its default setting would average over or estimate the tuning parameter from the data.)
3. The model can include multiple predictors. I’m not sure how general the function should be here. I guess a starting point would be whatever lowess can do in that regard.
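
For reference, here is a minimal sketch of what lowess does as a procedure, to make concrete the single tuning parameter mentioned above (the span, i.e. the fraction of the data used in each local fit). It is written in plain Python rather than Stan or R, and it is a simplification: it does local linear fits with tricube weights but skips the robustness iterations of full lowess. The names `tricube` and `lowess_sketch` are made up for illustration and are not part of any package.

```python
def tricube(u):
    """Tricube kernel: (1 - |u|^3)^3 for |u| < 1, and 0 otherwise."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def lowess_sketch(x, y, span=2 / 3):
    """Simplified lowess: one weighted linear fit per point, no robustness passes.

    `span` is the one tuning parameter: the fraction of the data that
    contributes to each local fit. Larger span gives a smoother curve.
    Returns the fitted value at each x.
    """
    n = len(x)
    k = max(3, min(n, int(round(span * n))))  # number of neighbours per fit
    fitted = []
    for x0 in x:
        # Bandwidth: distance from x0 to its k-th nearest neighbour.
        h = sorted(abs(xi - x0) for xi in x)[k - 1]
        if h > 0:
            w = [tricube((xi - x0) / h) for xi in x]
        else:
            # Degenerate case (many tied x values): use only the ties.
            w = [1.0 if xi == x0 else 0.0 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        # Fall back to a weighted mean if the local slope is not identified.
        slope = sxy / sxx if sxx > 1e-12 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Usage: a noisy-looking series smoothed at two different spans.
xs = [float(i) for i in range(12)]
ys = [0.0, 1.5, 0.8, 2.9, 2.1, 4.2, 3.0, 5.1, 4.4, 6.3, 5.2, 7.0]
wiggly = lowess_sketch(xs, ys, span=0.3)
smooth = lowess_sketch(xs, ys, span=0.9)
```

The output is a single curve with no uncertainty attached, and the span has to be chosen by hand. Those are the two gaps that a model-based version would fill, by putting a prior on the smoothness and returning posterior draws of the curve.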

bgoodri · May 28, 2020, 7:00pm

Those have existed in both rstanarm and brms for years, except for (2) because you get the posterior distribution of the smoothness hyperparameters. Although they do (3), for the plots you refer to in (1), a choice has to be made about what values of the other predictors to graph the smooth function at.

andrewgelman · May 28, 2020, 8:10pm

I guess I need a vignette, then!

andrewgelman · May 28, 2020, 9:28pm

Also, we could allow (2) in this hypothetical function by allowing options for strong priors for the smoothness hyperparameters.

bgoodri · May 29, 2020, 1:37am

We have vignettes:

http://mc-stan.org/rstanarm/articles/glmer.html#relationship-to-gamm4

You also wrote a paper about life expectancy that used them.

andrewgelman · May 29, 2020, 1:40am

Really? I don’t remember that paper!

bgoodri · May 29, 2020, 1:43am

https://statmodeling.stat.columbia.edu/2017/03/30/aggregate-age-adjusted-trends-death-rates-non-hispanic-whites-minorities-u-s/

andrewgelman · May 29, 2020, 1:45am

Oh yeah, that! We never got around to writing this up as a paper, unfortunately!
