Maximum likelihood via rstanarm::stan_glm

philb · November 18, 2022, 3:53pm

Statement of problem

I’m trying to evaluate the statement at the top of page 6 of the rstanarm documentation which states, with regard to the choice of algorithm, that “if there is no prior information, then [algorithm=“optimizing”] is equivalent to maximum likelihood…”, but I’m having trouble verifying this.

Main question

My question is: Having set the priors to be “flat” by making them NULL, I would expect the parameter estimates to be the same as the MLE, but that is not the case in my example below (compare fit1 to mean(y) below)

Other details

For comparison, I also include a basic location model inrstan (fit2 below), for which I was successful in recapitulating the MLE.
I set the convergence criteria to be minuscule to ensure that this was not premature convergence.
stan_lm returns an error upon setting algorithm="optimizing" (Error in match.arg(algorithm) : 'arg' should be one of “sampling”, “meanfield”, “fullrank”), which is why I resorted to using stan_glm with an identity link.
I also tried setting initial values in stan_glm but wasn’t successful in that regard. The documentation is a little sparse in this regard.

Code

Here is the code:

library(rstanarm)
library(rstan)

set.seed(1)
y <- rnorm(10)

conv_crit =  .Machine$double.eps
iter = 2000

fit1 <- stan_glm(y ~ 1,  
                 prior = NULL, 
                 prior_intercept = NULL, 
                 prior_aux = NULL,
                 family = "gaussian",
                 algorithm = "optimizing", 
                 seed = 1, 
                 refresh = 1,
                 verbose = T,
                 iter = iter,
                 tol_obj = conv_crit,
                 tol_rel_obj = conv_crit / .Machine$double.eps,
                 tol_grad = conv_crit,
                 tol_rel_grad = conv_crit / .Machine$double.eps,
                 tol_param = conv_crit)


stancode <- 'data {int<lower=1> n; real y[n];} parameters {real y_mean; real<lower = 0> sigma;} model {y ~ normal(y_mean, sigma);}'
mod <- stan_model(model_code = stancode, verbose = TRUE)
fit2 <- optimizing(mod, 
                   data = list(n = length(y), y = y),
                   seed = 1, 
                   refresh = 1,
                   verbose = T,
                   tol_obj = conv_crit,
                   tol_rel_obj = conv_crit / .Machine$double.eps,
                   tol_grad = conv_crit,
                   tol_rel_grad = conv_crit / .Machine$double.eps,
                   tol_param = conv_crit)

coef(fit1) #stan_glm
#(Intercept) 
# 0.1117049 
mean(y) #mle
# [1] 0.1322028
fit2$par[["y_mean"]] # rstan optimization
# [1] 0.1322028

Session info

> sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Big Sur 11.6.8

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rstan_2.21.5         ggplot2_3.3.6        StanHeaders_2.21.0-7
[4] rstanarm_2.21.3      Rcpp_1.0.8.3

philb · December 2, 2022, 3:22am

Solution: you can get the actual MAP (which equals MLE) by setting draws=0 in the call to stan_glm.

If the draws argument is not specified, then stan_glm.fit runs optimizing_args$draws <- 1000L. Later, the stanreg function calculates the median of these 1000 draws and uses this median as the reported estimate. Thus, when draws > 0, which is the default behavior of stan_glm with algorithm = "optimizing", you do not actually get the MAP returned. OTOH, if you set draws=0, you bypass sampling from the posterior entirely and get exactly the MLE as with the other approaches. I don’t think this is the intended behavior of stan_glm with algorithm="optimizing".

Topic		Replies	Views
What is stan_glm(algorithm = "optimizing") doing? rstanarm	1	557	September 7, 2020
New diagnostics for stan_glm and algorithm='optimizing' rstanarm	1	662	March 14, 2019
Stan_glm: defining prior distributions rstanarm	1	633	October 25, 2018
Rstanarm prior specification: stan_glm.nb() and stan_glm() Poisson rstanarm	6	1201	June 21, 2017
Converting unstandardized priors for rstanarm function arguments rstanarm priors , rstanarm	2	747	January 7, 2022