Bridgesampling with rstanarm?

Robert_Feldt · November 17, 2017, 6:40am

Given the recent blog post about the bridgesampling R package, I wanted to try it on some rstanarm stanreg objects. However, at least for a Poisson GLM fit, not all the information that bridgesampling needs is retained:

> bridge_sampler(mvr_stan_poisson1$stanfit)
Error in object@.MISC$stan_fit_instance$unconstrain_pars(pars) : 
  variable gamma missing

Is there some way to get around this or can I only use this with “manually” written Stan models?

anon75146577 · November 17, 2017, 6:43am

Can you give a minimal example that does this (i.e., all the code and data needed to reproduce the error)?

Otherwise it’s hard to diagnose.

Robert_Feldt · November 17, 2017, 6:47am

The full dataset used for this is a bit large but the stan_glm call is this:

> mvr_stan_poisson1 <- stan_glm(Fix ~ Experience + Ability + treatment + System + Lab, 
                      data = mvr, family = poisson, 
                      prior = normal(0,20.0), prior_intercept = normal(0,20.0),
                      chains = CHAINS, cores = CORES, seed = SEED, iter = ITER)
> bridge_sampler(mvr_stan_poisson1$stanfit)
Error in object@.MISC$stan_fit_instance$unconstrain_pars(pars) : 
  variable gamma missing

I’ll see if I can cut down the dataset to something smaller/minimal that has the same behavior.

Should I take this to mean that bridgesampling on rstanarm models is generally supported, and that my failure here comes from something specific to my model or dataset?

Robert_Feldt · November 17, 2017, 6:51am

Actually, the same behavior happens for the warpbreaks dataset included in R:

> breaksmodel<-stan_glm(breaks~wool*tension, warpbreaks, family=poisson)
> bridge_sampler(breaksmodel$stanfit)
Error in object@.MISC$stan_fit_instance$unconstrain_pars(pars) : 
  variable gamma missing

bgoodri · November 17, 2017, 2:48pm

From ?bridge_sampler

Note
To be able to use a stanreg object for samples, the user crucially needs to have specified the diagnostic_file when fitting the model in rstanarm.

See the diagnostic_file argument to rstan::sampling that is passed through the ... by model fitting functions in rstanarm.

Henrik_Singmann · November 17, 2017, 7:56pm

A minimal working example (from ?bridgesampling and written by Ben Goodrich):

library(bridgesampling)
library(rstanarm)

# N.B.: remember to specify the diagnostic_file

fit_1 <- stan_glm(mpg ~ wt + qsec + am, data = mtcars,
                  chains = 2, cores = 2, iter = 5000,
                  diagnostic_file = file.path(tempdir(), "df.csv"))
bridge_1 <- bridge_sampler(fit_1)
fit_2 <- update(fit_1, formula = . ~ . + cyl)
bridge_2 <- bridge_sampler(fit_2, method = "warp3")
bf(bridge_1, bridge_2)
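
Applied to the warpbreaks Poisson model from earlier in the thread, the same fix might look like the sketch below. It assumes the only changes needed are setting diagnostic_file and passing the stanreg object itself (rather than breaksmodel$stanfit) to bridge_sampler(). The file name "breaks_df.csv" and the object name bridge_breaks are made up for illustration, and the chains/iter settings are simply copied from the example above.

```r
library(bridgesampling)
library(rstanarm)

# Same warpbreaks model as in the failing call, but with diagnostic_file set
# (assumed to be the missing piece, per the note in ?bridge_sampler).
breaksmodel <- stan_glm(breaks ~ wool * tension, data = warpbreaks,
                        family = poisson,
                        chains = 2, cores = 2, iter = 5000,
                        diagnostic_file = file.path(tempdir(), "breaks_df.csv"))

# Pass the stanreg object, not breaksmodel$stanfit.
bridge_breaks <- bridge_sampler(breaksmodel)
print(bridge_breaks)
```
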

Robert_Feldt · November 19, 2017, 8:48pm

Thanks. Sorry for missing this in the docs.

jrrae · March 28, 2018, 8:30am

When I run your code I get a Bayes factor of ~ 10. While I know that the labels used to describe Bayes factors are arbitrary, I often see values of 10 referred to as “strong evidence”.

By contrast, I added a couple of lines of code (below) to compare these models using leave-one-out cross-validation (LOO). I don’t find much difference between the models (i.e., the difference is less than 1 SE). Is it odd that these two methods paint such a different picture?

library(bridgesampling)
library(rstanarm)

# N.B.: remember to specify the diagnostic_file

fit_1 <- stan_glm(mpg ~ wt + qsec + am, data = mtcars,
                  chains = 2, cores = 2, iter = 5000,
                  diagnostic_file = file.path(tempdir(), "df.csv"))
bridge_1 <- bridge_sampler(fit_1)
fit_2 <- update(fit_1, formula = . ~ . + cyl)
bridge_2 <- bridge_sampler(fit_2, method = "warp3")
bf(bridge_1, bridge_2)

loo_1 <- loo(fit_1)
loo_2 <- loo(fit_2)
# N.B.: newer rstanarm versions deprecate compare_models() in favor of loo_compare()
compare_models(loo_1, loo_2)

Henrik_Singmann · March 28, 2018, 8:53am

The ‘art’ of marginal likelihood or Bayes factor based model selection is in choosing appropriate parameter priors. Stated more dramatically, this type of model selection hinges very strongly on the parameter priors (e.g., Lindley’s paradox). This is probably one of the reasons why many statisticians (among them at least some of the more vocal people behind the development of Stan, example here) do not advocate the use of Bayes factors. This may also be one of the reasons why they have not been adopted very widely.

So how does this relate to the example? In this example, the parameter priors were not chosen with Bayes factor based model selection in mind, so it is not clear whether the results are at all sensible. They are probably not.

In general, priors that are appropriate for estimation are often not appropriate for Bayes factors. There is quite a bit of literature on how to choose such priors, some of which I list below. However, I think it is fair to say that the priors implemented in rstanarm are generally not appropriate for this task.

Rouder, J. N., Morey, R. D., Speckman, P. L., & Province, J. M. (2012). Default Bayes factors for ANOVA designs. Journal of Mathematical Psychology, 56(5), 356–374. https://doi.org/10.1016/j.jmp.2012.08.001
Bayarri, M. J., & García-Donato, G. (2007). Extending Conventional Priors for Testing General Hypotheses in Linear Models. Biometrika, 94(1), 135–152.
Ly, A., Verhagen, J., & Wagenmakers, E.-J. (2016). Harold Jeffreys’s default Bayes factor hypothesis tests: Explanation, extension, and application in psychology. Journal of Mathematical Psychology, 72(Supplement C), 19–32. https://doi.org/10.1016/j.jmp.2015.06.004

We also discuss some prior examples in our bridgesampling paper: http://arxiv.org/abs/1710.08162
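
To make the prior dependence concrete, here is the standard textbook case. It is not taken from the thread, and all symbols are introduced only for illustration: a sample mean \bar{y} of n observations with known variance \sigma^2, comparing M_0: \theta = 0 against M_1: \theta \sim N(0, \tau^2).

```latex
\begin{align*}
p(y \mid M) &= \int p(y \mid \theta, M)\, p(\theta \mid M)\, d\theta,
\qquad
\mathrm{BF}_{01} = \frac{p(y \mid M_0)}{p(y \mid M_1)} \\[4pt]
\bar{y} \mid M_0 &\sim \mathcal{N}\!\left(0, \tfrac{\sigma^2}{n}\right),
\qquad
\bar{y} \mid M_1 \sim \mathcal{N}\!\left(0, \tfrac{\sigma^2}{n} + \tau^2\right) \\[4pt]
\mathrm{BF}_{01} &= \sqrt{1 + \frac{n\tau^2}{\sigma^2}}\;
\exp\!\left(-\frac{z^2}{2}\cdot\frac{n\tau^2}{\sigma^2 + n\tau^2}\right),
\qquad z = \frac{\bar{y}}{\sigma/\sqrt{n}}
\end{align*}
```

For fixed z, the exponential factor tends to exp(-z^2/2) as \tau grows, while the square-root factor grows without bound, so a sufficiently wide prior favors M_0 whatever the data show. The posterior for \theta under M_1, by contrast, barely changes once \tau is large relative to \sigma/\sqrt{n}. That is one way to see why a prior that is harmless for estimation can dominate a Bayes factor.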

jrrae · March 28, 2018, 9:46am

This is a very helpful answer - much appreciated.

You say that priors appropriate for estimation are often not appropriate for Bayes factors. I was hoping that one of the papers you listed would provide a fuller explanation of why this is the case (I didn’t see anything like this in my quick perusal of each paper). Do you happen to know of a paper that discusses this issue a bit more? While I’ve given thought to choosing priors for estimation, I’m a little unsure how or why these need to be adjusted for computing Bayes factors.

wds15 · March 28, 2018, 10:41am

See here why it’s hard to get numerically

and Lindley’s paradox ties into this (from a different angle though).

Henrik_Singmann · March 28, 2018, 1:32pm

I mean, they all kind of talk about that. However, a paper that more specifically compares different priors and talks about estimation versus Bayes factors is:
Rouder, J. N., Haaf, J. M., & Vandekerckhove, J. (2018). Bayesian inference for psychology, part IV: parameter estimation and Bayes factors. Psychonomic Bulletin & Review, 25(1), 102–113. https://doi.org/10.3758/s13423-017-1420-7

Maybe also relevant: https://psyarxiv.com/wmf3r/

bgoodri · March 29, 2018, 3:40am

I can confirm that we did not think at all about Bayes factors when choosing the default priors in rstanarm, but some form of hs() or product_normal() might work well.
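
As a rough, untested sketch of what trying this could look like, the mtcars comparison from earlier in the thread can be refit with prior = hs(). The hs() defaults are used unchanged and are not tuned for Bayes factors, the object and file names are made up for illustration, and whether the resulting Bayes factor is sensible is exactly the open question here.

```r
library(bridgesampling)
library(rstanarm)

# Hypothetical refit of the two mtcars models with a horseshoe prior on the
# coefficients; each fit gets its own diagnostic_file for bridge_sampler().
fit_1_hs <- stan_glm(mpg ~ wt + qsec + am, data = mtcars,
                     prior = hs(),
                     chains = 2, cores = 2, iter = 5000,
                     diagnostic_file = file.path(tempdir(), "df_hs_1.csv"))
fit_2_hs <- stan_glm(mpg ~ wt + qsec + am + cyl, data = mtcars,
                     prior = hs(),
                     chains = 2, cores = 2, iter = 5000,
                     diagnostic_file = file.path(tempdir(), "df_hs_2.csv"))

# Estimate the log marginal likelihoods and form the Bayes factor.
bridge_1_hs <- bridge_sampler(fit_1_hs)
bridge_2_hs <- bridge_sampler(fit_2_hs)
bf(bridge_1_hs, bridge_2_hs)
```

Swapping hs() for product_normal() in both calls would give the other variant mentioned above.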
