New package for sensitivity analysis using importance sampling

cmccartan · September 9, 2020, 12:46am

If you’re like me, you know that checking the sensitivity of your inferences to priors or other likelihood details is important, especially in complex models. But re-compiling and re-fitting a Stan program for all possible combinations of such adjustments is time-consuming. Doing manual importance sampling is possible, but not particularly user-friendly, and to my knowledge, there aren’t any other alternatives.

To help with this, I’ve created an R package which uses Pareto-smoothed importance sampling to understand how model inferences change under alternative specifications. It’s available here and is perhaps best illustrated by the vignette which walks through a sensitivity analysis of the hierarchical 8-schools model.

Basically the package provides functions to (1) provide the alternative specifications you’d like to explore, (2) do the importance sampling, and (3) examine posterior quantities of interest for each of the specifications.

The workflow/API is still experimental, so I’m open to any suggestions in that direction, or any other comments!

bnicenboim · September 9, 2020, 6:17am

This looks great. I always wondered if one could use PSIS for sensitivity analysis with Bayes factor, using bridge sampling. Do you think that it would be possible?

torkar · September 9, 2020, 6:30am

Very nice package and I like the systematic way you test the effect of different priors. I would definitely be nice to have this together with brms… ;)

cmccartan · September 9, 2020, 7:05pm

Thank you! The package as-is will work with brms, provided you pass brms_obj$fit instead of just brms_obj. You can use extract_samp_stmts(brms_obj$fit) to get the names of Stan parameters & their sampling statements—these don’t generally match up with the brms output.

~~I’ll work on implementing direct handling of brms objects so that they can be passed directly.~~

Edit: I’d forgot I’d already implemented this for the main adjust_weights function—you can just pass a brms fit object directly to it. I’ll add this same functionality to the other functions soon.

torkar · September 10, 2020, 6:16am

I’ll definitely try this out for the next analysis!

Guido_Biele · September 10, 2020, 8:13am

This looks to be useful, @cmccartan.

I have so far only red the vignette, so I hope this question is not ignorant:

I wonder if one important condition for getting useful weights is that the baseline-model could successfully explore the posterior (no divergences, no problems with BMFI, …). What do you think?

cmccartan · September 14, 2020, 8:10pm

The validity of the original samples is conditional on successful exploration, and since the importance sampling just re-weights the original samples, the validity of the alternative fits is also conditional on there being no problems with the original sampling.

However, even with a well-specified and well-explored original model, if the alternative models are too different, the distribution of the importance weights can still be problematic and lead to unreliable and highly variable estimates (which can be diagnosed in part with the Pareto k statistic).

So successful exploration is necessary but not sufficient.

petrovska · April 25, 2021, 7:20am

Thank you for the package,

I tried it with brms negative binomial hurdle model and it flaged out an error when I coded adjust_weights (model$fit):

Error in str2lang(x) : :1:35: unexpected ‘[’
1: Y ~ hurdle_neg_binomial_log_logit([`

Is it available for all classes of brms models?

saudiwin · April 25, 2021, 8:14am

This is black magic! (in a totally good, morally acceptable way).

One question - will this method work with latent variables more generally? I.e. if you’re using the 8-school examples it works with hierarchical parameters, but am wondering whether latent quantities would interfere with the importance sampling.

Also is there an associated paper?

cmccartan · April 26, 2021, 2:24am

Yes, it should work for all brms models! Did you provide a set of specifications (with make_spec()) to the adjust_weights function too, or just the model$fit? If you have a reproducible example that’s not working, I’m happy to dig into it to figure out what’s going on.

cmccartan · April 26, 2021, 2:26am

Yes, this will work with latent variables as well, though of course the farther away your alternate specification is from the fit model the less accurate the importance sampling estimates will be.

As far as a paper, this package is just doing plain importance sampling + pareto smoothing on the weights, so the PSIS paper is probably the most relevant citation (Vehtari, A., Simpson, D., Gelman, A., Yao, Y., & Gabry, J. (2015). Pareto smoothed importance sampling. arXiv preprint arXiv:1507.02646). The only thing the package adds to that is the ability to do this automatically with Stan models.

saudiwin · April 26, 2021, 4:52am

Yes I had seen that earlier paper, this still strikes me as a fairly original application of importance sampling. Perhaps though I just did not read it close enough. In any case, putting it all together in a function is very helpful!

petrovska · June 18, 2021, 3:08pm

Unfortunately, brms hurdle Poisson model did not work with the package.
My model is:

p1=c(prior_string("student_t(5,0,0.35)", class="Intercept"),
     prior("normal(0,0.2", class="b"),
     prior("student_t(5,0,0.21)", class="sd"),
     prior("logistic(-3,0.03)", class="Intercept", dpar="hu"),
     prior("normal(0,0.03", class="b", dpar="hu"),
     prior("student_t(5,0,0.03)", class="sd", dpar="hu"))

f2=brmsformula(count~ba_sqrt+dbh_log+species.p+species.m+dead_neigbour+y+(1+y|subplot_id)+(1|id.m),
               hu~ba_sqrt+species.p+y+(1+y|subplot_id)+(1|id.m), center=TRUE, decomp="QR")
fit2=brm(f2, data=d, family=hurdle_poisson(link = "log"), save_pars=save_pars(all = TRUE), cores  = 4, iter = 1000 + 4000, warmup = 1000, chains = 4, seed=123,
         sample_prior="yes",prior=p1, silent=TRUE, open_progress=FALSE, control = list(adapt_delta = 0.85, max_treedepth = 10))

extract_samp_stmts(fit2) # or
extract_samp_stmts(fit2$fit)

when I would like to extract the name of parameters, it gave me an error:

Error in str2lang(x) : <text>:1:30: unexpected '['
1: Y ~ hurdle_poisson_log_logit([
                                 ^

How to overcome this?

cmccartan · June 18, 2021, 5:20pm

Seems like something in the brms model specification is throwing it off.

Could you post a reproducible example (i.e, with data & other variables defined) as an issue: Issues · CoryMcCartan/adjustr · GitHub?
If not, sharing the generated brms model code would be helpful.

Topic		Replies	Views
Priorsense package powerscale_sensitivity function on brmsfit object returns error brms techniques , specification , brms	13	367	July 15, 2024
Bayes factor using brms brms marginal-likelihood , bayes-factor	23	6489	November 14, 2021
Brms variational inference and importance_resampling brms	1	39	June 21, 2025
Survey weights in brms/stan - Simulation based on design effect, feedback sought! Modeling techniques	12	2550	September 9, 2022
How to update a brmsfit object with a modified brms-generated stan model marginalizing over a distribution of weights Modeling techniques , specification	9	2770	December 20, 2021

New package for sensitivity analysis using importance sampling

Related topics