New package for sensitivity analysis using importance sampling

If you’re like me, you know that checking the sensitivity of your inferences to priors or other likelihood details is important, especially in complex models. But re-compiling and re-fitting a Stan program for all possible combinations of such adjustments is time-consuming. Doing manual importance sampling is possible, but not particularly user-friendly, and to my knowledge, there aren’t any other alternatives.

To help with this, I’ve created an R package which uses Pareto-smoothed importance sampling to understand how model inferences change under alternative specifications. It’s available here and is perhaps best illustrated by the vignette which walks through a sensitivity analysis of the hierarchical 8-schools model.

Basically the package provides functions to (1) provide the alternative specifications you’d like to explore, (2) do the importance sampling, and (3) examine posterior quantities of interest for each of the specifications.

The workflow/API is still experimental, so I’m open to any suggestions in that direction, or any other comments!

9 Likes

This looks great. I always wondered if one could use PSIS for sensitivity analysis with Bayes factor, using bridge sampling. Do you think that it would be possible?

Very nice package and I like the systematic way you test the effect of different priors. I would definitely be nice to have this together with brms… ;)

Thank you! The package as-is will work with brms, provided you pass brms_obj$fit instead of just brms_obj. You can use extract_samp_stmts(brms_obj$fit) to get the names of Stan parameters & their sampling statements—these don’t generally match up with the brms output.

I’ll work on implementing direct handling of brms objects so that they can be passed directly.

Edit: I’d forgot I’d already implemented this for the main adjust_weights function—you can just pass a brms fit object directly to it. I’ll add this same functionality to the other functions soon.

2 Likes

I’ll definitely try this out for the next analysis!

This looks to be useful, @cmccartan.

I have so far only red the vignette, so I hope this question is not ignorant:

I wonder if one important condition for getting useful weights is that the baseline-model could successfully explore the posterior (no divergences, no problems with BMFI, …). What do you think?

The validity of the original samples is conditional on successful exploration, and since the importance sampling just re-weights the original samples, the validity of the alternative fits is also conditional on there being no problems with the original sampling.

However, even with a well-specified and well-explored original model, if the alternative models are too different, the distribution of the importance weights can still be problematic and lead to unreliable and highly variable estimates (which can be diagnosed in part with the Pareto k statistic).

So successful exploration is necessary but not sufficient.

4 Likes

Thank you for the package,

I tried it with brms negative binomial hurdle model and it flaged out an error when I coded adjust_weights (model$fit):

Error in str2lang(x) : :1:35: unexpected ‘[’
1: Y ~ hurdle_neg_binomial_log_logit([`

Is it available for all classes of brms models?

This is black magic! (in a totally good, morally acceptable way).

One question - will this method work with latent variables more generally? I.e. if you’re using the 8-school examples it works with hierarchical parameters, but am wondering whether latent quantities would interfere with the importance sampling.

Also is there an associated paper?

1 Like

Yes, it should work for all brms models! Did you provide a set of specifications (with make_spec()) to the adjust_weights function too, or just the model$fit? If you have a reproducible example that’s not working, I’m happy to dig into it to figure out what’s going on.

Yes, this will work with latent variables as well, though of course the farther away your alternate specification is from the fit model the less accurate the importance sampling estimates will be.

As far as a paper, this package is just doing plain importance sampling + pareto smoothing on the weights, so the PSIS paper is probably the most relevant citation (Vehtari, A., Simpson, D., Gelman, A., Yao, Y., & Gabry, J. (2015). Pareto smoothed importance sampling. arXiv preprint arXiv:1507.02646). The only thing the package adds to that is the ability to do this automatically with Stan models.

2 Likes

Yes I had seen that earlier paper, this still strikes me as a fairly original application of importance sampling. Perhaps though I just did not read it close enough. In any case, putting it all together in a function is very helpful!

Unfortunately, brms hurdle Poisson model did not work with the package.
My model is:

p1=c(prior_string("student_t(5,0,0.35)", class="Intercept"),
     prior("normal(0,0.2", class="b"),
     prior("student_t(5,0,0.21)", class="sd"),
     prior("logistic(-3,0.03)", class="Intercept", dpar="hu"),
     prior("normal(0,0.03", class="b", dpar="hu"),
     prior("student_t(5,0,0.03)", class="sd", dpar="hu"))

f2=brmsformula(count~ba_sqrt+dbh_log+species.p+species.m+dead_neigbour+y+(1+y|subplot_id)+(1|id.m),
               hu~ba_sqrt+species.p+y+(1+y|subplot_id)+(1|id.m), center=TRUE, decomp="QR")
fit2=brm(f2, data=d, family=hurdle_poisson(link = "log"), save_pars=save_pars(all = TRUE), cores  = 4, iter = 1000 + 4000, warmup = 1000, chains = 4, seed=123,
         sample_prior="yes",prior=p1, silent=TRUE, open_progress=FALSE, control = list(adapt_delta = 0.85, max_treedepth = 10))

extract_samp_stmts(fit2) # or
extract_samp_stmts(fit2$fit)

when I would like to extract the name of parameters, it gave me an error:

Error in str2lang(x) : <text>:1:30: unexpected '['
1: Y ~ hurdle_poisson_log_logit([
                                 ^

How to overcome this?

Seems like something in the brms model specification is throwing it off.

Could you post a reproducible example (i.e, with data & other variables defined) as an issue: Issues · CoryMcCartan/adjustr · GitHub?
If not, sharing the generated brms model code would be helpful.