# Hierarchical Bayesian model and constrained optimization in RStan?!

Hi all,
I have a complex pdf based on hierarchical Bayesian formalism where x depends on the hyperpriors w’ and w’’ as w=Gamma(beta/2,2/zeta) where Gamma is the gamma distribution with parameters beta and zeta. The resulting negative log likelihood for joint posterior pdf is in the attached picture. I want to sample both of w’, w’’, and x. My approach is to use the Hamiltonian Monte Carlo sampler. I have the following questions below:

1. I thought before that I can split the sampling process that, I sample w’ and w’’ from the gamma distribution first. And as they become known values, they can be taken out from the exponential as normalizing constants leaving only x to be sampled. As I have the constraint x>=0, this makes the posterior truncated mulitivariate Gaussian. Consequently, I thought of sampling x using the exact algorithm by Pakman and Paninski. Is this approach feasible?

2. I want to use the HMC algorithm by Stan, however I want to know if the optimizer in Stan to estimate the MAP supports the x>=0 constraint or it is for x>0 only?

3. Can you refer me to a good RStan example to train myself who to sample a complex hierarchical posteriors like the one attached?

Thanks and best regards

Notes:
Exact HMC algorithm: https://arxiv.org/abs/1208.4118
lambda, beta, zeta are given input parameters I think this will be fine coded directly in Stan (check out the manual for examples of hierarchical models and make sure to try the non-centered parameterization): https://github.com/stan-dev/stan/releases/download/v2.15.0/stan-reference-2.15.0.pdf

1 Like

If those constraints are just bounds on individual parameters, I think you’re good and just code the model up in Stan and see what happens.

But I’m not sure those quadratic/multivariate constraints in the paper are going to be okay. Stan doesn’t do multivariate constraints automatically, so if you want to have those in there you’ll need to do some parameter transforms yourself and include any necessary Jacobian adjustments.

Is there a way to achieve the effect of your constraints some other way?

Ideally in Stan we just code up our models (write down our likelihoods) and let NUTS handle the rest.

1 Like

Another option is to incorporate nonlinear equality constraints as a soft constraint through a declared probability on the observed value of some function, if f(a,b,c,d) = 0 is your logical constraint, then outside regions where a,b,c,d lead to f(a,b,c,d) ~ 0 the probability associated with that a,b,c,d vector should be vastly downweighted.

This is exactly what you get when you take a prior p(a,b,c,d) and multiply it by a likelihood p( data | a,b,c,d) for observed data. Instead of observed data you can use calculated “data” f(a,b,c,d) and then downweight those regions of a,b,c,d space where the constraint isn’t approximately met.

f(a,b,c,d) ~ normal(0,0.001)

There are cases where this can be effective. Stan will complain about nonlinear transform on the left hand side, but in this case you are intending to induce a distortion on the a,b,c,d space, and there is no Jacobian correction (no such correction is even possible as f is a univariate function of a multivariate input, the determinant of the jacobian matrix is zero).

Use with caution, it can induce strange nonlinear correlations (in fact, that’s the point) but those can themselves be hard to sample. Nevertheless, it can be hugely useful and in my experience Stan often does fine with it.

I’d suggest just writing the density directly and not worrying about point estimation with MAP (which rarely works for hierarchical models, which is why you need special approaches like lme4 or INLA).

If you have constrained parameters, you ned to declare them.

The RStan vignettes are good to learn how to use RStan. The manual for Stan covers a range of regression models and hierarchical versions. There are also a couple case studies on hierarchical modeling on our web site under documentation.

We don’t support the Pakman and Paninski approach in Stan.