Defining an "implicit" prior on an "implicit" manifold in rstan

Hello, I am an rstan novice, so I apologize for my lack of background. I’ve looked over the documentation and this topics posted here, and I can’t seem to find answers to my questions. I’m sorry if I’ve missed the appropriate documentation.

I am interested in defining a prior and a likelihood function for a Bayesian MCMC analysis using rstan. At the core of the likelihood function is a sum of two dependent, scaled beta distributed random variables, the beta distributions are themselves parameterized in terms of their mean and variance (rather than the typical shape1 and shape2 parameters), and the mean and variance of each beta distribution are then defined in terms of polynomials that are functions of the observed data. The polynomial coefficients are the core parameters of the model.

I have a good sense of what the beta mean and variance values should be based on past experience. So, I’ve set up prior distributions on the mean and variance of the beta distributions. I’m currently using uniform priors with specified min and max, but they don’t need to be uniform. I’ve also set Boolean constraints (indicators) so that (1) the beta var < mean(1-mean), (2) the corresponding beta shape parameters are positive, and (3) the means and variances are within the bounds set by the scaled beta distributions. When the constraints are violated, the prior is 0.

I know nothing else about the coefficients, so there is no explicit prior defined for them. (Well, I suppose it is an improper uniform.) Clearly, however, given the prior on the beta distributions’ means and variances that are defined in terms of the polynomial’s coefficients, there is an “implicit” joint prior distribution associated with the coefficients.

I’ve written a home-spun R program that implements the above likelihood and prior distributions, and I’ve coded a simple random-walk MCMC to sample from the posterior. The problem is that the MCMC is very inefficient (high autocorrelation, slow run time due to numerically evaluating a quadruple integral in the likelihood function). Based on estimates of ESS’es using short MCMC runs, the runtime of my MCMC will be 2-3 weeks.

So, I am looking to implement this model using a more powerful package, such as rstan, whose methods might reduce the autocorrelation, improve mixing, and reduce overall runtime.

More formally, if the “response” observations are Y, the independent variables are X, the beta parameters are B, the polynomials’ coefficients are C, and the polynomials are functions of X and C (e.g., for first degree polynomial, an intercept + a dot product of X and C). Based on the above description, the set up I’m using is a fairly standard hierarchical model:

Likelihood: Y | B,C,X ~ P(Y|B,C,X) function of sum of two dependent, scaled beta distributions
Prior “stage II”: B | C,X ~ P(B | C, X) uniform priors on the betas’ means and variances and feasibility constraints
Prior “stage III”: C | X ~ P(C|X) improper uniform (implicit)
Prior “stage IV”: X ~ P(X) improper uniform (implicit, but the independent variables are themselves correlated and might best be modeled using a multivariate normal)

Applying Bayes rule and requiring the improper uniforms to “cancel out” in the MCMC acceptance ratio results in the posterior:

Posterior: P(B,C,X | Y) is proportional to P(Y|B,C,X) * P(B | C, X)

So, here are my questions:

  1. How would one implement the above model in rstan?

Where I’m running into trouble is how to handle the “implicit” priors on C and X, the independent data definition, and the constraints (indicator variables) in the model. Any suggestions or tips about rstan model specification would be greatly appreciated!

  1. Would you recommend using explicit priors for C and X, and if so, which functional form should they take?

Thank you very much for your help!

Sorry it has taken us so long to answer. These long modeling questions don’t get a lot of love as they require a long time to understand and answer.

I’d recommend cmdstanr or cmdstanpy—they’re both up to date with current versions of Stan and easier to install than rstan or pystan.

This is tricky as there’s only a limited range of variances possible for a beta distribution given its mean and you have to make sure you’re respecting that constraint. There’s a discussion in the user’s guide about parameterizing beta(a, b) in terms of mean (a / (a + b)) and total scale (a + b), which is much better behaved than the dependence you get in the ordinary parameterization.

This is particularly tricky in constrained situations where you’re likely to get probability mass bumping up against the boundaries, which might not have the effect you intend—it will bias the posterior means away from the boundaries in a stronger way than if the boundaries were wider.

If you try this in HMC, what happens is that it will degenerate to rejection sampling if the constraints get violated. Instead, you want to roll them into constraints on the variance or on the mean.

I didn’t understand the model description well enough to give you a more specific answer. The general answer is always that you have to code the likelihood and prior in the model block and code any constraints on the parameter declaration or with explicit transforms.

If you have specific questions about how to do that, those kinds of questions get answered quickly.