# Composing Stan models (posterior as next prior)

Thanks @maxbiostat ! I’ll roll that paper and the code into our working group to study.

@Ara_Winter - Side note, here’s a potentially relevant paper: https://osf.io/jrjd2/ - “Cumulative science via Bayesian posterior passing, an introduction.”

1 Like

You set a_0, however, you could also estimate it. In your experience, what is typically done?

You may also want to follow Omega.jl where composability is a main feature (i.e. condition on predicates). It looks to be quite experimental and the author recommends using Stan if your model is expressible in Stan.

Typically, a_0 is fixed. But you’re absolutely right it can be estimated. You just need to be careful about the normalising constant: c(a_0) = \int_{\Theta} L(D \mid \theta)^{a_0}\pi(\theta)d\theta.

See this note for discussion and examples. The paper I linked above also has a section with discussion on fixed versus random a_0.

1 Like

Note that the power prior approach also suffers from this information leakage. For a few models, we know how to set a_0 such that some “information-optimality” is satistified (see this paper).

Hi,
I am having a similar issue.
I am estimating the matrix A in a model where y = Ax + e with e an error term, and x an y given vectors (len(x)<len(y)). I have several (x;y) couples and would like to iterate the estimation on each couple using as prior of the i+1-th iteration the posterior of the i-eth iteration. If anyone has an idea I would very much appriciate.

My Stan model writes as follows for the first iteration:

  data{
vector y; //NACE Rev 2 vector
vector x; //NACE Rev 1 vector
//A priori dirichlet parameters
vector c01;
vector c02;
vector c03;
vector c04;
vector c05;
vector c07;
vector c08;
vector c09;
//Likelihood parameter (the 1 left out of the estimation)
real a06;

}

parameters{
simplex a1;
simplex a2;
simplex a3;
simplex a4;
simplex a5;
simplex a7;
simplex a8;
simplex a9;

}

model{
a1~dirichlet(c01);
a2~dirichlet(c02);
a3~dirichlet(c03);
a4~dirichlet(c04);
a5~dirichlet(c05);
a7~dirichlet(c07);
a8~dirichlet(c08);
a9~dirichlet(c09);

y ~normal(a1*x,0.1);
y ~normal(a1*x+a2*x+a5*x+a8*x+a9*x,0.1);
y~normal(a3*x,0.1);
y ~normal(a1*x+a2*x+a3*x+a9*x,0.1);
y ~normal(a2*x+a8*x,0.1);
y ~normal(a5*x,0.1);
y ~normal(a1*x+a5*x+a7*x,0.1);
y ~normal(a06*x,0.1);
y ~normal(a2*x+a7*x+a8*x+a9*x,0.1);
y ~normal(a8*x,0.1);
y ~normal(a7*x+a8*x+a9*x,0.1);
y ~normal(a4*x+a7*x+a8*x+a9*x,0.1);
y ~normal(a1*x+a2*x+a5*x+a7*x+a8*x+a9*x,0.1);
}

Hi!
Just to pick this discussion up, I am trying to reproduce the power prior with ‘random’ a0 for a simple regression example. It works with the unnormalized version. I am unsure, however, how to include the normalizing constant c(a0) as mentioned by max. Currently, I am thinking along the lines of specifying the power prior as in max’s code, using the posterior distribution with the bridgesampling-package to calculate the marginal likelihood, and using this in the regression setup.

Perhaps somebody has another idea or already implemented the normalized power prior in Stan?

Any help/suggestions appreciated!

Thanks,
Chris

1 Like

Depending on the model you might be able to write c(a_0) in closed form. Otherwise, bridge sampling is a good way to go. I have.a manuscript in the works that deals with this. PM me if you are interested.

Hi max,
yes, I’d be interested! It seems I cannot send you a PM (the composer throws an error, ‘you cannot send a PM to that user’), so I reply here.

My first example would be a simple regression model, but later on it will become more complicated (mostly SEMs). I’ve attached a small example where I used your code with the bridgesampling package. Perhaps you could have a look if I am on the right track…

Thank you very much!
Best,
Chris
test_normalizedPP.R (2.2 KB)

1 Like

Cool. I may move this conversation to email, since this is an ongoing project with collaborators and I’m not sure I can publicise the details yet. But one thing to notice is that

data{
...
real<upper=0> C;
}
model{
target += a0 * normal_lpdf(y0 | X0*beta, sigma ) / C;
...
}


won’t work because C depends on a_0. What you need is a function that approximates c(a_0) which is the topic I’ve been working on.

Ah, I see! I knew there was a catch :)
Sure, you can reach me at koenig.chris@gmail.com .

Looking forward to hearing from you!

Best
Chris

I have related question. I apologise in adavance if this might be based on misconceptions but is it possible to just use the posterior mean and SD and use them as the mean and SD for a normal prior or other fitting a student’s t distribution and use those for priors? If so, is there a different between regression models with random effects (e.g. subjects), with more than one predictor (e.g. x and x*x or x and z)? Do you need to extract more than just the fixed effects to make this work? This obviously assumes that the respective posterior distributions are well captured by the normal or student’s distribution.

1 Like

Hi JAQent,

yes it is. For the scale, however, I would use a measure of precision of the estimate in question. If there are more than one predictors, it boils down to the question of similarity between the previous and current study.

Best
Chris

1 Like

Picking up on this again, I’m wondering whether anyone has experience in implementing the Power prior for mixed effects models, where both the current and historical data are modelled using hierarchical models.The method is outlined in section 4 of https://projecteuclid.org/download/pdf_1/euclid.ss/100921267 - it involves specifying a power prior which is the integral of the historical data likelihood (multiplied by the density of the random effects) with respect to the random effects parameters, which I can’t figure out how to implement in Stan.

Any thoughts would be much appreciated
Ben

1 Like

The link seems broken. I’m up for doing it, if you can open another thread just to keep things on topic.

1 Like

Thank you - Power Prior for a Hierarchical Model in Stan

Hi everyone,
@bgoodri where do I can find more references for simplex and constrained multivariate distribution?
I am interested in talking in one polytope class that I have.

Thanks