How to use a prior from a sample?

Hi,

I want to fit a model with 20 parameters. For 5 of them I want to use a prior drawn from a 5D sample (which I got from a previous fit). I emphasize that my priors are not simple analytical functions; rather, I have a 5D sample with shape, say, (5 x 10000). How do I do this?

Can I simply use something like the following?

data {
  // some data
  matrix[5, 10000] prior;
}

parameters {
  real theta[20];
}

model {
  int chosen_index;
  chosen_index = uniform(0, 10000);
  for (i in 1:5) {
    theta[i] = prior[i, chosen_index];
    // or should it be theta[i] ~ prior[i, :] ?
  }
  for (i in 6:20) {
    theta[i] ~ uniform(-100, 100); // uniform prior on the rest of the parameters
  }

  // Likelihood model for your observed data
}

Thanks in advance.

You need to come up with a 5-dimensional parametric function that is approximately proportional to the prior implied by your samples.
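As a rough illustration of the parametric route, here is a minimal sketch that assumes a single multivariate normal is an adequate approximation to the 5D sample. That is an assumption for illustration only; the sample may well need something more flexible. The names `mu_prior`, `Sigma_prior`, `theta_inf`, and `theta_flat` are hypothetical, and the mean and covariance are assumed to be computed from the previous draws outside Stan and passed in as data.

```stan
data {
  int<lower=1> K;               // parameters with a sample-based prior (5 here)
  int<lower=0> J;               // remaining parameters (15 here)
  vector[K] mu_prior;           // assumed: sample mean of the previous draws
  cov_matrix[K] Sigma_prior;    // assumed: sample covariance of the previous draws
  // ... observed data for the new fit
}
parameters {
  vector[K] theta_inf;                          // gets the approximated prior
  vector<lower=-100, upper=100>[J] theta_flat;  // bounds imply uniform(-100, 100)
}
model {
  // Parametric stand-in for the prior implied by the samples.
  theta_inf ~ multi_normal(mu_prior, Sigma_prior);

  // ... likelihood for the observed data
}
```

Declaring the bounds on `theta_flat` is what gives those parameters a flat prior on (-100, 100); no explicit `uniform` statement is needed for them.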

Alternatively, you could try to write down a model that fits your previous model (that yielded your samples) and your current model jointly, in effect substituting the previous model itself in place of the parametric function.

I want to go with the first option; the second would be too difficult. Can I estimate the joint PDF using, say, a KDE within Stan?

I don’t necessarily recommend this approach, but if you really want to, you can pass all of the prior samples as data, write a function that computes the density at an arbitrary point in parameter space using your chosen kernel, and then increment the target by the logarithm of this density.
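To make that concrete, here is a minimal sketch assuming a Gaussian product kernel with one bandwidth per dimension. The kernel choice, the function name `kde_lpdf`, and the data names `draws` and `h` are assumptions for illustration; the bandwidths would have to be chosen outside Stan.

```stan
functions {
  // Log density at x of a Gaussian product-kernel KDE.
  // draws is K x S (one column per prior draw); h holds per-dimension bandwidths.
  real kde_lpdf(vector x, matrix draws, vector h) {
    int S = cols(draws);
    vector[S] lp;
    for (s in 1:S) {
      // Vectorized normal_lpdf sums over the K dimensions of one kernel.
      lp[s] = normal_lpdf(x | col(draws, s), h);
    }
    // Equal-weight average of the S kernels, computed on the log scale.
    return log_sum_exp(lp) - log(S);
  }
}
data {
  int<lower=1> K;            // dimension of the sampled prior (5 here)
  int<lower=1> S;            // number of prior draws (10000 here)
  matrix[K, S] draws;        // prior draws from the previous fit
  vector<lower=0>[K] h;      // assumed: bandwidths chosen beforehand
  // ... observed data for the new fit
}
parameters {
  vector[K] theta_inf;
}
model {
  target += kde_lpdf(theta_inf | draws, h);

  // ... likelihood for the observed data
}
```

Note that every evaluation of the log density loops over all S draws, so with 10000 draws this is likely to be slow, which is one reason to be cautious about the approach.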

If a multivariate normal mixture is OK for you (which really should work), then you can use the mixstanvar functionality from RBesT to take a multivariate normal mixture previously fitted to your sample and apply it as a prior in a brms model… see

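For anyone working in plain Stan rather than brms, a multivariate normal mixture prior like the one suggested above can be sketched as follows. This is not how RBesT or brms implement it, just a hand-written illustration; the names `w`, `mu`, `Sigma`, and `theta_inf` are hypothetical, and the mixture is assumed to have been fitted to the sample beforehand, outside Stan. It uses the `array[...]` syntax, so it needs a reasonably recent Stan version.

```stan
data {
  int<lower=1> K;                  // dimension of the sampled prior (5 here)
  int<lower=1> C;                  // number of mixture components
  simplex[C] w;                    // assumed: fitted mixture weights
  array[C] vector[K] mu;           // assumed: fitted component means
  array[C] cov_matrix[K] Sigma;    // assumed: fitted component covariances
  // ... observed data for the new fit
}
parameters {
  vector[K] theta_inf;
}
model {
  vector[C] lp;
  for (c in 1:C) {
    // Log of weight times component density.
    lp[c] = log(w[c]) + multi_normal_lpdf(theta_inf | mu[c], Sigma[c]);
  }
  // Mixture log density, marginalizing over the component label.
  target += log_sum_exp(lp);

  // ... likelihood for the observed data
}
```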

Wouldn’t it be consistent with Bayesian updating to just add the sample you want to use as a prior to your dataset? It’s probably not the same in terms of how Stan handles it in the implementation, but I think the posterior should be the same.

Edit: I misread your initial post.
You can generate new samples based on your parameter samples from the posterior predictive distribution and add those to your dataset. I am not entirely sure how large the sample would have to be though.