Bimodal prior (info from two previously published models)

Hello, I have the following (test) data.

stan_data <- list(
  N = 25,
  Y = c(0.8333333,0.6000000,0.6000000,1.5000000,1.4166667,0.0500000,0.5500000,2.0000000,
  x = c(12.9000,5.6000,1.0000,0.9000,33.7500,11.5000,18.1500,15.3500,13.6500,18.0000,16.2500,

There are two published models for similar data with the non-linear form y = exp(b0 + b1 * sqrt(x) + b2 * x).

Say that model 1 shows a coefficients for b0 as Normal(-10,2) and model 2 as Normal (-12,1). My questions are:

  1. Is it correct to use both priors and give them the same probability? Should I take into account the sample size of those studies versus mine? (theoretical)
  2. How do you actually include a bimodal prior? (technical)

Here is the code for my new model including only one study as prior.

data {
  int<lower=0> N; 
  real x[N]; 
  real<lower=0> Y[N]; 
parameters {
  real b0; 
  real b1;  
  real b2;  
  real<lower=0> beta; 
transformed parameters {
  real m[N];
  for (i in 1:N) 
    m[i] = exp(b0 +b1*sqrt(x[i]) +b2*x[i]);
model {
  // priors taken only from published model 1
  b0 ~ normal(-10, 2);  
  b1 ~ normal(0.2, .5); 
  b2 ~ normal(-.05, .02); 
  // likelihood
  Y ~ gamma(m, beta);   

Here I found an example for plotting bimodal priors but I do not know how to integrate it with my model.

 data { }
 parameters {
   real mu;
 transformed parameters { }
 model {
   target += log_sum_exp(normal_lpdf(mu|-10,2),normal_lpdf(mu|-12,1));
1 Like

It may or may not be. Sample size isn’t the only factor affecting the quality of the estimate; it could be more appropriate to take the sample size into consideration when considering the variance of the prior for each of these estimates. There are many assumptions you can make about results like these, and how you incorporate them into your priors will depend on what can be justified as objectively as possible.

That said, and this may be a personal philosophical opinion others may disagree with, I don’t think there is any justification for including a bimodal prior (and not just because it could cause technical issues in practice). From the point of view of a model structure, bimodality or nonidentifiability do not represent a parameter taking on mutually exclusive sets of values, but our inability to identify the "correct "combination.
It is certainly possible that there is a kind of parameter contributing in two different-but-related ways to the model output, but I’d argue that this would be better described as a random effect or a hierarchical structure – this could also explain the bimodality of the prior knowledge: these are most likely individual-level instances of a group-level parameter. In sum, there are many reasons why independent estimates will have different estimates, but I cannot think of a scenario where a parameter distribution should be assumed to be bimodal.

If you decide to do this, you could probably write a function for a custom distribution and used that in the model block. I’m not sure you can use something more “elegant” like theta ~ 0.5*normal(0,1) + 0.5*(2,1), but then again I’m not sure when that would be needed.

1 Like

Thanks this is a nice explanation. I think I will create a single prior as mean of the two previous models and see how it goes with that.