Question about priors in stan

Hi to all!

Im currently working with this code in the context of cosmology. When i put a prior like this (normal) it works fine and throws me the values presents in literature:

functions {
    ...
     real[] GeodesicEquation (real z,          // Redshift
      					      real[] c,          // Comovil Coordinates
                                real[] theta,    // Parameters
                                real[] x_r,      // unused data
                                int[] x_i
    ) {
        real r = c[1];
        
        real OmegaM = theta[1];   
        real OmegaL = theta[2]; 
          
    	real dr_dz = 1/E (z, OmegaM, OmegaL);
    	return {dr_dz};
      }
  }
data {
    int < lower = 0 > N;              // number of measurement 
    real < lower = 0 > zobs[N];       // Redshifts
    real < lower = 0 > mobs[N];      // Supernovaes magnitude
    real < lower = 0 > errorm[N];     // Supernovaes error
  }

parameters {
    // real < lower = 0 > m[N];              // True magnitudes
        **real < lower = 0, **
**  upper = 1 > OmegaM;**         // Hubble local rate
        **real < lower = 0, **
**  upper = 1 > OmegaL;**         // Local density matter
                  real < lower = 0 > sigma;          // Error
            // real < lower = 0 > alpha;          // 
       Evolution parameter
        // real < lower = 0 > beta;           // Evolution parameter
     }

transformed parameters {
     real rr[N, 1] = 
    integrate_ode _rk45 (GeodesicEquation, {0.0}, 0, 
      zobs, {OmegaM, OmegaL},
                               rep_array (0.0, 0), rep_array (0, 0),
                               1 e - 5, 1 e - 3, 5 e2);
  }

model {
    **OmegaM ~ normal (0.5, 0.5);**
**    OmegaL ~ normal (0.5, 0.5);**
    sigma ~ lognormal (-1, 1);
    for (i in 1 : N) {
    	  if ((1 - OmegaM - OmegaL) < 0) {
                    
        mobs[i] ~ 
           normal (5*
             log10 ((1 + zobs[i])*
               sin (sqrt (-(1 - OmegaM - OmegaL))*rr[i, 1])/
                sqrt (-(1 - OmegaM - OmegaL))) + Ev (zobs[i]), 
           sigma); }
       	  if ((1 - OmegaM - OmegaL) == 0) {
                    
        mobs[i] ~ 
           normal (5*log10 ((1 + zobs[i])*rr[i, 1]) + Ev (zobs[i]), 
           sigma); }
             if ((1 - OmegaM - OmegaL) > 0) {
                    
        mobs[i] ~ 
           normal (5*
             log10 ((1 + zobs[i])*
               sinh (sqrt (1 - OmegaM - OmegaL)*rr[i, 1])/
                sqrt (1 - OmegaM - OmegaL)) + Ev (zobs[i]), sigma); }
            // mobs[i] ~ normal (m[i], errorm[i]);
    }
  } 

But when i comment the line OmegaM ~ normal (0.5, 0.5);
** OmegaL ~ normal (0.5, 0.5);** it gives me weird thinks. Why it happens? Cause i learnt that if i define a lower and upper value in the parameter declaration, i define the prior to be uniform. There is a problem with uniform prior in HMC?

Could you be more specific what you mean by “weird”? Do you get errors? Or do the estimates change a lot? Something else?

For the case that you say worked as expected, we’re the bounds on OmegaM also used? Because if yes, then
I agree removing the additional prior should not matter much as the difference between a uniform prior on 0,1 and normal(0.5, 0.5) truncated to 0,1 shouldn’t be a big deal in a well-behaved model.

That’s correct. Uniform priors can cause problems for HMC only if the resulting posterior is not well behaved (in which case it is almost always also a problem for any other algorithm)

Best of luck with your model!

Hi Martin!

Thanks for your response. When i said “weird things” it means that throws me completely different results than the “correct” ones. I get the correct ones when i put the aditional prior normal(0.5,0.5), if i dont and just put the lower and upper limit it doesnt work.

How can i see if mi posterior is well behaved?

Thanks again

1 Like

OK, this looks like a tough problem.

One way is to use visualisations: in R you can use mcmc_pairs from the bayesplot package, Shinystan and pairs from rstan . The main idea is that well behaved posterior looks mostly like bunch of independent normal distributions for all parameters, any deviations signal potential problems. Further reading:

But I think you are going to run into the problem that your model is quite big and will be hard to debug. One way to simplify it would be to split the ODE part and the rest of the model, i.e. build a model where you observe noisy realisations of rr (e.g. y ~ normal(rr[,1], 1) and fit it to simulated data. Then you can build another model that gets rr as data (i.e. treats it as known) and implements the rest of your model block. There might be other ways to break the model down, the point is that you try to work with much smaller pieces of code at once and use simulated data to have datasets that match the code exactly.

In fact, there is one suspicious elements of the model: the branch if ((1 - OmegaM - OmegaL) < 0) can introduce discontinuities in your posterior and/or its derivative - I don’t understand the math in your model enough to be able to tell if this is actually the case (maybe the two branches are joined smoothly?), but that could definitely introduce problems. One way to make sure this is not an issue would be to fit a restricted model, where you constraint OmegaM and OmegaL by definition to only match one of the branches (e.g. by having real<lower=0, upper = 1> Omega_L_raw; and then transformed parameter OmegaL = Omega_L_raw * ( 1 - OmegaM).
If you are able to fit each branch separately, there are then methods to combine them without introducing discontinuities.

Hope that makes at least some sense :-)

2 Likes

Hi Martin, sorry for my last response but i’ve been away from myu work and i’m now recovering.

The last part of your comment make a lot sense to me, because in fact the separations between the “if” conditional statement is not that smotth. Just to draw a simple picture, it has to be with the curvature of the space-time in which im working, and it changes due to different values of the OmegaM (matter) and OmegaL (Dark energy) values, so that the sum of curvature and those it has to be one (in fact the 1-OmegaM-OmegaL parameter is the curvature energy, changing to spherical, plane or hyperbolical).

What you said at the end make a lot sense to me but i need more information to implement it. Do you say something like constraint OmegaM+OmegaL=0,>0 and <0??? Can you give me more information how to do that and how can i match them??? It looks very interesting

Cheers and Thanks again!

The simplest case if 1 - OmegaM - OmegaL == 0 - in this setting, you have a model with one less parameter, e.g. if you take OmegaM as a parameter, you can then have OmegaL = 1 - OmegaM (in transformed parameters).

The case 1 - OmegaM - OmegaL < 0 can be handled in multiple ways. I already mentioned one of those above

Alternatively, you could also have:

real<lower=0, upper = 1> OmegaM;
real<lower= 1 - OmegaM, upper = 1> OmegaL; 

or even have:

parameters {
  real<lower=1, upper=2> OmegaSum; // OmegaM + OmegaL
  real<lower=0, upper=1> OmegaSplit; // How the OmegaSum is split between M and L
}

transformed parameters {
  real<lower=0, upper =1> MinOmega = OmegaSum - 1; //Both omegas have to be larger than MinOmega
  real<lower=0, upper=1> OmegaM =  (OmegaSum - MinOmega) * OmegaSplit;
  real<lower=0, upper=1> OmegaL = OmegaSum - OmegaM; //Equivalent but more efficient than   (OmegaSum - MinOmega) * (1 - OmegaSplit)
}

Where the last version has the property that the implied prior on OmegaM and OmegaL is symmetric - whether this is desirable obviously depends on the domain knowledge, but if OmegaM + OmegaL is an interpretable quantity (which it looks like it is), then it might work well. Hope I didn’t mess up my computations, please double check that all the versions imply the inequality and don’t leave satisfying values out.

All the versions will imply different geometry and could thus lead to different sampling behavior/performance. The goal is to find a parametrization where the posterior of the raw untransformed parameters is mostly uncorrelated.

The third conditional branch is mostly symmetric/analogoues and is left as an exercise for the reader :-D (but feel free to ask for clarifications if the above seems confusing).