Having problem with initializing parameters in (Matlab) Stan


#1

Hi folks,

I have a 2D bimodal posterior distribution and I’m trying to sample from this posterior.
(After reading few comments/feedback, I figured out the NUTS/HMC cannot sample from both modes and there is a problem in jumping from one mode to another which can be related to energy barrier… I believe i need to implement the other methods like parallel tempering in this regard OR one empirical solution is manually restarting the generation from other start point, hoping to get stuck at the other minimum)

With all that being said, I’d appreciate it if you could let me know whether I’m doing something wrong based on Stan format or not.

I used 8 chains and i want to have 8 different initial values for each of them.
BUT base on the figures, it looks all of them start from one point…I checked out the warm up samples and it confirms that all starts from similar value.
For example: How can i tell it to be initialized from an exact point like x1 = -2, x2 = 6?
I’d highly appreciate your help.
Here is my Stan and Matlab code:

> data{
> vector[2] mu;
> matrix[2, 2] Sigma;
> }
> parameters {
>   vector[2] x;
> }
> transformed parameters {
>   real g = 5 - x[2] -0.5*(x[1] - 0.1)^2;
> }
> model {
> //prior
>     x ~ multi_normal(mu, Sigma);
> //likelihood  
>      target += normal_lpdf(g | 0, 0.15);
>      if (g > 0)
>      target += negative_infinity();
>      else
>      target += -normal_lcdf(0 | 0, 0.15);
> }

[edit: escaped Stan code]

Matlab code:

%%–Initialize chains with different inits using structs
init(1) = struct(‘x’,7ones(1,2));
init(2) = struct(‘x’,6
ones(1,2));
init(3) = struct(‘x’,5ones(1,2));
init(4) = struct(‘x’,4
ones(1,2));
init(5) = struct(‘x’,[-5 5].*ones(1,2));
init(6) = struct(‘x’,[-2 7].*ones(1,2));
init(7) = struct(‘x’,[-1 5].ones(1,2));
init(8) = struct(‘x’,8
ones(1,2));

params = struct(‘file’,‘TruncBimodal.stan’,‘data’,dat,‘chains’,8,…
‘iter’,10000,‘warmup’,1000,‘algorithm’,‘NUTS’,‘inc_warmup’,…
true,‘verbose’,true);
fit = stan(params,‘chains’,8,‘init’,init);

I attached the plot to show how it looks like.
Thank you in advance for your time and consideration!
Hamed


#2

I’m not sure how to do it in MatlabStan, but you should be able to provide an initialization. Usually a well-formed Stan model doesn’t need to be provided with an initial value—it’s good at finding the typical set itself.

The problem you’re having is probably here:

You don’t want to do this kind of rejection with Stan. Instead, you want every legal value of the parameters according to the constraints to have a finite log density.

What you need to do here is define x with the proper constraints.

parameters {
  real x1;
  real<lower = ?, upper = > x2;
}

where your lower and/or upper bounds need to ensure

5 - x[2] -0.5 * (x[1] - 0.1)^2 > 0

or in other words,

5 - 0.5 * (x[1] - 0.1)^2 > x[2]

so the declaration you need is

parameters {
  real x1;
  real<upper = 5 - 0.5 * (x1 - 0.1)^2> x2;
}

You can then drop the whole conditional—you won’t ever hit the negative infinity case and the LCDF is adding just a constant. So the whole thing can be coded as

g ~ normal(0, 0.15);

To make it efficient, you want to Cholesky factor the covariance matrix in the transformed data block and reuse it in the prior in multi_normal_cholesky.


#3

Dear Dr. Carpenter,

Thank you very much for your comments.
It helped me a lot in better understanding the Stan capability.

Here is my Stan code but i still get the following error:

compile: SYNTAX ERROR, MESSAGE(S) FROM PARSER:
compile: duplicate declaration of variable, name=x; attempt to redeclare as parameter; original declaration as parameter
compile: Problem with declaration.

Stan code:

data{
vector[2] mu;
matrix[2, 2] Sigma;
}
parameters {
real x[1];
real<lower = 5 - 0.5 * (x[1] - 0.1)^2> x[2];
}
transformed parameters {
real g = 5 - x[2] -0.5*(x[1] - 0.1)^2;
}
model {
//prior
x ~ multi_normal(mu, Sigma);
//likelihood
g ~ normal(0, 0.15);
}

I think the error comes from the way parameter “x” is defined. (At this step, i didn’t include the Cholesky factor for simplification)

Regards,
Hamed


#4

This is your error message

and here in your code you have the duplicate declaration of variable with name x
you need to use different names like x1 and x2, and then in transformed parameters you can combined to a vector for the later use in model block


#5

Thank you very much for your complete answer.
Oops, Yes, you are right.
It was a dump question.
Thanks a lot.
So, this is the revised one. Do you mean the following correction?

data{
vector[2] mu;
matrix[2, 2] Sigma;
}
parameters {
real x1;
real<lower = 5 - 0.5 * (x1 - 0.1)^2> x2;
}
transformed parameters {
vector[2] x;
real g = 5 - x2 -0.5*(x1 - 0.1)^2;
x[1] = x1;
x[2] = x2;
}
model {
//prior

x ~ multi_normal(mu, Sigma);

//likelihood

g ~ normal(0, 0.15);

}

Based on the way i coded, Are “x1” and “x2” Multivariate Normal now? (still have lack of familiarity with syntax in Stan)

I’d appreciate your help.
Hamed


#6

Also, i double checked my code again and I have another question in this regard.
Do i need to add the Jacobin transformation for the g?

g ~ normal(0, 0.15);
target += sqrt(0.5)/(2*(sqrt(g - 5 + x2)));

When i add it and then running the code, I get the following error:

Rejecting initial value:
Log probability evaluates to log(0), i.e. negative infinity.
Stan can't start sampling from this initial value.

which means the initial value is not correct.
I’d gratefully appreciate it if you could let me know what my mistake is.
Is my det(Jacobian) correct?

Any help would be appreciated.
Thank you.

Hamed


#7

Is that target increment supposed to be for the Jacobian? If so, it should be the log of the absolute derivative of the transform to g.


#8

Thanks a lot for the point on Jacobian transformation,

Regards,
Hamed