A Stan model such as the one below compiles and runs, but it doesn't produce the right density for the single parameter.
transformed data {
  vector[1000] x;
  for (i in 1:1000) {
    x[i] = normal_rng(0, 1);
  }
}
parameters {
  real y;
}
model {
  for (i in 1:1000) {
    if (x[i] > 0) {
      y ~ normal(10, 2);
    } else {
      y ~ gamma(3, 3);
    }
  }
}
The problem seems to be that you declare y with only one dimension (real y), so when you run the model for multiple iterations you get the sampling distribution of the single value of y. The following code will replicate your undesired figure:
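As a sketch of what that looks like, here is a plain Python stand-in for the model above (a hand-rolled random-walk Metropolis sampler instead of Stan's HMC; the normal and gamma kernels drop their normalizing constants):

```python
import math
import random

random.seed(2)
# x plays the role of Stan's transformed data block.
x = [random.gauss(0, 1) for _ in range(1000)]

def log_post(y):
    # The model block adds one log-density increment per x[i] to the SAME y.
    if y <= 0:
        return -math.inf
    lp = 0.0
    for xi in x:
        if xi > 0:
            lp += -0.5 * ((y - 10) / 2) ** 2   # normal(10, 2) kernel
        else:
            lp += 2 * math.log(y) - 3 * y      # gamma(3, 3) kernel
    return lp

# Random-walk Metropolis standing in for Stan's sampler.
y, draws = 2.0, []
for _ in range(5000):
    prop = y + random.gauss(0, 0.1)
    if math.log(random.random()) < log_post(prop) - log_post(y):
        y = prop
    draws.append(y)

print(sum(draws) / len(draws))  # concentrates near y = 2: one narrow mode
```

A histogram of `draws` gives a single tight peak rather than two separated modes, which is the undesired figure.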
Being able to estimate a parameter with a mixture density is quite different from generating mixed-density data. Your code is doing the latter.
I am not sure about what I am going to write, so take everything I say with a grain of salt.
I think what you are trying to achieve might not be possible in this stylized set-up. Because you have 1000 values of x and only one y, you are effectively averaging over the two distributions. You can think of it as: for x[1], y is drawn from the normal; for x[2], y is drawn from the gamma (I know this is not how HMC/Stan works); and so forth. Each pass through the loop adds another log-density increment to the same y, so the posterior is a product of roughly 500 normal terms and 500 gamma terms, not a mixture. So the best guess for y is some kind of compromise between normal(10, 2) and gamma(3, 3), which gives you the Rplot.pdf.
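To see the compromise concretely, here is a small Python sketch (assuming exactly half of the x's land above 0, so k = 500) that grid-searches the unnormalized log posterior the original model actually defines:

```python
import math

# Unnormalized log posterior of the original model: k normal(10, 2) terms
# and (n - k) gamma(3, 3) terms, all attached to the same y.
# Normalizing constants are dropped.
def log_density(y, k=500, n=1000):
    log_norm = -0.5 * ((y - 10) / 2) ** 2   # normal(10, 2) kernel
    log_gamma = 2 * math.log(y) - 3 * y     # gamma(3, 3) kernel
    return k * log_norm + (n - k) * log_gamma

ys = [i / 1000 for i in range(1, 15001)]    # grid on (0, 15]
mode = max(ys, key=log_density)
print(mode)  # 2.0: between the gamma (mean 1) and the normal (mean 10)
```

Setting the derivative to zero gives y^2 + 2y - 8 = 0, i.e. y = 2 exactly, so the posterior piles up at a single compromise value.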
If you want the nice separation of the distributions, you need to have the same number of x's and y's, just like @tiagocc wrote. Then for x[1], y[1] is always drawn from the normal; for x[2], y[2] is always drawn from the gamma; and so on. As a result, half of the y's will come from the normal and half from the gamma, and you get the separate distributions in the histograms. The original Stan model basically estimated the mean of a vector y with N elements.
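A quick Python simulation of that pairing (note that Python's `gammavariate` takes a *scale* parameter, so rate 3 becomes scale 1/3; the cut at y = 5 is just a convenient split between the two well-separated modes):

```python
import random

random.seed(1)
n = 1000
x = [random.gauss(0, 1) for _ in range(n)]

# One y per x: normal(10, 2) when x[i] > 0, gamma(shape=3, rate=3) otherwise.
y = [random.gauss(10, 2) if xi > 0 else random.gammavariate(3, 1 / 3)
     for xi in x]

high = [yi for yi in y if yi > 5]   # essentially the normal draws
low = [yi for yi in y if yi <= 5]   # essentially the gamma draws
print(len(high), len(low))          # roughly a 50/50 split
```

A histogram of `y` now shows the two separate humps, one near 1 and one near 10.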
In the spirit of what @stijn said, I am not totally sure this is the correct approach to your problem, but if the goal is to estimate a single parameter, I guess the solution would be to sample it from a finite mixture of the gamma and normal distributions you want, with a mixing proportion lambda:
transformed data {
  real lambda = 0.5;
}
parameters {
  real<lower = 0> y;
}
model {
  target += log_mix(lambda,
                    normal_lpdf(y | 10, 2),
                    gamma_lpdf(y | 3, 3));
}
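For reference, Stan's `log_mix(lambda, lp1, lp2)` is `log(lambda * exp(lp1) + (1 - lambda) * exp(lp2))`; here is a small Python version computed stably via log-sum-exp:

```python
import math

# Stable log_mix: log(lam * exp(lp1) + (1 - lam) * exp(lp2)).
def log_mix(lam, lp1, lp2):
    a = math.log(lam) + lp1
    b = math.log1p(-lam) + lp2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

# Sanity check against the direct formula at a safe scale.
direct = math.log(0.5 * math.exp(-1.0) + 0.5 * math.exp(-2.0))
print(abs(log_mix(0.5, -1.0, -2.0) - direct) < 1e-12)  # True
```

The log-sum-exp trick matters in Stan because the component log densities can be very negative, and exponentiating them directly would underflow.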
I define lambda as 0.5, since the cdf of the standard normal at 0 is \Phi(0) = 0.5, but you could instead generate the samples from the standard normal and compute the ecdf at 0 in the transformed data block.
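The ecdf check is a one-liner outside Stan; a Python sketch:

```python
import random

random.seed(0)
x = [random.gauss(0, 1) for _ in range(100000)]
lam = sum(xi <= 0 for xi in x) / len(x)  # empirical cdf at 0
print(lam)  # close to Phi(0) = 0.5
```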
If you plot the posterior samples of y you’ll get something close to what I think you want.