Mixed-Density Parameter

A Stan model such as the one below compiles and runs, but it doesn't produce the right density for the single parameter.


transformed data {
  vector[1000] x;
  for (i in 1:1000) {
    x[i] = normal_rng(0, 1);
  }
}
parameters {
  real y;
}
model {
  // each pass through the loop adds another term to the log density of the same scalar y
  for (i in 1:1000) {
    if (x[i] > 0) {
      y ~ normal(10, 2);
    } else {
      y ~ gamma(3, 3);
    }
  }
}

Here is the density plot from Stan sampling:
[Attachment: Rplot.pdf (8.8 KB)]

This is what is expected:
[Attachment: Rplot01.pdf (10.4 KB)]

The problem seems to be that you declare y as a single scalar (real y), so when you run the model for multiple iterations you'll get the sampling distribution of that single value of y. The following code will replicate your undesired figure:

library(rstan)
options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE)
library(tidyverse)

mJeet <- stan_model("path_to_model/mJeet.stan")

fit <- sampling(mJeet,
                refresh = 1e5)

as.data.frame(fit, pars = "y") %>% 
  ggplot(aes(x = y)) + 
  geom_histogram(bins = 80)

[Figure: undesired histogram]
By moving everything to the generated quantities block and correctly specifying the dimensions of y, you can obtain your desired distribution:


data {
  int N; // sample size
}

generated quantities {
  vector[N] x;
  vector[N] y;

  for (i in 1:N) {
    x[i] = normal_rng(0, 1);
    if (x[i] > 0) {
      y[i] = normal_rng(10, 2);
    } else {
      y[i] = gamma_rng(3, 3);
    }
  }
}

mDiscourse <- stan_model("path_to_model/mDiscourse.stan")

fit <- sampling(mDiscourse,
                data = list(N = 1e3),
                iter = 1,
                chains = 1,
                algorithm = "Fixed_param")

as.data.frame(fit, pars = "y") %>%
  gather() %>%
  ggplot(aes(x = value)) +
  geom_histogram(bins = 80)


The question of being able to estimate a parameter with a mixture density is quite different from generating mixed-density data. Your code is doing the latter.

Sorry for the misunderstanding then… If you want to estimate y, you need to change two things in your model:

  • declare the parameter with its dimensions and constrain it to be greater than 0: real<lower=0> y[N];

  • add the index i to both y's in the for loop in the model block.

data {
  int N;
}

transformed data {
  vector[N] x;
  for (i in 1:N) {
    x[i] = normal_rng(0, 1);
  }
}

parameters {
  real<lower=0> y[N];
}

model {
  for (i in 1:N) {
    if (x[i] > 0) {
      y[i] ~ normal(10, 2);
    } else {
      y[i] ~ gamma(3, 3);
    }
  }
}

After fitting the model and looking at one of the sampled data sets, everything seems to be in order.


dstan <- list(N = 1e3)

mJeet_fix <- stan_model("path_to_model/mJeet_fix.stan") # compile the corrected model above

fitJeet <- sampling(mJeet_fix,
                    data = dstan,
                    chains = 1)

dpred <- as.matrix(fitJeet, pars = "y") %>%
  t() %>% as.data.frame()

colnames(dpred) <- 1:dstan$N

dpred %>% gather(rep, value, 1:dstan$N) %>%
  filter(rep == 1) %>%
  ggplot(aes(x = value)) +
  geom_histogram(bins = 100)

[Figure: desired histogram]

Not quite there yet. The above histogram isn't the posterior density of a single parameter.

I am not sure about what I am going to write, so take everything I say with a grain of salt.

I think what you are trying to achieve might not be possible in this stylized set-up. Because you have 1000 values for x and only 1 y, you are effectively averaging over the two distributions. You can think of it as: for x[1], y is drawn from the normal; for x[2], y is drawn (I know this is not how HMC/Stan works) from the gamma; and so forth. So the best guess for y is some kind of average over the normal(10, 2) and the gamma(3, 3), which gives you the Rplot.pdf.
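
To make that concrete: every sampling statement inside the loop adds a term to the same log density, so the original model defines a product of densities for the single y, not a mixture:

\log p(y \mid x) = \text{const} + \sum_{i:\, x_i > 0} \log \text{normal}(y \mid 10, 2) + \sum_{i:\, x_i \le 0} \log \text{gamma}(y \mid 3, 3)

With roughly 500 terms from each component, this product concentrates sharply on a compromise value between the two modes, which is what Rplot.pdf shows.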

If you want the nice separation of the distributions, you need the same number of x's and y's, just like @tiagocc wrote. Then for x[1], y[1] is always drawn from the normal; for x[2], y[2] is always drawn from the gamma; and so on. As a result, half of the y's will come from the normal and half from the gamma, and you get the separate distributions in the histograms. The original Stan model basically estimated the mean of the vector y with N elements.

In the spirit of what @stijn said, I am not totally sure this is the correct approach to your problem, but if the goal is to estimate a single parameter, I guess the solution would be to sample it from a finite mixture of the gamma and normal distributions you want, with a mixing proportion lambda:

transformed data {
  real lambda = 0.5; // mixing proportion: P(x > 0) under a standard normal
}

parameters {
  real<lower=0> y;
}

model {
  target += log_mix(lambda,
                    normal_lpdf(y | 10, 2),
                    gamma_lpdf(y | 3, 3));
}

I set lambda to 0.5, since the cdf of the standard normal at 0 is \Phi(0) = 0.5, but you could also generate the samples from the standard normal and compute the ecdf at 0 in the transformed data block.
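
For example, something along these lines would compute lambda empirically in the transformed data block (an untested sketch; the simulated sample size N is my choice, and the parameters and model blocks stay as above):

transformed data {
  int N = 1000;    // number of simulated standard-normal draws (illustrative choice)
  real n_pos = 0;  // count of draws above 0
  real lambda;     // empirical P(x > 0): the weight on the normal component in log_mix
  for (i in 1:N) {
    if (normal_rng(0, 1) > 0) {
      n_pos += 1;
    }
  }
  lambda = n_pos / N;
}

Since the standard normal is symmetric, this will land very close to 0.5 anyway.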

If you plot the posterior samples of y you’ll get something close to what I think you want.

The R code below will replicate the figure:

library(rstan)
options(mc.cores = parallel::detectCores())

mix_jeet <- stan_model("path_to_model/mix_jeet.stan") # compile the mixture model above

fitJeet <- sampling(mix_jeet,
                    chains = 4,
                    iter = 10000,
                    warmup = 2000,
                    control = list(adapt_delta = .99),
                    seed = 132)

dfit <- as.data.frame(fitJeet, pars = "y")

library(ggplot2)
ggplot(dfit, aes(x = y)) +
  geom_histogram(bins = 100)

This is great. The question is finally settled.