I have two models testing a repulsive prior for mixtures. They encode the same model, but in one version the density is written out directly in the model block, while in the other it is wrapped in a user-defined lpdf. Here's the code for the version written out in the model block (mix_betancourt.stan):
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  ordered[2] mu;
  real<lower=0> sigma[2];
  real<lower=0> tau;
  real<lower=0, upper=1> theta;
}
model {
  sigma ~ normal(0, 2);
  mu ~ normal(0, 2);
  theta ~ beta(5, 5);
  tau ~ gamma(2, 2);
  for (n in 1:N)
    target += log_mix(theta,
                      normal_lpdf(y[n] | mu[1], sigma[1]) + log(1 - exp(-squared_distance(mu[1], mu[2]) * tau)),
                      normal_lpdf(y[n] | mu[2], sigma[2]) + log(1 - exp(-squared_distance(mu[1], mu[2]) * tau)));
}
and here's the code for the user-defined lpdf version (mix_betancourt_fun.stan):
functions {
  // normal lpdf for one component plus the repulsive penalty on the locations
  real potential_lpdf(real y, vector a, real s, real tau, real mu) {
    real f = normal_lpdf(y | mu, s);
    real r = log(1 - exp(-squared_distance(a[1], a[2]) * tau));
    return f + r;
  }
}
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  ordered[2] mu;
  real<lower=0> sigma[2];
  real<lower=0> tau;
  real<lower=0, upper=1> theta;
}
model {
  sigma ~ normal(0, 2);
  mu ~ normal(0, 2);
  theta ~ beta(5, 5);
  tau ~ gamma(2, 2);
  for (n in 1:N)
    target += log_mix(theta,
                      potential_lpdf(y[n] | mu, sigma[1], tau, mu[1]),
                      potential_lpdf(y[n] | mu, sigma[2], tau, mu[2]));
}
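For what it's worth, the repulsive term r = log(1 - exp(-squared_distance(mu[1], mu[2]) * tau)) is the same in both mixture components, so algebraically it factors out of log_mix:

log_mix(theta, a + r, b + r) = log(theta * exp(a + r) + (1 - theta) * exp(b + r)) = r + log_mix(theta, a, b)

so on paper the two programs define exactly the same target density; they only differ in whether that sum is computed inline or inside a function call.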
What's interesting is that the outputs differ. Maybe this is just ordinary Monte Carlo noise, but I'm curious why they differ at all given that I use the same seed.
Output 1 (singular_fit, the user-defined lpdf version):
3 chains, each with iter=600; warmup=200; thin=1;
post-warmup draws per chain=400, total post-warmup draws=1200.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
mu[1] -0.74 0.02 0.22 -1.22 -0.87 -0.72 -0.58 -0.36 204 1.00
mu[2] 0.83 0.03 0.33 0.24 0.59 0.81 1.05 1.52 174 1.01
sigma[1] 0.97 0.00 0.07 0.82 0.92 0.98 1.03 1.12 265 1.00
sigma[2] 0.98 0.01 0.10 0.77 0.92 0.99 1.05 1.15 213 1.01
tau 3.37 0.05 0.85 2.04 2.79 3.28 3.81 5.18 325 1.01
theta 0.61 0.01 0.16 0.29 0.51 0.63 0.74 0.88 203 1.00
lp__ -1629.54 0.11 1.81 -1634.09 -1630.39 -1629.15 -1628.27 -1627.15 295 1.01
Samples were drawn using NUTS(diag_e) at Wed Sep 02 05:21:36 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
and output 2 (singular_fit2, the version written out in the model block):
Inference for Stan model: mix_betancourt.
3 chains, each with iter=600; warmup=200; thin=1;
post-warmup draws per chain=400, total post-warmup draws=1200.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
mu[1] -0.76 0.02 0.23 -1.30 -0.89 -0.73 -0.60 -0.40 150 1.02
mu[2] 0.79 0.02 0.32 0.17 0.56 0.79 1.00 1.43 182 1.01
sigma[1] 0.97 0.01 0.08 0.78 0.92 0.97 1.02 1.10 167 1.01
sigma[2] 1.00 0.01 0.10 0.80 0.93 1.00 1.06 1.17 265 1.01
tau 3.44 0.04 0.89 2.08 2.82 3.29 3.92 5.56 431 1.01
theta 0.59 0.01 0.16 0.24 0.49 0.61 0.71 0.86 156 1.02
lp__ -1629.70 0.11 1.79 -1634.17 -1630.53 -1629.39 -1628.41 -1627.23 289 1.01
Samples were drawn using NUTS(diag_e) at Wed Sep 02 05:23:06 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
The full R script:
library(rstan)
options(mc.cores = parallel::detectCores())

# simulate a two-component Gaussian mixture and dump it to file;
# both fits read the same dumped data
N <- 1000
mu <- c(-0.75, 0.75)
sigma <- c(1, 1)
lambda <- 0.4
z <- rbinom(N, 1, lambda) + 1
y <- rnorm(N, mu[z], sigma[z])
stan_rdump(c("N", "y"), file = "mix.data.R")
input_data <- read_rdump("mix.data.R")

# fit the user-defined lpdf version
singular_fit <- stan(file = '..\\mix_betancourt_fun.stan', data = input_data[1:2],
                     chains = 3, iter = 600, warmup = 200,
                     control = list(adapt_delta = 0.9),
                     seed = 483892929)
singular_fit

# fit the version written out in the model block, same data and seed
singular_fit2 <- stan(file = '..\\mix_betancourt.stan', data = input_data[1:2],
                      chains = 3, iter = 600, warmup = 200,
                      control = list(adapt_delta = 0.9),
                      seed = 483892929)
singular_fit2
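In case it's useful for debugging, here's a sketch of how one could check whether the two compiled programs agree numerically, using rstan's unconstrain_pars and log_prob to evaluate both log densities and gradients at the same parameter values (the test point below is made up; any valid point would do):

# evaluate both programs at one arbitrary test point
test_pars <- list(mu = c(-0.75, 0.75), sigma = c(1, 1), tau = 3, theta = 0.5)
up1 <- unconstrain_pars(singular_fit, test_pars)
up2 <- unconstrain_pars(singular_fit2, test_pars)
lp1 <- log_prob(singular_fit, up1, adjust_transform = TRUE, gradient = TRUE)
lp2 <- log_prob(singular_fit2, up2, adjust_transform = TRUE, gradient = TRUE)
lp1 - lp2                                      # difference in log density
attr(lp1, "gradient") - attr(lp2, "gradient")  # difference in gradients

If these differences are exactly zero, the two programs agree bit-for-bit at that point; if they are tiny but nonzero, the inline expression and the function call are producing slightly different floating-point results, which HMC can amplify into visibly different draws even with the same seed.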