# Why do model estimates change with udf vs non-udf using the same seed?

I have two models testing a repulsive prior for mixtures. It's the same model written two ways: in one the density is written out directly in the model block, and in the other it's wrapped in a user-defined pdf.

Here's the code for model 1, with the density written out in the model block:

``````stan
data {
  int<lower=0> N;
  vector[N] y;
}

parameters {
  ordered[2] mu;                 // two ordered component means
  vector<lower=0>[2] sigma;      // one scale per component
  real<lower=0> tau;
  real<lower=0, upper=1> theta;
}

model {
  sigma ~ normal(0, 2);
  mu ~ normal(0, 2);
  theta ~ beta(5, 5);
  tau ~ gamma(2, 2);

  for (n in 1:N)
    target += log_mix(theta,
                      normal_lpdf(y[n] | mu[1], sigma[1])
                        + log(1 - exp(-square(mu[1] - mu[2]) * tau)),
                      normal_lpdf(y[n] | mu[2], sigma[2])
                        + log(1 - exp(-square(mu[1] - mu[2]) * tau)));
}
``````

and here's the code for model 2, with the user-defined pdf:

``````stan
functions {
  real potential_lpdf(real y, vector a, real s, real tau, real mu) {
    real f = normal_lpdf(y | mu, s);
    // repulsion between the two component means
    real r = log(1 - exp(-square(a[1] - a[2]) * tau));

    return f + r;
  }
}

data {
  int<lower=0> N;
  vector[N] y;
}

parameters {
  ordered[2] mu;
  vector<lower=0>[2] sigma;
  real<lower=0> tau;
  real<lower=0, upper=1> theta;
}

model {
  sigma ~ normal(0, 2);
  mu ~ normal(0, 2);
  theta ~ beta(5, 5);
  tau ~ gamma(2, 2);

  for (n in 1:N)
    target += log_mix(theta,
                      potential_lpdf(y[n] | mu, sigma[1], tau, mu[1]),
                      potential_lpdf(y[n] | mu, sigma[2], tau, mu[2]));
}
``````

What's interesting is that the outputs differ. Maybe this is just normal random error, but I'm curious why it happens when I use the same seed.
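My current guess is evaluation order: IEEE floating-point addition is not associative, so if the user-defined function makes Stan build the log-density expression in a different order, the rounding can differ in the last bits, and that alone can make two HMC trajectories diverge even with the same seed. A quick illustration of the effect (plain Python, nothing Stan-specific):

```python
# IEEE-754 doubles: addition is not associative, so regrouping the
# same terms can change the last bits of the result. That is enough
# to send two otherwise-identical sampler trajectories down
# different paths, even with identical seeds.
a, b, c = 0.1, 0.2, 0.3
print((a + b) + c)                  # 0.6000000000000001
print(a + (b + c))                  # 0.6
print((a + b) + c == a + (b + c))   # False
```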

Output 1:

``````
3 chains, each with iter=600; warmup=200; thin=1;
post-warmup draws per chain=400, total post-warmup draws=1200.

              mean se_mean   sd     2.5%      25%      50%      75%    97.5% n_eff Rhat
mu[1]        -0.74    0.02 0.22    -1.22    -0.87    -0.72    -0.58    -0.36   204 1.00
mu[2]         0.83    0.03 0.33     0.24     0.59     0.81     1.05     1.52   174 1.01
sigma[1]      0.97    0.00 0.07     0.82     0.92     0.98     1.03     1.12   265 1.00
sigma[2]      0.98    0.01 0.10     0.77     0.92     0.99     1.05     1.15   213 1.01
tau           3.37    0.05 0.85     2.04     2.79     3.28     3.81     5.18   325 1.01
theta         0.61    0.01 0.16     0.29     0.51     0.63     0.74     0.88   203 1.00
lp__      -1629.54    0.11 1.81 -1634.09 -1630.39 -1629.15 -1628.27 -1627.15   295 1.01

Samples were drawn using NUTS(diag_e) at Wed Sep 02 05:21:36 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
``````

and output 2:

``````
Inference for Stan model: mix_betancourt.
3 chains, each with iter=600; warmup=200; thin=1;
post-warmup draws per chain=400, total post-warmup draws=1200.

              mean se_mean   sd     2.5%      25%      50%      75%    97.5% n_eff Rhat
mu[1]        -0.76    0.02 0.23    -1.30    -0.89    -0.73    -0.60    -0.40   150 1.02
mu[2]         0.79    0.02 0.32     0.17     0.56     0.79     1.00     1.43   182 1.01
sigma[1]      0.97    0.01 0.08     0.78     0.92     0.97     1.02     1.10   167 1.01
sigma[2]      1.00    0.01 0.10     0.80     0.93     1.00     1.06     1.17   265 1.01
tau           3.44    0.04 0.89     2.08     2.82     3.29     3.92     5.56   431 1.01
theta         0.59    0.01 0.16     0.24     0.49     0.61     0.71     0.86   156 1.02
lp__      -1629.70    0.11 1.79 -1634.17 -1630.53 -1629.39 -1628.41 -1627.23   289 1.01

Samples were drawn using NUTS(diag_e) at Wed Sep 02 05:23:06 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
``````

Here is the full R script:

``````r
library(rstan)
options(mc.cores = parallel::detectCores())

N <- 1000
mu <- c(-0.75, 0.75)
sigma <- c(1, 1)
lambda <- 0.4
z <- rbinom(N, 1, lambda) + 1
y <- rnorm(N, mu[z], sigma[z])

stan_rdump(c("N", "y"), file = "mix.data.R")

singular_fit <- stan(file = '..\\mix_betancourt_fun.stan', data = list(N = N, y = y),
                     chains = 3, iter = 600, warmup = 200,
                     control = list(adapt_delta = 0.9),
                     seed = 483892929)
singular_fit

singular_fit2 <- stan(file = '..\\mix_betancourt.stan', data = list(N = N, y = y),
                      chains = 3, iter = 600, warmup = 200,
                      control = list(adapt_delta = 0.9),
                      seed = 483892929)
singular_fit2
``````

One more observation: if I instead write the function body as a single return statement, without the intermediate local variables:

``````stan
real potential_lpdf(real y, vector a, real s, real tau, real mu) {
  return normal_lpdf(y | mu, s) + log(1 - exp(-square(a[1] - a[2]) * tau));
}
``````

then the two models do give the same output. Is this an issue where I should be using `log_sum_exp`?
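As a side note on the numerics: the fragile piece is the `log(1 - exp(...))` repulsion term, not a sum of exponentials, so I suspect the relevant helper would be Stan's `log1m_exp(x)` (which computes `log(1 - exp(x))` stably) rather than `log_sum_exp`. A quick sketch of why the naive form can fail when the two means nearly coincide (plain Python; `expm1` plays roughly the role `log1m_exp` plays in Stan, and `1e-20` is just an illustrative tiny exponent):

```python
import math

# Tiny exponent, as when the two component means nearly coincide.
x = 1e-20

# Naive form: exp(-x) rounds to exactly 1.0, the subtraction gives 0.0,
# and taking log(0.0) blows up.
print(1 - math.exp(-x))           # 0.0
# Stable form: expm1(-x) keeps the tiny value -x, so the log stays finite
# and accurate.
print(math.log(-math.expm1(-x)))  # about -46.05
```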