This question arises from quite a complex Stan program, but after repeatedly reducing the model I’ve ended up with a very simple program showing that I clearly don’t understand how likelihoods work. Or maybe at least not in Stan.

In case of the former, apologies for posting such a simple question here.

Here’s the simple problem:

Given three iid variables `x1, x2, x3 ~ Normal(0, D)`, I observe only their sum `s = x1 + x2 + x3` with some measurement error `noise`, i.e. `s ~ Normal(x1 + x2 + x3, noise)`.

Given some observations of `s`, I wish to find the most likely value of `D`. One way to do this is to note that the sum of three iid `Normal(0, D)` variables is itself `Normal(0, sqrt(3)*D)`, so I can collapse the three latents into a single `x1 ~ Normal(0, sqrt(3)*D)` and then take `s ~ Normal(x1, noise)`. Here’s a Stan program:

```
data {
  int<lower=1> N;
  real<lower=0> D;
  real<lower=0> noise;
  real s[N];
}
parameters {
  real x1[N];
}
model {
  // Prior on the collapsed latent
  for (n in 1:N) {
    target += normal_lpdf(x1[n] | 0, sqrt(3) * D);
  }
  // Measurements
  for (n in 1:N) {
    s[n] ~ normal(x1[n], noise);
  }
}
```
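The collapsing step can be sanity-checked numerically before involving Stan at all (a quick sketch I’ve added, not part of the original workflow): the empirical standard deviation of the sum of three iid `Normal(0, D)` draws should match `sqrt(3)*D`.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 5.0
n_draws = 1_000_000

# Draw three iid Normal(0, D) variables and sum them.
x = D * rng.standard_normal((3, n_draws))
s = x.sum(axis=0)

# The empirical std of the sum should be close to sqrt(3) * D.
print(s.std(), np.sqrt(3) * D)
```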

If I optimize or sample for the values of `x1` for different values of `D`, the maximum log likelihood for both optimizing and sampling gives the correct value of `D`. Here’s a PyStan program to run the above:

```
import numpy as np
import matplotlib.pyplot as plt
import pickle
from pystan import StanModel
import sys, os

# Toggle: compile and cache the model, or load the cached pickle.
if True:
    sm = StanModel(file='stan_model3.stan')
    with open('model3.pkl', 'wb') as f:
        pickle.dump(sm, f)
else:
    sm = pickle.load(open('model3.pkl', 'rb'))

N = 500
D = 5.
noise = 0.01

D_list = np.linspace(0.2*D, 1.5*D, 20)
mean = np.zeros_like(D_list)
mean_sampling = np.zeros_like(D_list)

# Simulate the noisy sum of three iid Normal(0, D) variables.
x1 = D*np.random.randn(N)
x2 = D*np.random.randn(N)
x3 = D*np.random.randn(N)
s_true = x1 + x2 + x3
s_mes = s_true + np.random.randn(N)*noise

# MLE
temp = np.zeros_like(D_list)
try:
    for i, D_guess in enumerate(D_list):
        print(i)
        data = dict(s=s_mes, N=N, D=D_guess, noise=noise)
        opt = sm.optimizing(data=data, iter=10000, as_vector=False)
        temp[i] = opt['value']
    mean += temp
except Exception:
    print('Exception')

# MCMC
temp = np.zeros_like(D_list)
try:
    for i, D_guess in enumerate(D_list):
        print(i)
        data = dict(s=s_mes, N=N, D=D_guess, noise=noise)
        fit = sm.sampling(data=data)
        temp[i] = np.mean(fit['lp__'])
    mean_sampling += temp
except Exception:
    print('Exception')

plt.plot(D_list, mean, '-bx')
plt.plot(D_list, mean_sampling, '-rx')
plt.xlabel('D')
plt.ylabel('log P')
plt.show()
```
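Since the collapsed model implies that, marginally, `s ~ Normal(0, sqrt(3*D^2 + noise^2))`, the curve the script above produces can be cross-checked in closed form without Stan (a sketch I’ve added, assuming the collapsed formulation; the grid and constants mirror the script):

```python
import numpy as np

rng = np.random.default_rng(0)
N, D_true, noise = 500, 5.0, 0.01

# Simulate observations of the noisy sum, as in the script above.
x = D_true * rng.standard_normal((3, N))
s = x.sum(axis=0) + noise * rng.standard_normal(N)

# Marginally, s ~ Normal(0, sqrt(3*D^2 + noise^2)), so the exact
# log-likelihood for each candidate D is available in closed form.
D_list = np.linspace(0.2 * D_true, 1.5 * D_true, 200)
var = 3 * D_list[:, None] ** 2 + noise ** 2
lp = (-0.5 * np.log(2 * np.pi * var) - s ** 2 / (2 * var)).sum(axis=1)

print(D_list[lp.argmax()])  # close to D_true
```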

Now I realize that I could have made `D` a parameter of the model, but I want the most likely `D` for any sampling of `x1`.

This all works very well, but it doesn’t generalize to what I actually need. If instead I write the Stan program that explicitly samples `x1, x2, x3`, like this:

```
data {
  int<lower=1> N;
  real<lower=0> D;
  real<lower=0> noise;
  real s[N];
}
parameters {
  real x1[N];
  real x2[N];
  real x3[N];
}
model {
  // Priors on the latents
  for (n in 1:N) {
    target += normal_lpdf(x1[n] | 0, D);
    target += normal_lpdf(x2[n] | 0, D);
    target += normal_lpdf(x3[n] | 0, D);
  }
  // Measurements
  for (n in 1:N) {
    s[n] ~ normal(x1[n] + x2[n] + x3[n], noise);
  }
}
```

It no longer works. Suddenly, both for sampling and optimizing, the optimal value of `D` is wrong: the optimum becomes `D = D_true / sqrt(3)`.
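One way to see where that `sqrt(3)` might come from (a sketch I’ve added, assuming the optimizer behaves as described): with `noise` near zero, the most likely latents given `s` split the sum evenly, `x1 = x2 = x3 = s/3`, since that minimizes `x1^2 + x2^2 + x3^2` subject to the sum. Profiling the latents out of the joint density then leaves `-3*log(D) - s^2/(6*D^2)` per observation (up to constants), which is maximized at `D^2 = mean(s^2)/9`, i.e. roughly `D_true^2/3`:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D_true, noise = 500, 5.0, 0.01
x = D_true * rng.standard_normal((3, N))
s = x.sum(axis=0) + noise * rng.standard_normal(N)

# Profiled joint density per observation: -3*log(D) - s^2/(6*D^2) + const,
# maximized at D^2 = mean(s^2) / 9.  Since mean(s^2) ~ 3*D_true^2,
# this lands near D_true / sqrt(3) rather than D_true.
D_hat = np.sqrt(np.mean(s ** 2) / 9)
print(D_hat, D_true / np.sqrt(3))  # the two should roughly agree
```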

I’ve used `target += normal_lpdf` to get the full log likelihoods, but since `noise` is not varied, the sampling statement `~ normal` (which drops constant terms) can still be used for the final measurement.

I hope the question is clear. I know it’s very simple, but I have been debugging this for a long time now, and I simply cannot understand why it doesn’t just work.

Thank you in advance.