This question arises from quite a complex Stan program, but after repeatedly reducing the model I've ended up with a very simple program which shows that I clearly don't understand how likelihoods work, or at least not how they work in Stan. If it's the former, apologies for posting such a simple question here.
Here’s the simple problem:
Given three iid variables x1, x2, x3 ~ Normal(0, D), I observe only their sum s = x1 + x2 + x3 with some measurement error n, i.e. s ~ Normal(x1 + x2 + x3, n). Given some observations of s, I wish to find the most likely value of D. One way to do this is to realize that the sum of the three variables is itself normal with standard deviation sqrt(3)*D (the variances add), so one can just sample a single x1 ~ Normal(0, sqrt(3)*D) and then s ~ Normal(x1, n).
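The sqrt(3) factor is just the variances adding, Var(x1+x2+x3) = 3*D^2. A quick NumPy check of that (purely a sanity check, not part of the model):

import numpy as np

D = 5.0
samples = D * np.random.randn(3, 1000000)   # three iid Normal(0, D) draws per column
sums = samples.sum(axis=0)                  # x1 + x2 + x3

print(sums.std())        # ~= 8.66
print(np.sqrt(3) * D)    # sqrt(3) * 5 ~= 8.66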
Here's a Stan program:
data {
  int<lower=1> N;
  real<lower=0> D;
  real<lower=0> noise;
  real s[N];
}
parameters {
  real x1[N];
}
model {
  // Prior on x1 (the collapsed sum of the three latents)
  for (n in 1:N) {
    target += normal_lpdf(x1[n] | 0, sqrt(3)*D);
  }
  // Measurements
  for (n in 1:N) {
    s[n] ~ normal(x1[n], noise);
  }
}
If I optimize or sample for the values of x1 at different values of D, the log likelihood for both optimizing and sampling is maximized at the correct value of D. Here's a PyStan program to run the above:
import numpy as np
import matplotlib.pyplot as plt
import pickle
from pystan import StanModel
import sys, os

# Compile the model and pickle it (set to False to reuse a previous build)
if True:
    sm = StanModel(file='stan_model3.stan')
    with open('model3.pkl', 'wb') as f:
        pickle.dump(sm, f)
else:
    sm = pickle.load(open('model3.pkl', 'rb'))
N = 500
D = 5.
noise = 0.01

# Simulate data: three iid latents and their noisy sum
x1 = D*np.random.randn(N)
x2 = D*np.random.randn(N)
x3 = D*np.random.randn(N)
s_true = x1 + x2 + x3
s_mes = s_true + np.random.randn(N)*noise

# Grid of candidate D values and accumulators for the profile log likelihood
D_list = np.linspace(0.2*D, 1.5*D, 20)
mean = np.zeros_like(D_list)
mean_sampling = np.zeros_like(D_list)
# MLE: for each candidate D, optimize over x1 and record the maximized log posterior
temp = np.zeros_like(D_list)
try:
    for i, D_guess in enumerate(D_list):
        print(i)
        data = dict(s=s_mes, N=N, D=D_guess, noise=noise)
        opt = sm.optimizing(data=data, iter=10000, as_vector=False)
        temp[i] = opt['value']
    mean += temp
except:
    print('Exception')
    pass

# MCMC: for each candidate D, sample x1 and record the average log posterior (lp__)
temp = np.zeros_like(D_list)
try:
    for i, D_guess in enumerate(D_list):
        print(i)
        data = dict(s=s_mes, N=N, D=D_guess, noise=noise)
        fit = sm.sampling(data=data)
        temp[i] = np.mean(fit['lp__'])
    mean_sampling += temp
except:
    print('Exception')
    pass
# Profile log likelihood as a function of D for both methods
plt.plot(D_list, mean, '-bx', label='optimizing')
plt.plot(D_list, mean_sampling, '-rx', label='sampling')
plt.xlabel('D')
plt.ylabel('log P')
plt.legend()
plt.show()
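For completeness, once the sweep has finished, the most likely D can be read straight off the profile curves (this uses only the arrays defined in the script above):

# D value that maximizes the profile log likelihood for each method
print('optimizing:', D_list[np.argmax(mean)])
print('sampling:  ', D_list[np.argmax(mean_sampling)])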
Now I realize that I could have made D a parameter of the model, but I want the most likely D for any sampling of x1.
This all works very well, but it doesn't generalize to what I actually need. If I instead write a Stan program that explicitly samples x1, x2, x3, like this:
data {
  int<lower=1> N;
  real<lower=0> D;
  real<lower=0> noise;
  real s[N];
}
parameters {
  real x1[N];
  real x2[N];
  real x3[N];
}
model {
  // Priors on the three latent variables
  for (n in 1:N) {
    target += normal_lpdf(x1[n] | 0, D);
    target += normal_lpdf(x2[n] | 0, D);
    target += normal_lpdf(x3[n] | 0, D);
  }
  // Measurements
  for (n in 1:N) {
    s[n] ~ normal(x1[n] + x2[n] + x3[n], noise);
  }
}
It no longer works. Suddenly, both for sampling and for optimizing, the optimal value of D is wrong: the optimum becomes D = D_true / sqrt(3) (roughly 2.89 rather than 5 for the D_true = 5 used in the script above).
I've used target += normal_lpdf so that the D-dependent normalizing constants are included in the log likelihood, but since noise is not varied between runs, ~ normal can still be used for the final measurement term.
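As an aside, here is a small NumPy sketch of the distinction I'm relying on: Stan's ~ notation drops additive constants that don't depend on parameters, while target += normal_lpdf keeps them, and the dropped piece depends on the scale, which is exactly what changes between my runs. The helper function names below are mine, purely for illustration:

import numpy as np

def normal_lpdf(x, mu, sigma):
    # Full normal log density, as target += normal_lpdf(...) accumulates it
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)

def normal_lpdf_dropped(x, mu, sigma):
    # What "x ~ normal(mu, sigma)" contributes when mu is the only parameter:
    # additive terms that do not involve mu are dropped
    return -0.5 * ((x - mu) / sigma) ** 2

# The dropped piece is -log(sigma) - 0.5*log(2*pi), i.e. it depends only on sigma
for sigma in (0.01, 5.0, np.sqrt(3) * 5.0):   # noise, D, sqrt(3)*D from the script
    gap = normal_lpdf(1.0, 0.0, sigma) - normal_lpdf_dropped(1.0, 0.0, sigma)
    print(sigma, gap)

# For the measurement term sigma = noise is the same in every run, so the gap only
# shifts all curves by a constant; for the prior term sigma = sqrt(3)*D changes with
# D, so the gap matters when comparing log likelihoods across runs.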
I hope the question is clear. I know it's very simple, but I've been debugging for a long time now and I simply cannot understand why this doesn't just work.
Thank you in advance.