I’m trying to understand some aberrant estimates from my model. I’m working through my workflow with the simplest model on simulated data (below). I can make the problem “go away” with more tightly defined priors, but I don’t know if that’s the right move. I’d eventually like to make a hierarchical model, so I thought it worth trying to understand before adding more complexity.
I want to fit the model:
y_n \sim \text{LogNormal}(\mu, \sigma)
such that
\text{E}[y] = \frac{g - \gamma}{r - \gamma},
where r, g, and \gamma are (non-negative) parameters to be estimated.
By the properties of the log-normal distribution:
\text{E}[y] = \exp\!\left(\mu + \frac{\sigma^2}{2}\right).
Therefore,
\mu = \log\!\left(\frac{g - \gamma}{r - \gamma}\right) - \frac{\sigma^2}{2}.
Clearly, \mu is defined only when the argument to the logarithm is positive. In other words, r, g > \gamma. I will constrain those in the parameters block with <lower=gamma>. (I think this is relevant.)
The Stan code below simulates 1,000 observations in the transformed data block and attempts to recover the seeded parameter values (\gamma = 0.05, g = 0.10, r = 0.15) in the model block.
transformed data {
  int N = 1000;
  // Seed parameter values
  real gamma_ = 0.05;
  real g_ = 0.10;
  real r_ = 0.15;
  real sigma_ = 1;
  // Expected value
  real y_bar = (g_ - gamma_) / (r_ - gamma_);
  // Draw observations from lognormal
  vector[N] y;
  for (n in 1:N)
    y[n] = lognormal_rng(log(y_bar) - sigma_^2 / 2, sigma_);
}
parameters {
  real<lower=0> gamma;
  real<lower=gamma> g;
  real<lower=gamma> r;
  real<lower=0> sigma;
}
transformed parameters {
  real<lower=0> mu_hat = (g - gamma) / (r - gamma);
}
model {
  // Priors
  gamma ~ gamma(1, 1);
  g ~ gamma(1, 1);
  r ~ gamma(1, 1);
  sigma ~ gamma(1, 1);
  // Likelihood
  y ~ lognormal(log(mu_hat) - sigma^2 / 2, sigma);
}
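As a quick sanity check that the transformed data block really generates draws with mean y_bar, here is a minimal sketch of the same simulation outside Stan (NumPy, with an arbitrary seed; not part of the Stan program):

import numpy as np

# Same seed values as the transformed data block
gamma_, g_, r_, sigma_ = 0.05, 0.10, 0.15, 1.0
y_bar = (g_ - gamma_) / (r_ - gamma_)  # = 0.5

# NumPy's lognormal takes the mean and sd of the underlying normal,
# matching lognormal_rng(log(y_bar) - sigma_^2/2, sigma_)
rng = np.random.default_rng(1)
y = rng.lognormal(mean=np.log(y_bar) - sigma_**2 / 2, sigma=sigma_, size=1000)

print(y_bar, y.mean())  # the sample mean should be close to y_bar = 0.5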
In the above, the priors are defined such that the prior mean is 1 (a gamma(1, 1) prior in Stan's shape-rate parameterization is Exponential(1), with mean 1/1 = 1). I would hope the posterior means converge on the seeded values. The results are not great:
4 chains, each with iter=1000; warmup=500; thin=1;
post-warmup draws per chain=500, total post-warmup draws=2000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
gamma 0.33 0.01 0.33 0.01 0.10 0.23 0.46 1.20 1005 1
g 0.99 0.02 0.57 0.18 0.57 0.89 1.32 2.37 758 1
r 1.68 0.04 1.02 0.31 0.94 1.49 2.20 4.20 715 1
sigma 1.00 0.00 0.02 0.96 0.99 1.00 1.02 1.05 1172 1
The posterior for \gamma at least moved in the right direction. The posterior for g not so much. But the posterior for r has moved further away from the true value… by a lot. That’s what startles me.
The pairs plot indicates high collinearity between g and r:
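Presumably this is because the likelihood sees g and r only through mu_hat = (g - \gamma)/(r - \gamma): for any c > 0,

\frac{(\gamma + c\,(g - \gamma)) - \gamma}{(\gamma + c\,(r - \gamma)) - \gamma} = \frac{g - \gamma}{r - \gamma} = \hat{\mu},

so whole families of (g, r) values are equally consistent with the data. In line with that, plugging the posterior means above into the ratio gives (0.99 - 0.33)/(1.68 - 0.33) \approx 0.49, close to the true y_bar = 0.5, even though g and r individually are far from their seeded values.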
I can improve the estimates with tighter priors:
model {
  // Priors
  gamma ~ gamma(1, 10);
  g ~ gamma(1, 10);
  r ~ gamma(1, 10);
  sigma ~ gamma(1, 1);
  // ...
Then the results look much better (the pairs plot looks the same), but a similar pattern emerges. These priors are defined such that the prior mean is 0.10 (gamma(1, 10) has mean 1/10). The posterior for \gamma moved down in the right direction. The posterior for g didn't really need to move. The posterior for r again increased, but at least this time towards the true value.
4 chains, each with iter=1000; warmup=500; thin=1;
post-warmup draws per chain=500, total post-warmup draws=2000.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
gamma 0.03 0 0.03 0.00 0.01 0.02 0.05 0.12 1399 1.00
g 0.10 0 0.06 0.02 0.06 0.09 0.13 0.24 828 1.01
r 0.17 0 0.10 0.03 0.10 0.15 0.22 0.42 710 1.00
sigma 0.98 0 0.02 0.94 0.97 0.98 1.00 1.02 1029 1.00
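Plugging the posterior means into the ratio again gives (0.10 - 0.03)/(0.17 - 0.03) = 0.5, matching y_bar.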
Is this just a matter of “tuning” the priors? Or am I covering up a problem: I didn’t get the results I wanted, so I just ratchet the priors until I get there?