In a lot of situations it’s easier to put priors on parameter transformations than on the immediate model parameters. When one does so, one is always greeted with the Stan warning that one should include, “a target += statement with the log absolute determinant of the Jacobian of the transform.” Now I’ve read on other threads that this particular warning is sometimes warranted and sometimes not. My question is: when the warning is warranted, just HOW exactly does one augment the += statement? What does one write exactly? Here’s my example using logistic regression.
The following generates some dummy data in R where I’ve got one continuous (age) and one categorical (group) variable.
n <- 10; beta0 <- -1.6; beta1 <- 0.03; beta2 <- 1
age <- runif(n=n, min=18, max=60)
x2 <- runif(n=n, min=0, max=1)
pi <- exp(beta0 + beta1*age + beta2*x2) / (1 + exp(beta0 + beta1*age + beta2*x2))
y <- rbinom(n=n, size=1, prob=pi)
group <- as.integer(cut(x2, breaks=3))
mylist <- list(age=age,group=group,y=y)
mylist$N <- length(y)
mylist$N_group <- 3
And here’s the Stan code where I’m trying to model the Bernoulli outcome y. Note that I’m putting a somewhat loose prior on the immediate group parameter b but I’m putting rather strong priors on the three group proportions defined in the transformed parameters block. That is, I have a hard time thinking in terms of prior beliefs about parameters expressed inside a logistic transformation, let alone ones in concert with other variables inside a logistic model. It’s much easier to elicit priors regarding the summary proportions or shares. Bless Stan for the handy transformed parameters block that lets me do just that:
stan.code ='
data{
int<lower=1> N;
int<lower=1> N_group;
int<lower=0, upper=1> y[N];
int group[N];
real<lower=0, upper=100> age[N];
}
parameters{
real a;
vector[N_group] b;
real intercept;
}
transformed parameters {
real<lower = 0, upper = 1> Group1Share;
real<lower = 0, upper = 1> Group2Share;
real<lower = 0, upper = 1> Group3Share;
Group1Share = inv_logit(intercept + b[1]);
Group2Share = inv_logit(intercept + b[2]);
Group3Share = inv_logit(intercept + b[3]);
}
model{
vector[N] p;
intercept ~ normal(0, 2);
b ~ normal(0, 5);
a ~ normal(0, 2);
Group1Share ~ normal(.333, .1);
Group2Share ~ normal(.333, .1);
Group3Share ~ normal(.333, .1);
for (i in 1:N) {
y[i] ~ bernoulli_logit(intercept + a*age[i] + b[group[i]]);
}
}
'
Now I can fit this model and examine the results with:
model = stan(model_code=stan.code, data=mylist, iter=2000, thin=10, chains=4)
traceplot(model)
pairs(model)
print(model, digits = 3)
Where Stan gives me the warning I’d expect:
Left-hand side of sampling statement (~) may contain a non-linear transform of a parameter or local variable.
If it does, you need to include a target += statement with the log absolute determinant of the Jacobian of the transform.
Left-hand-side of sampling statement
Now in my case, I suspect the warning is warranted. (Am I right?) Here’s the result of a pairs plot:
It’s clear in the panels towards the top-left that the three immediate group parameters (b1, b2, b3) are almost perfectly inversely correlated with the intercept parameter. I may not be saying this quite right, but it appears that there’s some underdetermination going on. There are an infinite number of combinations wherein these 4 parameters can produce the same 3 prior share beliefs. I think this is what the Stan warning is trying to tell me. And according to this deeper warning explanation, I may not be sampling from the posterior density that I expect. By contrast, the pairs plot for the three b-parameters doesn’t look at all like the above if I drop the transformed parameters altogether.
But then again, the three resulting transformed parameter estimates that come from this model do exactly what I expect. They’re all around .33 and reflect my prior beliefs. All the other parameters make entirely intuitive sense. And the traceplot shows no cause for alarm. If it weren’t for the strong parameter correlations, I’d think that the model is just fine.
So, my questions are… First, is my understanding indeed correct? Is the Stan warning in this case valid and am I not sampling the density I expect? Do I need to append my += target in some way to properly sample from the posterior density reflecting my prior beliefs about the group proportions? Second, if so, what exactly should I append? For example, if I re-write my model block to use the the following form (which right now produces the same results as above)…
for (i in 1:N) {
lps[i] = bernoulli_logit_lpmf(y[i] | (intercept + a*age[i] + b[group[i]]));
}
target += sum(lps);
…Then what’s the syntax I should be adding to the target += statement above? Would I need to add the sum of the log derivatives of my transformed parameters?
Thanks in advance.