Given this logistic regression parameterization from the Stan manual:

data {
  int<lower=0> N;
  vector[N] x;
  array[N] int<lower=0, upper=1> y;
}
parameters {
  real alpha;
  real beta;
}
model {
  y ~ bernoulli_logit(alpha + beta * x);
}

to what extent is it safe to assume that the model is equivalent to a latent-variable parameterization with an implied residual \epsilon_i for each observation, where the residuals follow a logistic distribution with variance \pi^2/3, as also described in the Austin and Merlo tutorial here? @Bob_Carpenter @andrewgelman
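For intuition, here is a small numeric check (a sketch, not from the thread; alpha, beta, and the x values are made-up illustration numbers): the bernoulli_logit success probability matches the Monte Carlo probability that a latent alpha + beta*x + epsilon crosses zero when epsilon ~ Logistic(0, 1).

```python
# Sketch: compare the bernoulli_logit probability with the latent-threshold
# probability under Logistic(0, 1) noise. All numbers are illustrative.
import numpy as np

def inv_logit(u):
    return 1.0 / (1.0 + np.exp(-u))

rng = np.random.default_rng(1)
alpha, beta = -0.5, 1.2                 # made-up coefficients
x = np.array([-2.0, 0.0, 1.5])          # made-up covariate values

eta = alpha + beta * x                  # linear predictor
p_logit = inv_logit(eta)                # what bernoulli_logit implies

# Latent formulation: y = 1 iff alpha + beta*x + eps > 0, eps ~ Logistic(0, 1)
eps = rng.logistic(loc=0.0, scale=1.0, size=(1_000_000, 1))
p_latent = (eta + eps > 0).mean(axis=0)  # Monte Carlo estimate

print(np.round(p_logit, 3))
print(np.round(p_latent, 3))
```

With a million draws the two probability vectors agree to about two decimal places, which is the equivalence the question is asking about.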

I would say that it is safe. I think it is easiest to see by looking at the CDF of the logistic distribution (wikipedia link). When you set \mu = 0 and s = 1, the CDF becomes the inverse-logit function, and the variance of the logistic distribution is \pi^2/3. Then you have something like \Pr(y_i = 1) = \Pr(\alpha + \beta x_i + \epsilon_i > 0) with \epsilon_i \sim \text{Logistic}(0, 1).
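In more detail, a sketch of the standard latent-variable derivation (using the Wikipedia parameterization of the logistic CDF):

```latex
\[
F(z;\mu,s) = \frac{1}{1 + e^{-(z-\mu)/s}},
\qquad
F(z;0,1) = \operatorname{logit}^{-1}(z),
\qquad
\operatorname{Var} = \frac{s^2 \pi^2}{3}.
\]
Define the latent variable $y_i^* = \alpha + \beta x_i + \epsilon_i$ with
$\epsilon_i \sim \text{Logistic}(0,1)$ and $y_i = \mathbf{1}\{y_i^* > 0\}$.
By the symmetry of the logistic distribution,
\[
\Pr(y_i = 1)
  = \Pr\bigl(\epsilon_i > -(\alpha + \beta x_i)\bigr)
  = F(\alpha + \beta x_i;\, 0, 1)
  = \operatorname{logit}^{-1}(\alpha + \beta x_i),
\]
which is exactly what \texttt{bernoulli\_logit} computes, with
$\operatorname{Var}(\epsilon_i) = \pi^2/3$.
```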

And, therefore, Stan follows convention by setting sigma to 1 in the standard logistic: the latent variable is y* ~ logistic(mu, 1) with mu = alpha + beta * x.
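As a quick numeric check on the \pi^2/3 claim (a sketch, not from the thread): sampling a standard logistic in NumPy and comparing its sample variance to \pi^2/3 \approx 3.29.

```python
# Sketch: the variance of a Logistic(0, 1) error term is pi^2 / 3,
# the value cited from the Austin and Merlo tutorial.
import numpy as np

rng = np.random.default_rng(0)
eps = rng.logistic(loc=0.0, scale=1.0, size=2_000_000)

print(np.pi**2 / 3)   # theoretical variance of Logistic(0, 1)
print(eps.var())      # sample variance, close to pi^2 / 3
```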

And then to confirm, that’s what the manual implies when it says that the noise parameter (sigma, in other words) is built into the bernoulli_logit function, correct?