Hi all,
I’m having some difficulties with a logit choice model and was hoping to get some help. First, I’m new to stan, so I was hoping to see if I set my model up correctly! Second, I was trying to show that truncating the priors would be bad for the model, but I’m not finding what I expected.
I created synthetic data for this example based on an economic model I’m using. Basically, the data consists of synth_N individuals who each make a choice between a pen and a mug. The valuation of the pen and the mug depends on several factors: 1) the intrinsic value (unknown and to be estimated, but for the synthetic data I fix these in synth_mug_u and synth_pen_u), 2) a parameter \lambda (to be estimated, drawn from a lognormal in the synthetic data), 3) which of the objects they are endowed with (this is observable data), and 4) a treatment condition q (observable). The overall valuations for each individual are computed as pen_utils and mug_utils below. Each individual then has a choice (observed data) which reflects whether the subject traded their endowed good for the alternative.
set.seed(19)
# Synthetic Data To Test Truncated vs Not
synth_N = 400
synth_exchange <- tibble(id = seq_len(synth_N),
synth_mug_u = 1,
synth_pen_u = rnorm(synth_N, 0.62, 0.1),
endowment = sample(c(1,2), synth_N, 0.5),
treatment = sample(c(0,1), synth_N, 0.5),
lambda = rlnorm(synth_N, 0.17, exp(-0.62)),
q = 0.9 * (1 - treatment) + 0.1 * (treatment),
pen_utils = (endowment == 2) * synth_pen_u +
(endowment == 1)*
(q*synth_pen_u + (1 -q)*synth_mug_u
+ q*(1 - q)*(1-lambda)*(synth_mug_u + synth_pen_u)),
mug_utils = (endowment == 1) * synth_mug_u +
(endowment == 2)*
(q*synth_mug_u + (1 -q)*synth_pen_u
+ q*(1 - q)*(1-lambda)*(synth_mug_u + synth_pen_u)),
logit_noise = rlogis(synth_N),
prefer_mug = mug_utils - pen_utils > logit_noise,
exchange = prefer_mug * (endowment ==2) +
(1-prefer_mug)*(endowment == 1)
) %>%
group_by(endowment, treatment) %>%
mutate(group_id = cur_group_id()) %>%
ungroup()
synth_dat <- list(
N = nrow(synth_exchange),
exchange = synth_exchange$exchange,
treatment = synth_exchange$treatment,
q = synth_exchange$q,
endowment = synth_exchange$endowment,
id = synth_exchange$id,
G = nrow(distinct(synth_exchange, group_id)),
group_id = synth_exchange$group_id
)
With this synthetic data, my goal is to estimate the distribution of \lambda in the population, as well as the relative utility of the two goods (in this example, the distribution of synth_pen_u). I set my model up as a logit choice model, as shown below, where the probability of “exchanging” is a function of the parameters described in the synthetic data. I think that by fixing the mug_utils to 1, both parameters are identified (but could be wrong).
data{
int N;
int G;
int group_id[N];
int exchange[N];
int treatment[N];
int endowment[N];
real q[N];
}
parameters{
real<lower=0, upper= 6> lambda;
real<lower=0, upper = 10> utils_pen;
}
model{
vector[N] kr_utils_mug;
vector[N] kr_utils_pen;
vector[N] p;
real utils_mug = 1;
//Priors
utils_pen ~ normal(utils_mug, 0.25);
lambda ~ lognormal( 0.168 , exp(-0.62));
for ( i in 1:N ) {
kr_utils_mug[i]= utils_mug * (endowment[i] == 1) +
(q[i] * utils_mug + (1 - q[i]) * utils_pen +
q[i] * (1 - q[i]) * (1 - lambda) * (utils_mug + utils_pen)) *
(endowment[i] == 2) ;
kr_utils_pen[i]= utils_pen * (endowment[i] == 2) +
(q[i] * utils_pen + (1 - q[i])*utils_mug +
q[i] * (1 - q[i]) * (1 - lambda) * (utils_mug + utils_pen)) *
(endowment[i] == 1);
// Probability of Exchanging
p[i] = inv_logit(kr_utils_mug[i] - kr_utils_pen[i]) * (endowment[i] == 2) +
inv_logit(kr_utils_pen[i] - kr_utils_mug[i]) * (endowment[i] == 1);
}
exchange ~ binomial(1, p);
}
generated quantities {
vector[N] kr_utils_mug;
vector[N] kr_utils_pen;
vector[N] p;
vector[N] log_lik;
real utils_mug = 1;
for ( i in 1:N ) {
kr_utils_mug[i]= utils_mug * (endowment[i] == 1) +
(q[i] * utils_mug + (1 - q[i]) * utils_pen +
q[i] * (1 - q[i]) * (1 - lambda) * (utils_mug + utils_pen)) *
(endowment[i] == 2) ;
kr_utils_pen[i]= utils_pen * (endowment[i] == 2) +
(q[i] * utils_pen + (1 - q[i])*utils_mug +
q[i] * (1 - q[i]) * (1 - lambda) * (utils_mug + utils_pen)) *
(endowment[i] == 1);
p[i] = inv_logit(kr_utils_mug[i] - kr_utils_pen[i]) * (endowment[i] == 2) +
inv_logit(kr_utils_pen[i] - kr_utils_mug[i]) * (endowment[i] == 1);
}
for(i in 1:N) log_lik[i] = binomial_lpmf(exchange[i] | 1, p[i]);
}
Does this seem like the right way to estimate the \lambda and utils_pen distributions implied by the data? I can also do an aggregated version (groups are defined by endowment and treatment, and I can look at the sum of exchanges vs the total group size), but I don’t see why it would be different.
As for the model comparison – I was hoping to show that truncating \lambda to be larger than 1 (real<lower=1, upper= 6> lambda;) while maintaining the prior would lead to a drop in predictive ability. I do this by estimating the model with and without this change, and then using loo_compare. However, I’m finding that the truncated model may be slightly better (often not more than 1 SE, though). Why would that be, especially given that 1) \lambda < 1 exist in the synthetic data and 2) the non-truncated model is less restrictive, so if having no \lambda <1 did improve performance, the non-truncated model would be able to do that regardless?
Thanks!
Alex