Hi:
I am fitting a model using as link function the logit. Moreover, one of my covariates (cov_1_for_y) for the outcome y has some values unobserved, and I am imputing those when unobserved using a logit as well.
This model works; however, what I would like to do is to move from logit links to log links for the model of the outcome of interest (y) as well as for the imputation model.
Question 1. How can I set the initial values to be between 0,1 when I am using the log link. I am getting the same error described here: How to set initial value to make probability parameter be in the interval [0, 1] in a stan model but I do not feel confident enough to adapt the response to my own problem. I put a comment on that topic but nobody responded so now I will try as a new topic.
Question 2. How do I correctly move to log link in the case of the imputation piece. For the modeling part seems to work just fine when I use only complete data; however, I get the error from question 1 when I expand the model to have more covariates. I have left commented pieces of code on how I am doing the log link between the y outcome and the covariates as well as the way I “think” the log link should be accomplished in the imputation piece.
Feedback is really appreciated
modelstring <- "
data{
int N; //number of observations
int L; //number of unique groups 1
int y[N]; // outcome
//covariate with some values unobserved
int cov_1_for_y[N]; // categorical variable: 0 if baseline category, 1 otherwise, (-1) if unobserved
int cov_1_for_y_miss[N]; // categorical variable when observed == 1, 0 otherwise
//covariate
int cov_2_for_y[N]; //covariate with all values observed
// index of unique groups 1
int index_groups_1[N];
//data to impute cov_1_for_y
real cov_imp[N];
}
parameters{
real <lower=-10,upper=10> alpha;
real <lower=-10,upper=10> beta_cov_1_for_y; // coefficient
real <lower=-10,upper=10> beta_cov_2_for_y; // coefficient
real <lower=-10,upper=10> a_imp; // intercept for the imputation model
real <lower=-10,upper=10> b_cov_imp; // coefficent for the imputation model
real<lower=0,upper=100> sigma_theta_groups_1; // standard deviation across groups 1 - specific intercepts
vector[L] epsilon_groups_1;
vector[L] mu;
}
transformed parameters {
vector[L] theta_groups_1;
theta_groups_1 = mu + sigma_theta_groups_1 * epsilon_groups_1;
}
model{
// priors
alpha ~ normal(0,10); // explained above
beta_cov_1_for_y ~ normal(0,10); // explained above
beta_cov_2_for_y ~ normal(0,10); // explained above
a_imp ~ normal(0,10); // explained above
b_cov_imp ~ normal(0,10); // explained above
epsilon_groups_1~normal(0,1);
sigma_theta_groups_1~cauchy(0,15); //standard deviation across groups 1 specific intercepts
for(l in 1:L){
mu[l]~normal(0,1);
}
//add Data
for (i in 1:N) {
vector[2] p;
p[1] = a_imp + b_cov_imp*cov_imp[i]; // modeling the prob 2 as a function of the covariate to model the category
p[2]=1-p[1];
if (cov_1_for_y_miss[i] == 0) {
y[i] ~ bernoulli_logit(alpha+
beta_cov_1_for_y*cov_1_for_y[i]+
beta_cov_2_for_y*cov_2_for_y[i]+
theta_groups_1[index_groups_1[i]]);
// log link
//y[i] ~ bernoulli(exp(alpha+
// beta_cov_1_for_y*cov_1_for_y[i]+
// beta_cov_2_for_y*cov_2_for_y[i]+
// theta_groups_1[index_groups_1[i]]));
cov_1_for_y[i] ~ bernoulli(softmax(p)[1]);
//log link?
//cov_1_for_y[i] ~ bernoulli(exp(p[1]));
}
// x missing cov_1_for_y_miss
else {
vector[2] log_lik_cats; // vector to hold the log probabilities for each alternate scenario (category 1, 2, or 3)
log_lik_cats[1] = log_softmax(p)[1] + bernoulli_logit_lpmf( y[i] | alpha + beta_cov_1_for_y + beta_cov_2_for_y*cov_2_for_y[i] + theta_groups_1[index_groups_1[i]]); // category 1: recently weaned
log_lik_cats[2] = log_softmax(p)[1] + bernoulli_logit_lpmf( y[i] | alpha + beta_cov_2_for_y*cov_2_for_y[i] + theta_groups_1[index_groups_1[i]]); // category 0: not recently weaned
//log link?
//log_lik_cats[1] = log(p[1]) + bernoulli_lpmf( y[i] | exp(alpha + beta_cov_1_for_y + beta_cov_2_for_y*cov_2_for_y[i] + theta_groups_1[index_groups_1[i]])); // category 1: recently weaned
//log_lik_cats[2] = log(p[1]) + bernoulli_lpmf( y[i] | exp(alpha + beta_cov_2_for_y*cov_2_for_y[i] + theta_groups_1[index_groups_1[i]])); // category 0: not recently weaned
target += log_sum_exp(log_lik_cats); // sum log probabilities across the scenarios (i.e., marginalize over missingness)
}
}
} // close model block
