I have been futzing around with a model for a while using simulated data, and I cannot recovered the exact parameters I set using the model. However, I am recovering the correct differences. I am not sure if I am messing up the model, or if I am doing everything correctly and my expectations for this model was incorrect.
Essentially, I want to use a model to determine how a behavior changes with the addition of a treatment. The behavioral response is a continuous measurement between 0 and 1 (or 0.99 to avoid certain math errors with a logit link), which I’ve simulated using the beta2 function in rethinking. In the absence of a treatment, the behavioral response should center around .4. With the addition of a treatment, the response should shift up by .1 (low quality treatments) or .3 (high quality treatments). This is how I’ve simulated the data in R:
#create df with 4 treatments, 6 replicates per treatment
treat <- c("A", "B", "C", "D")
ID <- c(1:6)
df <- expand_grid(treat, ID)
#expand each trial for 10 rows, representing 10 measurements per replicate
df <- df %>%
slice(rep(1:n(), 10))
#simulate behavioral response in absence of treatment
#simulating from beta to only get values between 0 and 1
df$absent <- rbeta2(n=240, .4 , 10)
#Simulate behavioral response with treatment
df <- df %>%
mutate(m_s = case_when(
treat== "A" ~ .3,
treat == "B" ~ .3,
treat == "C" ~ .1,
treat == "D" ~ .1)
)
df$present <- rbeta2(n=240, (.4 + df$m_s), 10)
#Replace "1" with ".99" in absent and present to avoid math errors
df$absent[df$absent == 1] <- .99
df$present[df$present == 1] <- .99
Then I tried to recover the m_s values using this model:
parameters{
real a;
vector[4] t;
real<lower=0> scale;
}
model{
real mu;
vector[240] mu2;
scale ~ gamma( 15/1 , 1/1 );
s ~ normal( 1 , 1 );
a ~ normal( -0.4 , 0.5 );
for ( i in 1:240 ) {
mu2[i] = a + s[species[i]];
mu2[i] = inv_logit(mu2[i]);
}
mu = a;
mu = inv_logit(mu);
present ~ beta( mu2*scale , (1-mu2)*scale );
absent ~ beta( mu*scale , (1-mu)*scale );
}
generated quantities{
vector[240] log_lik;
real mu;
vector[240] mu2;
for ( i in 1:240 ) {
mu2[i] = a + t[treat[i]];
mu2[i] = inv_logit(mu2[i]);
}
mu = a;
mu = inv_logit(mu);
for ( i in 1:240 ) log_lik[i] = beta_lpdf( absent[i] | mu*scale , (1-mu)*scale );
}
However, every time I run this the output values for t, which should correspond to my m_s values when I simulated the data, are off by ~.4 (after accounting for the logit link). The difference between treatments is the same, ~.2. I think my models estimates of t do not separate out the intercept, a, and I have no idea how to get the model to do so. Or even if it’s supposed to do so–I always assumed that models would estimate t given intercept, a, but maybe I’m wrong? I’m concerned if I can’t figure this out, I won’t understand my results when I apply the model to real data where I don’t know the true parameters. Does anyone know what’s going on here? Is the model the problem, or my expectations of the model?