So, after discovering this thread: Mi() with "non-linear" model?, I decided to continue down the route of using a latent variable to try and place a distribution over slope
.
However, I am running into some strange issues. I first wanted to effectively re-create my simple model, but using a latent variable for slope. For reference, my original model, priors, and fit procedure are:
formula.test.1 <- (
bf(y ~ shift + scale * inv_logit(slope * (t - offset))
, sigma ~ 1 + t:treatment
, offset ~ 1
, scale ~ 1
, shift ~ 1
, slope ~ 1
, family = brmsfamily("gaussian", link='identity', link_sigma="identity")
, nl=TRUE
)
)
prior.test.1 <- c(
# priors for the sigmoid curve:
prior(normal(0.5, 0.2), nlpar='offset')
, prior(normal(1.5, 1), nlpar='scale', lb=0)
, prior(normal(0, 0.2), nlpar='shift')
, prior(normal(3, 2), nlpar='slope', lb=0)
# priors for parameters for sigma
, prior(normal(0, 2), dpar="sigma", lb=0)
, prior(lognormal(-3, 1.5), dpar="sigma", class="Intercept")
)
brm(
data = df.test, formula = formula.test.1, prior = prior.test.1,
# note these are short chains for iteration
iter = 1000, warmup = 500, chains = 2, cores = 2
)
Producing the following, promising model results:
Family: gaussian
Links: mu = identity; sigma = identity
Formula: y ~ shift + scale * inv_logit(slope * (t - offset))
sigma ~ 1 + t:treatment
offset ~ 1
scale ~ 1
shift ~ 1
slope ~ 1
Data: df.test (Number of observations: 780)
Draws: 2 chains, each with iter = 500; warmup = 0; thin = 1;
total post-warmup draws = 1000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma_Intercept 0.01 0.00 0.01 0.01 1.01 560 622
offset_Intercept 0.53 0.01 0.52 0.54 1.01 311 416
scale_Intercept 1.05 0.02 1.02 1.10 1.01 253 303
shift_Intercept -0.06 0.01 -0.07 -0.05 1.01 330 504
slope_Intercept 5.44 0.15 5.13 5.72 1.01 263 331
sigma_t:treatmentleft 0.08 0.01 0.07 0.09 1.00 588 579
sigma_t:treatmentneutral 0.08 0.01 0.07 0.09 1.01 533 460
sigma_t:treatmentright 0.08 0.01 0.07 0.09 1.00 555 592
Now, at least conceptually, should the following not be the same?
formula.test.2 <- (
bf(y ~ shift + scale * inv_logit(slope * (t - offset))
, sigma ~ 1 + t:treatment
, offset ~ 1
, scale ~ 1
, shift ~ 1
, slope ~ 0 + mi(slopelatent)
, family = brmsfamily("gaussian", link='identity', link_sigma="identity")
, nl=TRUE
)
+ bf(slopelatent | mi() ~ 1
, family = brmsfamily("gaussian", link="identity", link_sigma="identity"))
+ set_rescor(FALSE)
)
prior.test.2 <- c(
# priors for the sigmoid curve:
prior(normal(0.5, 0.2), nlpar='offset', resp='y')
, prior(normal(1.5, 1), nlpar='scale', resp='y', lb=0)
, prior(normal(0, 0.2), nlpar='shift', resp='y')
# fix b for slope to 1 as the latent parameter will be modelled instead:
, prior(constant(1), nlpar='slope', resp='y', lb=0)
# priors for parameters for sigma
, prior(normal(0, 2), dpar="sigma", resp="y", lb=0)
, prior(lognormal(-3, 1.5), dpar="sigma", resp="y", class="Intercept")
# priors for slope parameter distribution
, prior(normal(3, 2), resp="slopelatent", class="Intercept")
# fix the sigma on this distribution to something very low, to mimic fit.test.1
, prior(constant(0.00001), resp="slopelatent", class="sigma")
)
fit.test.2 <- brm(
data = df.test %>% mutate(slopelatent=as.numeric(NA)),
formula = formula.test.2, prior = prior.test.2,
iter = 1000, warmup = 500, chains = 2, cores = 2
)
The resulting fit however fails completely:
Family: MV(gaussian, gaussian)
Links: mu = identity; sigma = identity
mu = identity; sigma = identity
Formula: y ~ shift + scale * inv_logit(slope * (t - offset))
sigma ~ 1 + t:treatment
offset ~ 1
scale ~ 1
shift ~ 1
slope ~ 0 + mi(slopelatent)
slopelatent | mi() ~ 1
Data: df.test (Number of observations: 780)
Draws: 2 chains, each with iter = 500; warmup = 0; thin = 1;
total post-warmup draws = 1000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma_y_Intercept 0.22 0.10 0.10 0.32 3.05 2 11
slopelatent_Intercept 0.02 0.02 0.00 0.03 2.65 2 13
y_offset_Intercept -0.74 0.04 -0.78 -0.70 2.61 2 14
y_scale_Intercept 2.43 0.13 2.30 2.56 2.46 2 11
y_shift_Intercept 1.33 0.30 1.03 1.63 2.24 3 11
sigma_y_t:treatmentleft 2.45 0.86 1.59 3.31 2.93 2 11
sigma_y_t:treatmentneutral 1.68 0.97 0.71 2.67 3.03 2 11
sigma_y_t:treatmentright 3.24 2.40 0.84 5.65 3.05 2 11
y_slope_mislopelatent 1.00 0.00 1.00 1.00 NA NA NA
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma_slopelatent 0.00 0.00 0.00 0.00 NA NA NA
In my mind, the two models should behave similarly, since:
- the relationship between slopelatent and slope is fixed (b=1)
- The intercept for slopelatent has the same prior as slope does in the first model
- The scale of the slopelatent distribution is fixed to a very low number
- all values of slopelatent are inferred
Obviously, the resulting stan code looks quite different. Is there an obvious reason why the second model formulation is so much more challenging for Stan to sample?
Thanks,
Jan