Distribution regression for non-linear parameters

So, after discovering this thread (Mi() with "non-linear" model?), I decided to continue down the route of using a latent variable to place a distribution over the slope.

However, I am running into some strange issues. As a first step, I wanted to re-create my simple model, but with a latent variable for the slope. For reference, my original model, priors, and fit procedure are:

formula.test.1 <- (
  bf(y ~ shift + scale * inv_logit(slope * (t - offset))
     , sigma ~ 1 + t:treatment
     , offset ~ 1
     , scale ~ 1
     , shift ~ 1
     , slope ~ 1
     , family = brmsfamily("gaussian", link='identity', link_sigma="identity")
     , nl=TRUE
  )
)

prior.test.1 <- c(
    # priors for the sigmoid curve:
    prior(normal(0.5, 0.2), nlpar='offset')
    , prior(normal(1.5, 1), nlpar='scale', lb=0)
    , prior(normal(0, 0.2), nlpar='shift')
    , prior(normal(3, 2), nlpar='slope', lb=0)
    # priors for parameters for sigma
    , prior(normal(0, 2), dpar="sigma", lb=0)
    , prior(lognormal(-3, 1.5), dpar="sigma", class="Intercept")
)

fit.test.1 <- brm(
    data = df.test, formula = formula.test.1, prior = prior.test.1,
    # note: short chains for quick iteration
    iter = 1000, warmup = 500, chains = 2, cores = 2
  )

This produces the following promising results:

 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: y ~ shift + scale * inv_logit(slope * (t - offset)) 
         sigma ~ 1 + t:treatment
         offset ~ 1
         scale ~ 1
         shift ~ 1
         slope ~ 1
   Data: df.test (Number of observations: 780) 
  Draws: 2 chains, each with iter = 500; warmup = 0; thin = 1;
         total post-warmup draws = 1000

Population-Level Effects: 
                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma_Intercept              0.01      0.00     0.01     0.01 1.01      560      622
offset_Intercept             0.53      0.01     0.52     0.54 1.01      311      416
scale_Intercept              1.05      0.02     1.02     1.10 1.01      253      303
shift_Intercept             -0.06      0.01    -0.07    -0.05 1.01      330      504
slope_Intercept              5.44      0.15     5.13     5.72 1.01      263      331
sigma_t:treatmentleft        0.08      0.01     0.07     0.09 1.00      588      579
sigma_t:treatmentneutral     0.08      0.01     0.07     0.09 1.01      533      460
sigma_t:treatmentright       0.08      0.01     0.07     0.09 1.00      555      592
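
My actual df.test isn't shown here, but for anyone wanting to reproduce the setup, data of the same shape can be simulated roughly as follows (the generating values are illustrative guesses read off the estimates above, not my original script):

library(dplyr)

# Stand-in for df.test: 780 rows, a sigmoid in t, with sigma growing in t
# (the sigma t:treatment coefficients above are all ~0.08, so the
# treatments are treated identically here)
set.seed(1)
df.sim <- expand.grid(
  t = seq(0, 1, length.out = 260),
  treatment = c("left", "neutral", "right")
) %>%
  mutate(
    mu = -0.06 + 1.05 * plogis(5.44 * (t - 0.53)),
    y  = rnorm(n(), mean = mu, sd = 0.01 + 0.08 * t)
  )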

Now, conceptually at least, shouldn't the following be equivalent?

formula.test.2 <- (
  bf(y ~ shift + scale * inv_logit(slope * (t - offset))
     , sigma ~ 1 + t:treatment
     , offset ~ 1
     , scale ~ 1
     , shift ~ 1
     , slope ~ 0 + mi(slopelatent)
     , family = brmsfamily("gaussian", link='identity', link_sigma="identity")
     , nl=TRUE
  )
  + bf(slopelatent | mi() ~ 1
       , family = brmsfamily("gaussian", link="identity", link_sigma="identity"))
  + set_rescor(FALSE)
)

prior.test.2 <- c(
    # priors for the sigmoid curve:
    prior(normal(0.5, 0.2), nlpar='offset', resp='y')
    , prior(normal(1.5, 1), nlpar='scale', resp='y', lb=0)
    , prior(normal(0, 0.2), nlpar='shift', resp='y')
    # fix b for slope to 1 as the latent parameter will be modelled instead:
    , prior(constant(1), nlpar='slope', resp='y', lb=0)
    # priors for parameters for sigma
    , prior(normal(0, 2), dpar="sigma", resp="y", lb=0)
    , prior(lognormal(-3, 1.5), dpar="sigma", resp="y", class="Intercept")
    # priors for slope parameter distribution
    , prior(normal(3, 2), resp="slopelatent", class="Intercept")
    # fix the sigma on this distribution to something very low, to mimic fit.test.1
    , prior(constant(0.00001), resp="slopelatent", class="sigma")
)
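
Before fitting, I find it useful to sanity-check that the prior names (class / nlpar / resp) line up with what brms expects for the combined formula (output omitted):

# verify that the classes/nlpars/resps above match what brms expects
get_prior(formula.test.2, data = df.test %>% mutate(slopelatent = as.numeric(NA)))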

fit.test.2 <- brm(
    data = df.test %>% mutate(slopelatent=as.numeric(NA)),
    formula = formula.test.2, prior = prior.test.2,
    iter = 1000, warmup = 500, chains = 2, cores = 2
  )

However, the resulting fit fails completely:

 Family: MV(gaussian, gaussian) 
  Links: mu = identity; sigma = identity
         mu = identity; sigma = identity 
Formula: y ~ shift + scale * inv_logit(slope * (t - offset)) 
         sigma ~ 1 + t:treatment
         offset ~ 1
         scale ~ 1
         shift ~ 1
         slope ~ 0 + mi(slopelatent)
         slopelatent | mi() ~ 1 
   Data: df.test (Number of observations: 780) 
  Draws: 2 chains, each with iter = 500; warmup = 0; thin = 1;
         total post-warmup draws = 1000

Population-Level Effects: 
                           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma_y_Intercept              0.22      0.10     0.10     0.32 3.05        2       11
slopelatent_Intercept          0.02      0.02     0.00     0.03 2.65        2       13
y_offset_Intercept            -0.74      0.04    -0.78    -0.70 2.61        2       14
y_scale_Intercept              2.43      0.13     2.30     2.56 2.46        2       11
y_shift_Intercept              1.33      0.30     1.03     1.63 2.24        3       11
sigma_y_t:treatmentleft        2.45      0.86     1.59     3.31 2.93        2       11
sigma_y_t:treatmentneutral     1.68      0.97     0.71     2.67 3.03        2       11
sigma_y_t:treatmentright       3.24      2.40     0.84     5.65 3.05        2       11
y_slope_mislopelatent          1.00      0.00     1.00     1.00   NA       NA       NA

Family Specific Parameters: 
                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma_slopelatent     0.00      0.00     0.00     0.00   NA       NA       NA
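
The Rhat and ESS values already give it away, but the chain behaviour can also be inspected directly (a quick check, output omitted):

# inspect the failing fit: trace plots and divergence counts
plot(fit.test.2)                       # trace plots per parameter
np <- nuts_params(fit.test.2)          # sampler diagnostics
sum(subset(np, Parameter == "divergent__")$Value)  # number of divergent transitions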

In my mind, the two models should behave similarly (a direct check is sketched after this list), since:

  • the relationship between slopelatent and slope is fixed (b = 1)
  • the intercept for slopelatent has the same prior as slope has in the first model
  • the scale of the slopelatent distribution is fixed to a very small value
  • all values of slopelatent are inferred
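
If the two parameterisations really were equivalent, the posterior for the slopelatent intercept should line up with slope_Intercept from the first fit. A quick check would be (parameter names taken from the summaries above):

# compare the implied slope posteriors from the two fits
post1 <- as_draws_df(fit.test.1)
post2 <- as_draws_df(fit.test.2)
summary(post1$b_slope_Intercept)
summary(post2$b_slopelatent_Intercept)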

Naturally, the resulting Stan code looks quite different (see below for how to extract it). Is there an obvious reason why the second formulation is so much more challenging for Stan to sample?
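
For reference, both generated programs can be pulled out for a side-by-side look (make_stancode() on the unfitted formulas works equally well):

# extract the generated Stan programs for comparison
stancode(fit.test.1)
stancode(fit.test.2)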

Thanks,
Jan