Hi all, I have a general question:
I have a non linear model y=a*x^b and a categorical variable ID. Is it possible in brms to estimate the model coefficients a and b for each categorical variable ID?
I was looking at the example of the non linear model, is this a correct form?
Thanks A lot dylanjm.
Running ranef It is a bit confusing for me the fact that the coefficients are called intercepts. Is it assumed a given slope for my a and b coefficients? and if yes, how I can retrive it?
I think your current model only specifies random intercepts and not random slopes. So each ID should have its own intercept but they should all share the Population-Level coefficients. At least that’s my understanding.
Also "a" in this case represents your model intercept, so intercept is the estimate for a for each ID.
Have you considered fitting the log-transformed model (i.e., \log y \sim \log a + b * \log x)? It would be a lot easier to parameterize with group-level effects for a and b (log_y ~ (1 + log_x | ID)), you could back-transform the intercept into a after fitting, and it would probably be more efficient. I guess this would require all y to be positive and need a different likelihood distribution.
Thanks a lot Christopher, unfortunately my non linear model is slightly more complex than the example I made. Probably this solution is not working for my case. But thanks a lot,
Thanks again dylanjm. Are the value of ‘a’ and ‘b’ that i retrive from ranef the values I can put in my formula and I can run the model with?I.e for id=1 i have y=a1x^b1, for id=2 y=a2x^b2
, etc…?
Yes, in theory that should be the case. But like I said, if you only have a varying intercept model then you will have a1, a2, a3, … aN, but they will all share the same b value for the slope.
Thanks a lot dylanjm. In the case of my model y=a*x^b a is a non linear parameter so, I was thinking it is not an intercept. Could you please indicate that you also means for slope? in your answer? because b for my model is also a non linear parameter.
I suppose in the formal sense they are not necessarily intercept and slope coefficients. Here is a quick simulation of a non-linear model without using a multi-level structure.
library(tidyverse)
library(brms)
N <- 1200
x <- runif(N, 1, 20)
a <- 50
b <- -1.3
y_true <- a * x^b
sigma <- 3
y <- rnorm(N, y_true, sigma)
dat <- tibble(x, y)
fit <- brm(bf(y ~ a*x^b, nl = TRUE,
a ~ 1, b ~ 1),
data = dat, family = gaussian(),
prior = c(prior(normal(40, 10), nlpar = "a"),
prior(normal(-2, 1), nlpar = "b")),
cores = 24)
summary(fit)
#> Family: gaussian
#> Links: mu = identity; sigma = identity
#> Formula: y ~ a * x^b
#> a ~ 1
#> b ~ 1
#> Data: dat (Number of observations: 1200)
#> Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
#> total post-warmup samples = 4000
#>
#> Population-Level Effects:
#> Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
#> a_Intercept 50.56 0.66 49.24 51.83 2169 1.00
#> b_Intercept -1.33 0.02 -1.37 -1.30 2231 1.00
#>
#> Family Specific Parameters:
#> Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
#> sigma 3.01 0.06 2.90 3.13 3038 1.00
#>
#> Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
#> is a crude measure of effective sample size, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).
plot(marginal_effects(fit), points = TRUE)
I think I understand better your question and the confusion of intercept in the model output. The population-level effects show an a_Intercept and b_Intercept those are just the estimates of your coefficients in your non-linear model.
In this case a is the “intercept” at the point the data begins and b is the curving coefficient. I believe that brms labels them as x_Intercept because you are essentially fitting two intercept only models on each separate non-linear coefficient (hence a ~ 1 and b ~ 1 in the formula statement). I could be wrong on this though.
In the case of applying a random-intercept model, you will get many different values for a_Intercept but they will all share the same b_Intercept value. This is my vague understanding as I don’t have much experience in multi-level modeling.
Thanks dylanjm it was very very useful for me and I hope also for the other users. quoting your text: " In the case of applying a random-intercept model, you will get many different values for a_Intercept but they will all share the same b_Intercept value. "
Is this true if my prior is a ~ 1+(1|ID), b ~ 1 whereas is not true if my prior is a ~ 1+(1|ID), b ~ 1+(1|ID) because in the latter case also b is depending on the class ID
Am I right?
Dear All, thanks a lot.
I have tried (with no success) to extend the toy-model of the example to the case of multi-level model. Can you please have a look to it? Thanks a lot and best regards
x1 <- runif(N, 1, 20)
a1 <- 50
b1 <- -1.3
sigma1 <- 3
id1=1
y_true1 <- a1 * x1^b1
y1 <- rnorm(N, y_true1, sigma1)
x2 <- runif(N, 1, 20)
a2 <- 20
b2 <- -0.3
id2=2
y_true2 <- a2 * x2^b2
sigma2=0.6
y2 <- rnorm(N, y_true2, sigma2)
x=c(x1,x2)
y=c(y1,y2)
ID=c(rep(id1,N),rep(id2,N))
dat <- tibble(x, y,ID)
fit <- brm(bf(y ~ a*x^b, nl = TRUE,
a ~ 1+(1|ID), b ~ 1+(1|ID)),
data = dat, family = gaussian(),
prior = c(prior(normal(40, 10), nlpar = "a"),
prior(normal(-2, 1), nlpar = "b")))
summary(fit)
Family: gaussian
Links: mu = identity; sigma = identity
Formula: y ~ a * x^b
a ~ 1 + (1 | ID)
b ~ 1 + (1 | ID)
Data: dat (Number of observations: 2400)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup samples = 4000
Group-Level Effects:
~ID (Number of levels: 2)
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(a_Intercept) 16.94 7.15 8.01 35.22 604 1.01
sd(b_Intercept) 1.70 1.16 0.35 4.65 313 1.03
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
a_Intercept 37.22 6.91 23.93 51.19 502 1.00
b_Intercept -1.35 0.72 -3.03 -0.12 297 1.02
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma 2.15 0.03 2.09 2.21 880 1.00
Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
is a crude measure of effective sample size, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Warning message:
There were 74 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help.
See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
Hi Paul, and thanks again for your help.
I was expecting to have something around a=50 and b =-1.3 for the ID1 and a=20 and b=-0.3 for the ID2.
At this point I think is me not able yet to read the model output in the correct way, can you help me in that?
The output from summary just gives the population-level effects on the estimates, which are ~37 for a and ~-1.3 for b. However, what I think you’re looking for is the combined effect of the population-level effects and random effects due to grouping by ID. To see these results, do the following:
You can see here the values of a at ~50 and b at ~-1.3 for ID1 and a at ~20 and b at ~-0.3 for ID2. If you’re interested in just the random effects without the population estimates, replace coef with ranef above.