I used brms to implement a 2-level hierarchical logistic regression model.
I would like to describe the model details in my paper:
- Please confirm if the model specification and math description match
- What is the group-level design matrix Z? An example would make it clear.
The model specification is as follows:
fit <- brm(data = data,
           family = bernoulli,
           formula = y ~ 1 + X1 + X2 + (1 + X1 + X2 || group),
           prior = c(prior(normal(0, 5), class = Intercept),
                     prior(cauchy(0, 5), class = sd),
                     prior(normal(0, 5), class = b))
)
The model (math) description that I have so far:
\\ 2-level hierarchical logistic regression
\\ i denotes the sample index
\\ y is the binary outcome
\\ group(i) denotes the group to which the sample i belongs
\\ X[i] is the design matrix for sample i
\\ ??? Z[group(i)] is the group-level design matrix ???
y[i] ~ Bernoulli(p[i])
p[i] = inverseLogit(beta0 + beta*X[i] +
u0[group(i)] + u[group(i)]*Z[group(i)])
\\population-level intercept and coefficients
\\ k denotes the covariate index
beta0 ~ Normal(0,5)
beta[k] ~ Normal(0,5)
\\ group-level intercept and coefficients
\\ j denotes the group index, k denotes the covariate index
u0[j] ~ Normal(0, sd^2)
u[j][k] ~ Normal(0, sd^2)
\\ group-level standard deviation
sd ~ HalfCauchy(0, 5)
Your help is much appreciated.
- Operating System: Windows 10
- brms Version: 2.10.0
I studied this a bit more (mainly the recent paper by @paul.buerkner at https://arxiv.org/pdf/1905.09501.pdf) and noticed that the group-level design matrix Z consists of the group-varying variables. In the math description above, Z should therefore be indexed by the sample index i instead of the group index group(i), which becomes obvious when inspecting the model’s Stan code. Based on the brms model above, the X and Z matrices should then be identical, since both X1 and X2 are used in both the population-level and the group-level parts of the model. Could anyone please confirm that X and Z are identical in this case?
The revised math description of the model above is as follows:
\\ i denotes the sample index
\\ y is the binary outcome
\\ group(i) denotes the group to which the sample i belongs
\\ X[i] is the design matrix for sample i
\\ Z[i] is the group-level design matrix for sample i
y[i] ~ Bernoulli(p[i])
p[i] = inverseLogit(beta0 + beta*X[i] +
u0[group(i)] + u[group(i)]*Z[i])
\\population-level intercept and coefficients
\\ k denotes the covariate index
beta0 ~ Normal(0,5)
beta[k] ~ Normal(0,5)
\\ group-level intercept and coefficients
\\ j denotes the group index, k denotes the covariate index
u0[j] ~ Normal(0, sd^2)
u[j][k] ~ Normal(0, sd^2)
\\ group-level standard deviation
sd ~ HalfCauchy(0, 5)
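For the paper, the revised description could be typeset roughly as follows. This is a sketch under stated assumptions, not a confirmed specification: g(i) is shorthand for group(i); the second argument of Normal is a standard deviation, following the Stan convention used by the priors in the brm call; and because the formula uses `||`, each group-level term is assumed to have its own standard deviation (written here as sigma_0, sigma_1, sigma_2, symbols not used above) with no correlations between terms, rather than one shared sd.

```latex
\begin{align*}
y_i &\sim \mathrm{Bernoulli}(p_i) \\
\mathrm{logit}(p_i) &= \beta_0 + \sum_{k=1}^{2} \beta_k X_{ik}
                      + u_{0,g(i)} + \sum_{k=1}^{2} u_{k,g(i)} Z_{ik} \\
\beta_0 &\sim \mathrm{Normal}(0, 5), \qquad \beta_k \sim \mathrm{Normal}(0, 5) \\
u_{0,j} &\sim \mathrm{Normal}(0, \sigma_0), \qquad u_{k,j} \sim \mathrm{Normal}(0, \sigma_k) \\
\sigma_0, \sigma_1, \sigma_2 &\sim \mathrm{HalfCauchy}(0, 5)
\end{align*}
```

In this notation Z_{ik} = X_{ik}, because the same covariates X1 and X2 appear in both the population-level and the group-level parts of the formula.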
Thank you,
Amin
Z has a higher dimension, since it contains a column for every combination of group and coefficient, not just one column per coefficient. In other words, Z is much bigger than X, but sparse. However, I think this sparse representation of Z is not necessarily helpful for understanding what is happening.
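The two views of Z can be compared on a toy example. The sketch below uses base R only and does not call brms; the data values, the coefficient matrix `u`, and the names `Z_obs`, `Z_big`, and `G` are made up for illustration. It builds the per-observation matrix (one row per sample, the same columns as X plus the intercept) and the stacked sparse matrix (one block of columns per group), and checks that both give the same group-level contribution to the linear predictor.

```r
# Toy data: 6 observations in 3 groups (all values are made up for illustration).
d <- data.frame(
  group = factor(c("a", "a", "b", "b", "c", "c")),
  X1    = c(0.5, -1.0, 2.0, 0.0, 1.5, -0.5),
  X2    = c(1, 0, 1, 1, 0, 0)
)

# Per-observation view: row i holds the intercept, X1 and X2 of observation i.
# Here it has the same columns as the population-level design matrix.
Z_obs <- model.matrix(~ 1 + X1 + X2, data = d)   # 6 x 3

# Stacked view: one block of 3 columns per group. Rows that do not belong
# to a group are zero in that group's block, which makes the matrix sparse.
G <- model.matrix(~ 0 + group, data = d)         # 6 x 3 group indicators
Z_big <- do.call(cbind, lapply(seq_len(ncol(G)), function(j) Z_obs * G[, j]))

# Hypothetical group-level coefficients: one row per group, columns (u0, u1, u2).
u <- matrix(c( 0.2, -0.1,  0.3,
              -0.4,  0.5,  0.0,
               0.1,  0.1, -0.2), nrow = 3, byrow = TRUE)

# Group-level part of the linear predictor, computed both ways.
eta_obs <- as.vector(rowSums(Z_obs * u[as.integer(d$group), ]))
eta_big <- as.vector(Z_big %*% as.vector(t(u)))

dim(Z_obs)
dim(Z_big)
all.equal(eta_obs, eta_big)
```

With J groups and K group-varying terms (including the intercept), the per-observation matrix has K columns and the stacked matrix has J * K columns. The per-observation form matches the indexing u[group(i)] * Z[i] used in the question.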