- Operating System: macOS
- brms Version: 2.14.4
In the new Gelman et al. (2020) text, we see the following simulation on pp. 159–160.
# define the sample size
N <- 100
# define the number of continuous variables
K <- 10
# make an array of K continuous variables
X <- array(runif(N * K, 0, 1), c(N, K))
# make an additional dummy predictor
z <- sample(c(0, 1), N, replace = TRUE)
# set the data-generating values
a <- 1
b <- 1:K
theta <- 5
sigma <- 2
# simulate the outcome
y <- a + X %*% b + theta * z + rnorm(N, 0, sigma)
# save the results in a data frame
fake <- data.frame(
  X = X,
  y = y,
  z = z)
They then fit the full model with rstanarm::stan_glm() like so.
fit1 <- rstanarm::stan_glm(y ~ X + z, data = fake)
The resulting summary shows a model fit with an intercept, a z slope, and 10 additional slopes for X1 through X10.
I’m not sure I have the proper language for this, so I’ll apologize up front. But is it possible to use this array-oriented syntax in a formula argument within brm()? My initial naive attempt failed.
# this fails
fit2 <- brms::brm(y ~ X + z, data = fake)
Instead, I get the following error message: “Error: The following variables are missing in ‘data’: ‘X’”
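As far as I can tell, the error comes from how data.frame() handles matrices: it splits a matrix argument into one numeric column per matrix column, named X.1, X.2, and so on, so no single variable named X exists for the formula to find. A minimal self-contained illustration:

```r
# data.frame() splits a matrix into one column per matrix
# column, named X.1, X.2, ..., so the variable "X" is lost
m <- matrix(1:4, nrow = 2)
d <- data.frame(X = m, y = c(10, 20))
names(d)
#> [1] "X.1" "X.2" "y"
```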
Add X to the data as a matrix column and you should be good to go.
Huh. Thanks @paul.buerkner; it works! For those interested, here’s the updated code:
# first make the data frame without the X matrix
fake <- data.frame(
  y = y,
  z = z)
# then add in the X matrix by simple assignment
fake$X <- X
You can then check the data structure.
str(fake)
This returns:
'data.frame': 100 obs. of 3 variables:
$ y: num 34 44.2 39 29.5 40 ...
$ z: num 0 1 1 0 1 1 1 0 0 0 ...
$ X: num [1:100, 1:10] 0.705 0.56 0.992 0.213 0.254 ...
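As an aside, base R's I() should let you do this in a single data.frame() call: wrapping the matrix in I() is documented to protect it from being split into separate columns. A sketch using the X, y, and z objects simulated above (note the resulting column carries class "AsIs"; I haven't checked whether brm() minds that, so the two-step assignment above may be the safer route):

```r
# I() protects the matrix, so data.frame() keeps it as a
# single matrix column instead of splitting it into X.1 ... X.10
fake <- data.frame(
  X = I(X),
  y = y,
  z = z)
str(fake$X)
```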
It turns out you can do this in one step with tibbles, like so:
library(tibble)

fake <- tibble(
  X = X,
  y = y,
  z = z)
Anyway, now the brms code yields the intended solution:
fit3 <- brms::brm(y ~ X + z, data = fake)
With the following summary:
Family: gaussian
Links: mu = identity; sigma = identity
Formula: y ~ X + z
Data: fake (Number of observations: 100)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup samples = 4000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 2.31 1.21 -0.09 4.69 1.00 7661 3260
X1 1.07 0.75 -0.41 2.52 1.00 6643 3322
X2 1.60 0.68 0.27 2.97 1.00 7937 3257
X3 1.46 0.88 -0.25 3.19 1.00 7336 3092
X4 3.22 0.71 1.83 4.62 1.00 6274 3012
X5 5.17 0.69 3.82 6.53 1.00 8376 2685
X6 7.24 0.70 5.85 8.58 1.00 7161 3138
X7 6.09 0.72 4.64 7.53 1.00 7521 3255
X8 7.88 0.78 6.33 9.38 1.00 8062 3187
X9 8.34 0.69 6.97 9.71 1.00 7447 3151
X10 10.75 0.73 9.31 12.20 1.00 6992 3321
z 4.67 0.42 3.85 5.49 1.00 8137 3220
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 1.96 0.15 1.70 2.28 1.00 5989 3049
Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Very cool. I assume you can do this with a hierarchical specification, too? E.g.,
brms::brm(y ~ X + (1 + X | group), data = data)
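For concreteness, here is an untested sketch of what that might look like with the fake data from above. The group variable is hypothetical (it is not part of the original simulation), and I haven't run this fit, so treat it as a starting point rather than a confirmed answer:

```r
# hypothetical grouping factor, added only for illustration
fake$group <- sample(LETTERS[1:5], N, replace = TRUE)

# untested sketch: the matrix column X used on both the
# population-level and group-level sides of the formula
fit4 <- brms::brm(y ~ X + z + (1 + X | group), data = fake)
```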