I’m trying to estimate a model where outcomes are nested within another variable. Specifically, this is a sports model, where the score for each team is nested within the game. So the data looks like this:
#> # A tibble: 2 x 7
#>    game score points hc_offense hc_defense offense defense
#>   <int> <int>  <int>      <int>      <int> <chr>   <chr>
#> 1     1     1    114          1          0 Team A  Team B
#> 2     1     2     77          0          1 Team B  Team A
So the first row is the points scored by the home team (Team A, with Team B on defense), and the second row is the points scored by the away team (Team B, with Team A on defense).
I can estimate this model with nlme as such:
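library(nlme)
# games_long is a placeholder name for the long-format data shown above
model <- gls(points ~ hc_offense + hc_defense + offense + defense,
             data = games_long,
             correlation = corSymm(form = ~ 1 | game),
             weights = varIdent(form = ~ 1 | score))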
This allows the correlation between the two scores within a game to be estimated. My understanding is that brms can estimate the correlation between outcomes through a multivariate model. However, this requires restructuring the data so that each game occupies one row instead of two. That would let me specify a multivariate model such as:
library(brms)
bf_home <- bf(home_points ~ hc_offense + offense + defense)
bf_away <- bf(away_points ~ hc_defense + offense + defense)
# games_wide is a placeholder name for the one-row-per-game data
model <- brm(bf_home + bf_away, data = games_wide)
The problem with this is that I can’t define more than one offense/defense for the row:
#> # A tibble: 1 x 7
#>    game home_points away_points hc_offense hc_defense offense defense
#>   <int>       <int>       <int>      <int>      <int> <chr>   <chr>
#> 1     1         114          77          1          1 Team A  Team B
Team A should be the offense for bf_home, but the defense for bf_away. I could spread the data out further:
#> # A tibble: 1 x 9
#>    game home_points away_points hc_offense hc_defense home_pt_offense home_pt_defense away_pt_offense away_pt_defense
#>   <int>       <int>       <int>      <int>      <int> <chr>           <chr>           <chr>           <chr>
#> 1     1         114          77          1          1 Team A          Team B          Team B          Team A
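For reference, here is a rough sketch of how this wider layout could be built from the long data in base R. The names games_long and games_wide are placeholders introduced only for this illustration, and the code assumes that score == 1 is always the home row and score == 2 the away row, as in the example above.

```r
# Long-format example data from above (one row per team per game).
games_long <- data.frame(
  game       = c(1L, 1L),
  score      = c(1L, 2L),
  points     = c(114L, 77L),
  hc_offense = c(1L, 0L),
  hc_defense = c(0L, 1L),
  offense    = c("Team A", "Team B"),
  defense    = c("Team B", "Team A"),
  stringsAsFactors = FALSE
)

# Home rows (assumed to be score == 1), renamed to home-specific columns.
home <- games_long[games_long$score == 1,
                   c("game", "points", "hc_offense", "offense", "defense")]
names(home) <- c("game", "home_points", "hc_offense",
                 "home_pt_offense", "home_pt_defense")

# Away rows (assumed to be score == 2), renamed to away-specific columns.
away <- games_long[games_long$score == 2,
                   c("game", "points", "hc_defense", "offense", "defense")]
names(away) <- c("game", "away_points", "hc_defense",
                 "away_pt_offense", "away_pt_defense")

# One row per game, with columns ordered as in the printout above.
games_wide <- merge(home, away, by = "game")
games_wide <- games_wide[, c("game", "home_points", "away_points",
                             "hc_offense", "hc_defense",
                             "home_pt_offense", "home_pt_defense",
                             "away_pt_offense", "away_pt_defense")]
print(games_wide)
```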
This wider layout would allow me to specify the offense and defense for each score:
bf_home <- bf(home_points ~ hc_offense + home_pt_offense + home_pt_defense)
bf_away <- bf(away_points ~ hc_defense + away_pt_offense + away_pt_defense)
# games_wide is a placeholder name for the one-row-per-game data
model <- brm(bf_home + bf_away, data = games_wide)
However, I then end up with two offensive coefficients for each team: one when they are listed as home_pt_offense and one when they are listed as away_pt_offense (the same applies to the defensive coefficients).
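To make the duplication concrete, here is a small base R sketch using model.matrix() on made-up toy data (three hypothetical games among three teams, not real results). It only illustrates ordinary dummy coding; I am assuming brms builds its population-level design matrix along the same lines, though its internals may differ.

```r
# Toy data, invented purely for illustration: who is on offense in each game.
toy_wide <- data.frame(
  home_pt_offense = c("Team A", "Team B", "Team C"),
  away_pt_offense = c("Team B", "Team C", "Team A"),
  stringsAsFactors = FALSE
)

# Dummy-coded columns that each outcome's formula would generate.
home_cols <- colnames(model.matrix(~ home_pt_offense, data = toy_wide))
away_cols <- colnames(model.matrix(~ away_pt_offense, data = toy_wide))

print(home_cols)
print(away_cols)

# Apart from the intercept, no column is shared between the two formulas,
# so each team gets a separate offensive effect per outcome.
print(intersect(home_cols[-1], away_cols[-1]))
```

Because the two predictors are different variables, the same team appears under two unrelated column names, which is why its offensive strength is estimated twice.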
I am probably missing something obvious, but what is the best way to define this type of model in brms? So far I have been unable to find a way to estimate this as a multivariate model while also keeping the predictors defined correctly, that is, with a single offensive and a single defensive coefficient per team.
I’m not sure this is relevant for this particular question, but just in case:
- Operating System: macOS Mojave (10.14.3)
- brms Version: 2.6.0