Using the same linear predictor for multiple responses in BRMS


#1

I am wondering how best to code using the same linear predictor in multiple responses, in BRMS. As a simple example, imagine we are counting alive birds and dead birds 🐦. The mean total number of birds depends on some covariates:
\mu = \sum_i \beta_i x_i

The number of alive birds is:
b_a \sim \mathrm{Poisson}(p_a \mu)

The number of dead birds is:
b_d \sim \mathrm{Poisson}((1-p_a) \mu)

I have read about coding multiple responses in BRMS but am looking for an example of sharing the same linear predictor between the multiple responses in this way. Any pointers welcome 🙂


#2

I think calling two likelihoods would work:

vector[N] mu = X * beta; // beta as vector and X as matrix

b_a ~ poisson(p_a * mu);
b_d ~ poisson((1-p_a) * mu);

Explicitly, I think what would happen is this:

\mu = \bf{X} \beta (Using matrix notation, so \mu is a vector, with entries i to N)

p(b_a | \lambda=p_a \mu) = \frac{(p_a \mu)^{b_a} e^{-p_a \mu}}{b_a !}

p(b_d | \lambda=(1-p_a) \mu) = \frac{((1-p_a) \mu)^{b_d} e^{-(1-p_a) \mu}}{b_a !}

Bringing over to log probability, which is what Stan works with:

log~p(b_a) = b_a log(p_a \mu) - p_a \mu - log(b_a !)

log~p(b_d) = b_d log((1-p_a) \mu) - (1-p_a) \mu - log(b_d !)

Adding the two together and dropping additive constants that don’t depend on parameters, we have:

log~p(b_a, b_d) = log[p_a^{b_a} (1-p_a)^{b_d} \mu ^{b_a+b_d}] - \mu

So explicitly I think you could call the joint likelihood as:

target += log(p_a^b_a * (1-p_a)^b_d * mu^(b_a + b_d)) - mu;

And it might help to pre-calculate some of this stuff


#3

… Thanks, have edited my question to make it clear that I am trying to figure out if this is possible within BRMS. As you point out, this is straightforward to code up in Stan, but my actual problem is quite a bit more complex. If possible, I would prefer to stay within BRMS, so I can take advantage of all the other great features of BRMS.


#4

Sorry, my reply got sent prematurely.


#5

Okay, so judging from this vignette on modelling multiple response variables and this vignette on specifying a nonlinear model it seems like this should be straightforward:

prior1 <- prior(normal(0, 1), nlpar = "b1") + prior(normal(0, 1), nlpar = "b2") + ...

ba_bf <- bf(b_a ~ (b1*predictor_1 + ... + bk*predictor_k) * p_a)
bd_bf <- bf(b_d ~ (b1*predictor_1 + ... + bk*predictor_k) * (1-p_a))

fit <- brm(
ba_bf + bd_bf,
  data = bird_data, chains = 2, cores = 2, family = poisson(), nl = TRUE, , prior = prior1
)

I think the trick might be to use nl = TRUE to specify some of the shared structure with p_a and (1-p_a). Might want to look a little more closely about how to tell brm() which variables are data and which are parameters


#6

Just for clarity, here is complete example Stan code:

data {
        int<lower=1> N;  // total number of observations
        int alive[N];  // alive birds
        int dead[N];  // dead birds
        vector[N] x;  // environmental variable
}

parameters {
        real beta0;
        real beta1;
        real<lower=0, upper=1> p; 
}

transformed parameters {
        vector[N] mu;
        mu = exp(beta0 + beta1*x);
}

model {
        beta0 ~ normal(0, 3);
        beta1 ~ normal(0, 3);
        p ~ uniform(0, 1);

        alive ~ poisson(p * mu);
        dead ~ poisson((1 - p) * mu);
}


If this is saved in multiple-responses.stan, then it works with the following rstan code:

library(rstan)
set.seed(2546)
N = 100 
x=rnorm(N, 0, 1)
bird_data = list(x=x,
        N=N,
        alive=rpois(N, lambda=0.1*exp(x)),
        dead=rpois(N, lambda=0.2*exp(x))
)

fit <- stan(file='multiple-responses.stan', data=bird_data)

I tried @ScottAlder’s suggestion, but it threw errors, and I haven’t yet figured out how to make a model like this within BRMS.


#7

Sorry! I haven’t quite been able to pick my way through the non-linear and multiple response documentation to get a solution that works. The first example I have found where reasoning in Stan has been easy, but reasoning in BRMS, not so much!


#8

Sorry for jumping in so late. brms does currently not allow for shared predictor terms in a multivariate because of some decision choices I regret making initially. This will be changed with brms 3.0, though.

Since both of your models are poisson, I still have a solution for you. Write the data in long format, that is both responses b_a and b_d under each other as the same variable. Let’s call it b. Define a variable z that indicates if the observations belong to b_a and b_d. The full specification then looks as follows.

library(brms)

df <- data.frame(b = 1:100, predictor_1 = rnorm(100), z = rep(0:1, each = 50))

form <- bf(
  b ~ eta * p^z * (1 - p)^(1-z), 
  eta ~ predictor_1 + ..., 
  p ~ 1,
  nl = TRUE
)

bprior <- prior(normal(0, 5), nlpar = eta) +
  prior(normal(0, 5), nlpar = p, lb = 0, ub = 1)
make_stancode(form, df, poisson(), prior = bprior)

#9

Thanks for clarifying 👍 and for the example. Look forward to BRMS 3.0 ♥️