# Multivariate Poisson

Hey!
I am having some issues estimating a multivariate Poisson model.
Suppose I have a matrix of outcomes Y, where each column corresponds to a different outcome. Similarly, I have a matrix of predictors X and a matrix of coefficients.
This is a “working” example in the sense that it compiles, but the results make no sense at all.
If I look at the traceplots, they stay flat throughout all the iterations, not exploring the posterior. ESS is really low, and transitions are divergent.
I would rather not model every outcome jointly, but I think I have to, since I plan to impose a Gaussian process prior on beta at a later stage.
Any suggestions are really appreciated :)

Here I leave the model:

```stan
data {
  int<lower=1> n_c;          // number of predictors per unit
  int<lower=1> n_t;          // number of predictors per unit
  int<lower=0> t0;           // number of observations per unit, pre-treatment times
  int<lower=0> tt;
  int<lower=0> y[t0, n_t];   // matrix of count observations
  matrix[tt, n_c] x_pred;    // matrix of predictors, pre-treatment times
  matrix[t0, n_c] x;         // matrix of predictors, pre-treatment times
  real h[n_t];               // vector of relative distances
  real area[n_t];            // offset for the total area
}
parameters {
  matrix[n_c, n_t] beta;     // matrix of regression coefficients
  vector[n_t] alpha;         // vector of intercept terms
  real<lower=0> rho_b;
  real<lower=0> alpha_b;
  real<lower=0> eta_b;
}

model {
  // Priors
  for (ii in 1:n_c)
    beta[ii,] ~ normal(0, 1);  // normal priors for the coefficients
  alpha ~ normal(0, 1);        // normal priors for the intercepts
  for (i in 1:t0) {
    y[i] ~ poisson_log(x[i] * beta);
  }
}
```

Hello!

Looking at the model, I am wondering if

```stan
y[i] ~ poisson_log(x[i] * beta);
```

should be

```stan
y[i] ~ poisson_log(alpha' + x[i] * beta);
```

What do you think?

Well, in the code I’ve posted there is no intercept. I have tried with and without it, and it does not solve the issue :)

I see.
What about normalizing x? Have you tried that? It would also be interesting to see the results of a prior predictive check!
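A prior predictive check can be done in Stan by dropping the likelihood and simulating outcomes from the priors alone. A minimal sketch, assuming the same data names as the model above (`y_sim` is a name I introduce here):

```stan
// Prior predictive check: priors only, no likelihood statement for y.
data {
  int<lower=1> n_c;
  int<lower=1> n_t;
  int<lower=0> t0;
  matrix[t0, n_c] x;
}
parameters {
  matrix[n_c, n_t] beta;
}
model {
  // same prior as in the model; y is deliberately not used here
  to_vector(beta) ~ normal(0, 1);
}
generated quantities {
  array[t0, n_t] int y_sim;
  for (i in 1:t0)
    y_sim[i] = poisson_log_rng(x[i] * beta);
}
```

Plotting the distribution of `y_sim` against the observed counts shows whether the priors put mass on plausible outcome values before any data enter the picture.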

There are a lot of issues in the code so let’s tackle them one by one.

The three parameters `rho_b`, `alpha_b`, and `eta_b` are in the parameters block, so they are being sampled, but you do not use them anywhere and they don’t influence the log likelihood. This will cause substantial problems: since the likelihood is constant with respect to them and they have no priors, the sampler wanders over a flat, improper posterior.
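If those hyperparameters are meant for the Gaussian process prior you plan to add later, they would eventually enter the model through a GP covariance over the distances `h`. A hypothetical sketch of that future stage (the structure here is my guess, not part of the current model; `rho_b` as length-scale, `alpha_b` as marginal SD, `eta_b` as a jitter/nugget scale):

```stan
// Inside the model block: GP prior over each row of beta, indexed by h.
matrix[n_t, n_t] K = gp_exp_quad_cov(h, alpha_b, rho_b)
                     + diag_matrix(rep_vector(square(eta_b), n_t));
matrix[n_t, n_t] L_K = cholesky_decompose(K);
for (ii in 1:n_c)
  beta[ii]' ~ multi_normal_cholesky(rep_vector(0, n_t), L_K);
// rho_b, alpha_b, eta_b would then also need their own priors.
```

Until something like this is in place, those three parameters should simply be removed.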

The `alpha` parameter is also not doing anything: you are sampling from its prior, but it is not connected to the model of the response in any way.
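If you do want a per-outcome intercept, one way to connect `alpha` to the likelihood would be (a sketch; the transpose turns the intercept vector into a row_vector so it can be added to `x[i] * beta`):

```stan
for (i in 1:t0) {
  // alpha' is a row_vector of per-outcome intercepts
  y[i] ~ poisson_log(alpha' + x[i] * beta);
}
```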

Next, there’s a bunch of variables in the data block that you don’t use, so let’s get rid of them to make the code easier to understand and debug.

There are also missing curly braces around the for loop for the prior on beta.

The `int y[t0, n_t]` syntax is no longer supported in recent versions of Stan, so we need to use the `array` syntax instead.

So now we are down to:

```stan
data {
  int<lower=1> n_c;       // number of predictors
  int<lower=1> n_t;       // number of dependent variables
  int<lower=0> t0;        // number of observations
  array[t0, n_t] int y;   // matrix of count observations
  matrix[t0, n_c] x;      // matrix of predictors
}
parameters {
  matrix[n_c, n_t] beta;  // matrix of regression coefficients
}
model {
  // Priors
  for (ii in 1:n_c) {
    beta[ii,] ~ normal(0, 1);  // normal priors for the coefficients
  }
  for (i in 1:t0) {
    y[i] ~ poisson_log(x[i] * beta);
  }
}
```

This now runs fine for me with a simulated dataset:

```r
library(cmdstanr)
mod <- cmdstan_model(write_stan_file(stancode))  # the code above is saved in a stancode variable

# simulate data
N <- 1000
data <- list(n_c = 3, n_t = 2, t0 = N)
data$x <- matrix(rnorm(N * 3), N, 3)
bs <- matrix(c(0.3, 0.6, 0.9, -0.3, -0.6, -0.9), nrow = 3)
data$y <- matrix(rpois(N * 2, exp(data$x %*% bs)), N, 2)

# estimate the model
fit <- mod$sample(data, parallel_chains = 4)
fit
```
```
  variable   mean median   sd  mad     q5    q95 rhat ess_bulk ess_tail
lp__      968.05 968.39 1.74 1.60 964.73 970.27 1.00     1827     2355
beta[1,1]   0.26   0.26 0.02 0.02   0.22   0.30 1.00     5630     3211
beta[2,1]   0.60   0.60 0.02 0.02   0.56   0.63 1.00     5270     3145
beta[3,1]   0.89   0.89 0.02 0.02   0.86   0.92 1.00     5656     2965
beta[1,2]  -0.24  -0.25 0.02 0.02  -0.28  -0.21 1.00     5197     3217
beta[2,2]  -0.62  -0.62 0.02 0.02  -0.66  -0.59 1.00     4817     2987
beta[3,2]  -0.91  -0.91 0.02 0.02  -0.94  -0.88 1.00     4711     3203
```

The code can be optimized further. You don’t have to loop over the observations; it is much easier to loop over the response variables, and that way you can combine the statements for beta’s prior and for y in a single, much shorter loop. We can also use the built-in function `poisson_log_glm()`, which is much more efficient for GLM-style linear predictors:

```stan
data {
  int<lower=1> n_c;       // number of predictors
  int<lower=1> n_t;       // number of dependent variables
  int<lower=0> t0;        // number of observations
  array[t0, n_t] int y;   // outcome
  matrix[t0, n_c] x;      // predictors
}
parameters {
  matrix[n_c, n_t] beta;  // regression coefficients
}
model {
  for (i in 1:n_t) {
    beta[,i] ~ normal(0, 1);
    y[,i] ~ poisson_log_glm(x, 0, beta[,i]);
  }
}
```
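If you later want the per-outcome intercepts back, `poisson_log_glm()` takes the intercept as its second argument. A sketch, assuming `vector[n_t] alpha;` is added to the parameters block:

```stan
for (i in 1:n_t) {
  alpha[i] ~ normal(0, 1);
  beta[,i] ~ normal(0, 1);
  y[,i] ~ poisson_log_glm(x, alpha[i], beta[,i]);
}
```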

This code runs in about 0.5 seconds, while the code above runs in ~4 seconds. The results are the same:

```
  variable   mean median   sd  mad     q5    q95 rhat ess_bulk ess_tail
lp__      968.08 968.43 1.77 1.65 964.70 970.30 1.00     1821     2648
beta[1,1]   0.26   0.26 0.02 0.02   0.22   0.30 1.00     5554     3246
beta[2,1]   0.60   0.60 0.02 0.02   0.56   0.63 1.00     4856     3550
beta[3,1]   0.89   0.89 0.02 0.02   0.86   0.92 1.00     5527     3194
beta[1,2]  -0.24  -0.24 0.02 0.02  -0.28  -0.21 1.00     4807     2968
beta[2,2]  -0.62  -0.62 0.02 0.02  -0.65  -0.58 1.00     5572     2894
beta[3,2]  -0.91  -0.91 0.02 0.02  -0.94  -0.88 1.00     5611     3100
```

Hope this helps. In general, try to start simple: don’t build complexity into your model until you can run a simple version and get what you expect. Add parameters and structure only after you have made sure the model works well in its simple form.
