Hello everyone. I am relatively new to Stan and have limited knowledge of it. I am trying to fit the following model:

Z | Y ~ Poisson(exp(Y))

Y = X*\beta + M*\eta

M is the propagator matrix obtained from the MI operator as defined in Hughes and Haran (2013), and \eta is the vector of random effects with mean zero and covariance matrix K.
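For concreteness, the generative process can be sketched numerically (a numpy simulation under assumed, illustrative dimensions; `X`, `M`, `K`, and `beta` below are arbitrary stand-ins, not values from the actual application):

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_cov, r = 100, 3, 5  # illustrative sizes only

X = rng.normal(size=(n_obs, n_cov))        # covariate matrix
M = rng.normal(size=(n_obs, r))            # stand-in for the propagator matrix
K = 0.5 * np.eye(r)                        # random-effect covariance
beta = 0.5 * rng.normal(size=n_cov)

eta = rng.multivariate_normal(np.zeros(r), K)  # eta ~ N(0, K)
Y = X @ beta + M @ eta                         # Y = X*beta + M*eta
z = rng.poisson(np.exp(Y))                     # Z | Y ~ Poisson(exp(Y))
```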

I wrote the Stan code as:

```
data {
  int<lower = 1> n_obs;
  int<lower = 1> n_cov;
  int<lower = 1> r;
  int<lower = 0> z[n_obs];
  matrix[n_obs, n_cov] x; // the covariate matrix
  matrix[r, r] K;
  matrix[n_obs, r] M;
  vector[r] V;
}
transformed data {
  vector[r] zeros;
  zeros = rep_vector(0, r);
}
parameters {
  vector[n_cov] beta;
  vector[r] v;
}
transformed parameters {
  vector[n_obs] eta;
  eta = M * v;
}
model {
  z ~ poisson_log(x * beta + eta);
  beta ~ normal(0, 10);
  v ~ multi_normal(zeros, K);
}
```

Now, in order to run this model, I was using a warmup of 2,000 iterations; then, for sampling, a burn-in of 20,000 and 80,000 iterations. The issue is that the code takes forever to run (upwards of 17 hours). Could anyone suggest any modifications to speed up the process? I have adjusted the maximum treedepth to 15 and higher, but the execution time has only increased, and the chains aren't even mixing.

Thank you.

Since K is data, you can speed this up by doing a Cholesky factorization in transformed data:

```
transformed data {
matrix[r, r] L = cholesky_decompose(K);
...
}
```

And then you could just make the code you have faster with:

```
v ~ multi_normal_cholesky(zeros, L);
```
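The identity this relies on is just L L^T = K for the lower-triangular Cholesky factor; a quick numpy check with an arbitrary positive-definite `K`:

```python
import numpy as np

rng = np.random.default_rng(1)
r = 4
A = rng.normal(size=(r, r))
K = A @ A.T + r * np.eye(r)  # an arbitrary positive-definite covariance

# numpy's counterpart of Stan's cholesky_decompose(K): lower-triangular L
L = np.linalg.cholesky(K)

lower_triangular = np.allclose(L, np.tril(L))
reconstructs_K = np.allclose(L @ L.T, K)
```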

But it'll be faster if you make v a transformed parameter based on a new parameter v_z, like so:

```
transformed data {
matrix[r, r] L = cholesky_decompose(K);
...
}
parameters {
vector[r] v_z;
...
}
transformed parameters {
vector[r] v = L * v_z;
...
}
model {
v_z ~ normal(0, 1);
}
```

This implies a prior of v \sim N(0, K) and requires less linear algebra. You could then premultiply M and L together and skip the intermediate v variable (and do only one matrix-vector product instead of two).
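Both points are easy to check numerically: with `v_z ~ N(0, I)`, the transformed `v = L v_z` has covariance `L L^T = K`, and premultiplying `M` by `L` once up front yields the same `eta` with a single matrix-vector product per draw (numpy sketch with arbitrary dimensions):

```python
import numpy as np

rng = np.random.default_rng(2)
n_obs, r = 50, 4

M = rng.normal(size=(n_obs, r))
A = rng.normal(size=(r, r))
K = A @ A.T + r * np.eye(r)   # arbitrary positive-definite covariance
L = np.linalg.cholesky(K)

# Non-centered parameterization: v = L v_z with v_z ~ N(0, I) gives v ~ N(0, K)
v_z = rng.normal(size=(r, 200_000))
emp_cov = np.cov(L @ v_z)     # empirical covariance of v, approaches K

# Precomputing ML = M @ L collapses eta = M * (L * v_z) into one product
ML = M @ L
v_z1 = rng.normal(size=r)
eta_two_products = M @ (L @ v_z1)
eta_one_product = ML @ v_z1
```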


@bbbales2 Thank you. I meant to get back earlier. With the modifications you suggested, the process has sped up a whole lot: it takes about 5-6 hours, down from 17+ hours, which I am guessing has to do with the size of the model. So, thank you again.


Cool beans, good to hear. By the way, you can probably use fewer samples; running chains of length 10k+ seems excessive.

Try a few chains with the defaults (1000 warmup / 1000 sampling) and see if there's much of a difference in the results of your analysis (and check Neff/Rhat).
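For intuition on what Rhat is checking, here is a rough split-Rhat computation in numpy (a simplified version of the diagnostic; interfaces such as rstan compute this, plus rank-normalized variants, for you):

```python
import numpy as np

def split_rhat(chains):
    """Rough split-Rhat: split each chain in half and compare between-chain
    to within-chain variance; values near 1 suggest the chains are mixing."""
    halves = [h for c in chains for h in np.array_split(np.asarray(c), 2)]
    n = min(len(h) for h in halves)
    draws = np.stack([h[:n] for h in halves])
    W = draws.var(axis=1, ddof=1).mean()       # mean within-chain variance
    B = n * draws.mean(axis=1).var(ddof=1)     # between-chain variance
    var_hat = (n - 1) / n * W + B / n
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(3)
mixed = [rng.normal(size=1000) for _ in range(4)]          # same target
stuck = [rng.normal(loc=i, size=1000) for i in range(4)]   # chains far apart

r_mixed = split_rhat(mixed)   # close to 1
r_stuck = split_rhat(stuck)   # well above 1
```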

I had initially used 8K iterations with a warmup of 2K for a simpler version of the model, but the chains didn't mix well. I increased it to over 10K for this model, given that it has more components. Since it takes such a long time, I am usually apprehensive of waiting, so I try a smaller number of iterations first and then, if the chains don't mix, increase the number. Even in this model, the sampling time varies a lot depending on the dimensions of K. I was initially working with K being a 4x4 matrix and it took around 5-6 hours, but now that I have increased the dimensions (34x34), it is again taking around 15-16 hours.

Oh, interesting. Are your parameters super correlated in the posterior?

If so, have you tried `control = list(metric = "dense_e")`? Sometimes it works, sometimes it doesn't.