Two-part latent variable model, stuck at warm-up

I am trying to fit a two-part latent variable model in Stan, which combines one two-part model for the response with two latent variables. To impute one of the latent variables, sensation seeking, I use another two-part model composed of a Bernoulli part and a gamma part. For the other latent variable, financial literacy, I use purely Bernoulli manifest variables. The main two-part model, which describes my response variable, consists of a Bernoulli part and a log-normal part.

I am well aware that my model specification is over-complicated, and it takes ages to run even a single iteration in Stan. So I am wondering whether there is any way to speed up my code. Is the slowness mainly caused by the gamma distribution? Any suggestions would be appreciated!

data {
	int<lower=0> n; //number of data items
	int<lower=0> p;//number of predictors (fixed covariates)
	
	real<lower = 0> s_cont[n]; //length-n array of the outcome variable
	
	matrix[n,p] x_cov; //fixed covariates matrix
	
	int<lower=0> m_1; //number of manifest variables for sensation seeking 
	int<lower=0> m_2; //number of manifest variables for financial literacy 
	int<lower=0> m; //number of manifest variables 

	real<lower=0> u1_i[n, m_1]; //an n X m_1 array of sensation seeking manifests
	
	int<lower=0> u2_i[n, m_2]; //an n X m_2 array of financial literacy manifests

}

transformed data{
  matrix[n,p] Q;
  matrix[p,p] R;
  matrix[p,p] R_inverse;
  Q = qr_Q(x_cov)[,1:p] * n;
  R = qr_R(x_cov)[1:p,] / n;
  R_inverse = inverse(R);

}

parameters {
	real mu1; // constant term in two-part model's binary model
	real mu2; // constant term in two-part model's continuous model
	
	vector[p] alpha1_tilde; // coefficients on fixed covariates in two-part model's binary part
	vector[p] alpha2_tilde; // coefficients on fixed covariates in two-part model's continuous part
	matrix[p,m_1] alpha3_tilde; // coefficients on fixed covariates in sensation seeking model's binary part
	matrix[p,m_1] alpha4_tilde; // coefficients on fixed covariates in sensation seeking model's continuous part

	real<lower = 0.05> sigma; //std. of error term in two-part model's continuous part
	vector[2] beta1; //latent factor loadings in two-part model's binary part
	vector[2] beta2; //latent factor loadings in two-part model's continuous part
	
  vector[m_1] mu3;
  vector[m_1] mu4;
  
  vector[m_2] mu5;
  
	vector[m_1] lambda_11; //factor loadings for sensation seeking's binary part
	vector[m_1] lambda_12; //factor loadings for sensation seeking's continuous part
	
  vector<lower=0.05>[m_1] u1_shape; //shape parameter for gamma distribution
	
	vector[m_2] lambda_2; //factor loadings for financial literacy
	
	vector[n] omega1_i; //sensation seeking indexes
	vector[n] omega2_i; //financial literacy indexes
	
	vector<lower=0.05>[2] tau; //std. deviations of the latent variables

}


transformed parameters {  
  
  vector[p] alpha1 = R_inverse * alpha1_tilde; // coefficients on fixed covariates in two-part model's binary part
	vector[p] alpha2 = R_inverse * alpha2_tilde; // coefficients on fixed covariates in two-part model's continuous part
	
	matrix[p,m_1] alpha3 = R_inverse * alpha3_tilde; // coefficients on fixed covariates in sensation seeking model's binary part
	matrix[p,m_1] alpha4 = R_inverse * alpha4_tilde; // coefficients on fixed covariates in sensation seeking model's continuous part

  matrix[2,n] omega_i; //latent factors
  
  vector[n] x_beta1;
	vector[n] x_beta2;

  vector[m_1] lambda1_fix;
  vector[m_2] lambda2_fix;
  
  vector[n] u_temp11[m_1];
  vector[n] u_temp12[m_1];

  
  lambda1_fix[1] = 1;
  lambda2_fix[1] = 1;

  for(i in 1:m_1-1){
    lambda1_fix[i+1] = lambda_11[i+1];
  }

  for(i in 1:m_2-1){
    lambda2_fix[i+1] = lambda_2[i+1];
  }
 
 omega_i = [omega1_i', omega2_i'];

	x_beta1 = mu1 + Q * alpha1_tilde + (omega_i') * beta1; // X*B_1: n X 1
	x_beta2 = mu2 + Q * alpha2_tilde + (omega_i') * beta2; // X*B_2: n X 1
	

	for (i in 1:m_1) {
		u_temp11[i] = mu3[i] + Q * alpha3_tilde[,i] + omega1_i * lambda1_fix[i]; //X*B_3: n X 1; overall, (m_1*n) X 1
		u_temp12[i] = mu4[i] + Q * alpha4_tilde[,i] + omega1_i * lambda_12[i];
	}
	
	
}



model{
	vector[m_2] u_temp2[n]; // logit(Prob. of success) 
	
	
 for(i in 1:n){
 	omega_i[,i] ~ normal(0, tau); 
 }	

	for(i in 1:n) {
		for (j in 1:m_1) {
			if (u1_i[i,j]==0)
				target += bernoulli_logit_lpmf(0 | u_temp11[j,i]);
			else
				target += bernoulli_logit_lpmf(1 | u_temp11[j,i])
					+ gamma_lpdf(u1_i[i,j] | u1_shape[j], u1_shape[j] / exp(u_temp12[j,i]));
		}
	}


	for(i in 1:n){
		u_temp2[i] = mu5  + lambda2_fix*(omega2_i[i]);
	}
	
	
	for(i in 1:n){
		for(j in 1:m_2){
		target += bernoulli_logit_lpmf(u2_i[i,j]| u_temp2[i,j]);
		}
	}
	
	for(i in 1:n){
		if(s_cont[i]==0)
			target += bernoulli_logit_lpmf(0|x_beta1[i]);
		else
			target += bernoulli_logit_lpmf(1|x_beta1[i])
				+ lognormal_lpdf(s_cont[i]|x_beta2[i],sigma);
	}


	mu1 ~ normal(0,2);
	mu2 ~ normal(0,2);
	sigma ~ inv_gamma(0.01,0.01);
	
	
alpha1 ~ normal(0,5); 
alpha2 ~ normal(0,5);
beta1 ~ normal(0,5);
beta2 ~ normal(0,5);

	tau ~ inv_gamma(0.01,0.01);
	
	mu3 ~ normal(0,2);
	mu4 ~ normal(0,2);
	mu5 ~ normal(0,2);

	lambda_11  ~ normal(0,5);
	lambda_12  ~ normal(0,5);
	lambda_2  ~ normal(0,5);
	
	to_vector(alpha3) ~ normal(0,5);
	to_vector(alpha4) ~ normal(0,5);

	u1_shape ~ gamma(0.01,0.01);
}

In a complicated model like this, I’d be more concerned with non-identifiabilities.

Since you have a lot of variables, I think the only way to quickly figure out whether these are a problem is to:

  1. Check the treedepth for a model that has run a short while; if it is around 9-10, or you’re getting treedepth warnings, there is something there

  2. Plot a posterior correlation matrix and search for large correlations between variables. If you’re using rstan, do something like as.matrix(fit) to get your posterior draws out as a matrix you can compute correlations on (a rough sketch of both checks follows right after this list).
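
A rough rstan sketch of both checks, assuming the fitted object is called fit (the 0.9 cut-off is arbitrary, just to surface the worst offenders):

library(rstan)

# how many post-warmup iterations hit the treedepth cap?
sampler_params <- get_sampler_params(fit, inc_warmup = FALSE)
treedepths <- do.call(rbind, sampler_params)[, "treedepth__"]
table(treedepths)      # lots of 9s and 10s means the sampler is working very hard
sum(treedepths >= 10)  # iterations at the default maximum treedepth of 10

# posterior correlation matrix computed from the draws
draws <- as.matrix(fit)
post_cor <- cor(draws)
diag(post_cor) <- 0    # zero out the trivial self-correlations
which(abs(post_cor) > 0.9, arr.ind = TRUE)  # flag suspiciously correlated pairs of parameters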

Once you have this, your options for speeding things up are either reparameterizing or tightening priors. Neither is particularly easy or guaranteed to work :D.

Thank you very much for your reply! You are perfectly right. I ran the model on a small subset (1000 obs) of my original database and got almost 600 iterations exceeding the maximum treedepth of 10. I think this could indicate some identifiability issues wired into my model. Rhat also seems quite high. What kind of reparameterization would you suggest? Would simplifying some of the model’s distributions help, for example switching from the gamma distribution to an exponential distribution? Or should I simply remove one latent variable? Any comment is highly appreciated!

I don’t have great advice on factor models. They’re hard because of the constraints and possible non-identifiabilities. Searching the forums is the best generic advice I can give you.

I don’t think that will change much, though it sounds easy to try.

I’d start by looking at the posterior correlations and then dig around the forums.

Sure. I’ll search around the forum for other relevant posts. Many thanks!

For what it’s worth, I don’t expect there to be a simple solution to the problem sitting out there somewhere, but you might get ideas, and combining that with what you find in your correlations you might get a solution :D.

Sure. One more question: for parameters where I find large correlations, should I completely drop them? I have actually implemented a QR reparameterization in my model; I suppose this should alleviate the high-correlation issue to some extent? Many thanks.

Depends on what you find.

You’ll probably need to zoom in and make pairplots of things that are correlated to identify if the correlations are linear or non-linear.
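
For example, something like this with rstan’s pairs method (the parameter names are just placeholders from your model; swap in whatever the correlation matrix flags):

# zoom in on a handful of suspect parameters
pairs(fit, pars = c("mu2", "beta2", "tau", "sigma"))

# bayesplot's mcmc_pairs gives similar plots and makes non-linear
# (banana-shaped) relationships easier to spot
library(bayesplot)
mcmc_pairs(as.array(fit), pars = c("mu2", "beta2[1]", "tau[1]", "sigma"))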

If they’re linear, then there is more hope that there is a solution (and maybe the correlations can be interpreted as something specific).

It’s possible the correlations aren’t a problem in a reparameterization (like the QR – or it’s possible that a mistake with the QR is causing them haha).

If it’s a non-identifiability where dropping one variable would help, then I guess dropping variables makes sense. It’s definitely reasonable that a model might need a sum-to-zero constraint (which in the end can be implemented with tight priors [soft sum-to-zero] or by dropping out a variable [hard sum-to-zero]).


Thank you very much for your comment. This is very very helpful!