Casual question about convergence

stemangiola · November 7, 2017, 3:05am

Hello Community,

I have always been curious about why the chains sometimes converge to a different value close to the end of the “warmup” phase. Sometimes they make a substantial jump.

In this case, for example, I have this:

The chains seem to mix decently and then suddenly change behaviour as the sampling phase approaches.

P.S. Although this is a general question, FYI the (work in progress) model is the following.

data{
	int G; // Number of marker genes
	int P; // Number of cell types
	int S; // Number of mix samples
	int<lower=1> R; // Number of covariates (e.g., treatments)
	matrix[S,R] X; // Array of covariates
	matrix<lower=0>[S,G] y; // Mix samples matrix
	matrix<lower=0>[G,P] x;
	
	// Background cell types
	vector<lower=0, upper=1>[S] p_ancestor;
	matrix[S,G] y_bg_hat;    
}
transformed data{
	matrix<lower=0>[S,G] y_log;           // Mix samples matrix     

	y_log = log(y+1);

}
parameters {
	simplex[P] beta[S]; // coefficients for predictors
	real<lower=0> sigma; // error scale
	matrix[R,P] alpha;    // Prior to a
	vector<lower=0.1>[P] phi;
}
transformed parameters{
	matrix[S,P] beta_adj;
	matrix[S,G] y_hat; 

	
	for(s in 1:S) beta_adj[s] = to_row_vector(beta[s]) * p_ancestor[s];
	y_hat = beta_adj * x';
	y_hat = y_hat + y_bg_hat;
}
model {

	matrix[S,P] beta_hat;

	// Prior on the error scale
	sigma ~ normal(0, 0.1);
		
	// Dirichlet prior on proportions
	alpha[1] ~ normal(0, 1);
	if(R>1) to_vector(alpha[2:R]) ~ normal(0,1);
	phi ~ normal(0.1, 5); // phi ~ normal(phi_prior[1], phi_prior[2]);
	
	// Regression
	to_vector(y_log) ~ student_t(8, log(to_vector(y_hat)), sigma);

	// Hypothesis testing
	beta_hat = X * alpha;
	for(s in 1:S) for(p in 1:P) beta_hat[s,p] = inv_logit(beta_hat[s,p]);
	
	for(s in 1:S) beta[s] * mean(p_ancestor) ~ beta(beta_hat[s] .* to_row_vector(phi), (1 - beta_hat[s]) .* to_row_vector(phi));  

}

sakrejda · November 7, 2017, 7:04pm

You might want to look at this in the context of what happens during the various adaptation windows. I think what you’re seeing is the final step-size adaptation phase, where the sampler can end up further from where it had previously settled as larger step sizes are tried out. The phases are described in the manual, and the default values are described in the CmdStan documentation and probably the rstan documentation.
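To make the adaptation windows concrete, here is a rough Python sketch of how the warmup iterations are split into phases. It is a simplified reading of the windowed adaptation scheme described in the manual, not Stan's actual code. The function name `warmup_windows` is made up for illustration, and the defaults (1000 warmup iterations, `init_buffer = 75`, `term_buffer = 50`, `window = 25`) are assumed to match the CmdStan defaults; check the documentation for your interface and version. Short warmups, where Stan rescales the buffers, are not handled.

```python
def warmup_windows(num_warmup=1000, init_buffer=75, term_buffer=50, base_window=25):
    """Return (phase, first_iter, last_iter) tuples, 1-indexed and inclusive.

    Simplified sketch: assumes the buffers and one base window fit in the warmup.
    """
    if init_buffer + base_window + term_buffer > num_warmup:
        raise ValueError("warmup too short for these settings (case not handled here)")

    slow_end = num_warmup - term_buffer  # last iteration of the slow windows
    phases = [("fast (initial)", 1, init_buffer)]

    start = init_buffer + 1
    size = base_window
    while start <= slow_end:
        end = start + size - 1
        # If a further, doubled window would not fit, stretch this one to the end.
        if end + 2 * size > slow_end:
            end = slow_end
        phases.append(("slow (metric)", start, end))
        start = end + 1
        size *= 2  # each slow window is twice as long as the previous one

    # Final fast phase: only the step size is adapted here.
    phases.append(("fast (final)", slow_end + 1, num_warmup))
    return phases


if __name__ == "__main__":
    for phase, first, last in warmup_windows():
        print(f"{phase:15s} iterations {first:4d} to {last:4d}")
```

Under these assumed defaults the slow windows end at iterations 100, 150, 250, 450 and 950, and the final fast phase covers iterations 951 to 1000. A jump in the trace plot near one of those boundaries, and especially in the last 50 warmup iterations, would be consistent with the explanation above.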
