Simple question on sampling slow-down after centering predictor variables

Myfanwy · October 18, 2017, 6:46pm

Hi there,

I’m a rookie, and my background is in biology, not statistics. I’ve gotten a great deal of help from a friend writing a mark-recapture model that I’m running with the rstan interface. The response variable is probability of survival. I’m starting to add covariates, and experimenting with how various transformations improve or fail to improve model fit. I think the most relevant components of the model to my question are:

data{
  int N;
  int nFish; //n fish ID
  int nStn;
  int fish[N]; //fish ID
  int stn[N];
  int seen[N];
  int last[N];
  int n_release; // number of releases
  int n_releaseCovar; //number of release covariates
  int releaseID[nFish];
  matrix[n_release, n_releaseCovar] releaseCovar; // release-level covariate matrix, each row corresponds to different release, each column is a release-level covariate
}

parameters{
  vector<lower = 0, upper = 1>[nStn] detect;
  vector[nStn] base_survival; // survival is for the reach preceding stn i
  vector[n_releaseCovar] beta_release;
}

transformed parameters{
  vector[nStn] full_survival[nFish]; // full Survival is indexed by fish
  vector[nStn] chi[nFish];

  for(i in 1:nFish){
	full_survival[i] = inv_logit(base_survival + releaseCovar[releaseID[i]]*beta_release); 
	chi[i] = prob_uncaptured(nStn, detect, full_survival[i,2:]);
  }
}

model{
  // Generic priors 
//  target += normal_lpdf(beta_reach | 0, 1);
//  target += normal_lpdf(beta_fish | 0 , 0.5); // broadened prior from 0, 0.2
  target += normal_lpdf(beta_release | 0 , 10); // broadened prior from 0, 0.2
  target += normal_lpdf(base_survival | 0, 0.75);
  target += normal_lpdf(detect | 0.85, 0.35); // added prior for detect
  
  // model
  for (n in 1:N){
	if(stn[n] != 1){
	// at stn #1 p detection and survival = 1
	  target += bernoulli_lpmf(seen[n] | detect[stn[n]]);
	  target += bernoulli_lpmf(1 | full_survival[fish[n], stn[n]]);
	}
	if (last[n])
	  target += bernoulli_lpmf(1| chi[fish[n], stn[n]]);
	
  }

The three covariates I’m including are water discharge (range = 1700-10,000), temperature (range = 10-21), and turbidity (range = 13 - 68). For the latest run of the model, I centered all three variables on 0. The prior on all three betas is target += normal_lpdf(beta_release | 0 , 10);.

The model now samples very, very slugglishly, and I get a couple of these messages before sampling begins:

"Rejecting initial value: Log probability evaluates to log(0), i.e. negative infinity. Stan can't start sampling from this initial value."

I got even more of those messages when the prior was normal(0, 1) - when I broadened it to (0, 10), I only get one or two, but the model is still prohibitively inefficient to sample.

I’m sure there’s a well-known statistical reason that centering the predictor variables led to this, but I don’t know what it is - can someone help me out? I’m happy to post the full model code if it would help, but I was thinking it’s probably a conceptual issue and not a code issue.

sakrejda · October 18, 2017, 7:22pm

Given the large and very different variable ranges your main problem may be that you haven’t standardized your covariates. Try standardizing consisted and see what you get.

Myfanwy · October 18, 2017, 7:27pm

Thanks! I have run the same model with standardized covariates, and it does sample much more efficiently, but the interpretation is just a little more intuitive to me with centered variables, so I wanted to try it that way. Could you give a little more detail on why the efficiency changes between standardized vs. centered variables? Is it sometimes the case that centering would result in more efficient sampling than standardizing would?

sakrejda · October 18, 2017, 7:35pm

Internally Stan rescales the parameters using the mass matrix which must be estimated during warmup. If your parameters are all roughly the same scale that process is much faster and more stable.

Don’t confuse the metrics you want your model to produce with the parameters the sampler works with. You can always transform your parameters into whatever metrics you want but the sampler is stuck with whatever you give it. Usually your parameters are not interpretable on their own anyway.

Topic		Replies	Views
Efficient sampling SRM with large number of correlated random effects Modeling	1	748	May 4, 2018
Non-centered parameterization for non-hierarchical parameters? Modeling reparametrization	15	1497	May 6, 2021
Request for help understanding model behavior Modeling	17	1610	March 24, 2018
Looking for a stan program critique Modeling techniques	14	1754	April 17, 2018
Bad RHat values and divergent transitions when sampling from the prior Modeling fitting-issues , specification	11	810	September 22, 2020

Simple question on sampling slow-down after centering predictor variables

Related topics