Migrated from Google Group: AR(1) Logistic Regression Funnel

I just noticed the move to Discourse. Ben Goodrich was kind enough to respond to me there initially, but I figure it’s good etiquette to use the new forum.


I’m working on a model to estimate a time series of binomial probabilities that I suspect have some level of temporal autocorrelation. The real-world application is to estimate “transport probabilities” for juvenile salmon passing a hydroelectric project on the Columbia River. As fish migrate down the river, they pass through the hydroelectric project. Some proportion of the population is collected, and of those fish, a proportion are diverted to a holding facility where they are loaded onto barges and trucks and transported downriver; the remaining fish are returned to the river to migrate downstream ‘naturally’.

We’re interested in estimating these daily transport probabilities based on a subset of fish marked with electronic tags, which allow the fate of each tagged fish to be determined. A feature of the dataset is that the N’s (the number of fish available to be either transported or returned to the river) follow a seasonal pattern: few tagged fish are available at the beginning and end of the season, with sizeable numbers in the middle.

I’ve used the code snippet Ben Goodrich provided here for the AR(1) model for parameters as a basis for the following Stan code:

data {
  int<lower = 1> T;      // number of days
  int<lower = 0> N[T];   // tagged fish available on day t
  int<lower = 0> y[T];   // tagged fish transported on day t
}

parameters {
  real mu;                           // intercept on the log-odds scale
  real eps_0;                        // initial state of the AR(1) process
  vector[T - 1] eps_raw;             // standardized innovations (non-centered)
  real<lower = -1, upper = 1> phi;   // autoregressive coefficient
  real<lower = 0> sigma;             // innovation scale
}

transformed parameters {
  vector[T] eps;                     // AR(1) process on the log-odds scale
  real<lower = 0, upper = 1> p[T];   // daily transport probabilities

  eps[1] = eps_0;
  for (t in 2:T)
    eps[t] = phi * eps[t - 1] + sigma * eps_raw[t - 1];

  for (t in 1:T)
    p[t] = inv_logit(mu + eps[t]);
}

model {
  eps_0 ~ normal(0, sigma * sqrt(1 - square(phi)));
  eps_raw ~ normal(0, 1);

  for (t in 1:T)
    y[t] ~ binomial(N[t], p[t]);
}

The issue I’m seeing is that I get a large number of divergent transitions. In my applied data example, I can’t eliminate all of them even after increasing adapt_delta to 0.9999. I’ve been able to duplicate the behavior in a simulated data example. In both my simulated and observed data, I see a very distinctive funnel shape in the pairs plot between the regression intercept and the autoregressive term.

Simulated data:

library(rstan)

Times = 150

# sample sizes ramp up and then back down over the season
N_1 = rpois(Times * 0.3, lambda = exp(rnorm(Times * 0.3, 4, 2)))
N_2 = rpois(Times * 0.7, lambda = exp(rnorm(Times * 0.7, 4, 2)))
N = c(N_1[order(N_1)], N_2[order(N_2, decreasing = TRUE)])

mu = qlogis(2/3)
sigma = 1/4
eps = arima.sim(model = list(ar = 0.9),
                n = Times,
                sd = sigma)
p = plogis(mu + eps)

y = rbinom(Times, N, p)

# ar1_logit is a string containing the Stan program above
ar1_check = stan(data = list(T = Times,
                             N = N,
                             y = y),
                 model_code = ar1_logit,
                 control = list(adapt_delta = 0.99))

pairs(ar1_check, pars = c("mu", "sigma", "phi", "eps_0"))

And the pairs plot: [pairs plot of mu, sigma, phi, and eps_0 showing the funnel]

The funnel is even more dramatic in my applied example data:

N = c(0, 0, 24, 62, 104, 145, 124, 109, 111, 123, 235, 477, 578, 
      853, 903, 888, 734, 989, 1010, 1262, 1438, 1431, 952, 1914, 3470, 
      2014, 1197, 912, 744, 476, 412, 200, 219, 205, 277, 263, 97, 
      95, 110, 73, 25, 77, 87, 64, 35, 20, 6, 16, 19, 15, 15, 20, 26, 
      10, 9, 5, 3, 2, 3, 0, 3, 0, 1, 2, 2, 2, 5, 3, 13, 9, 5, 2, 1, 
      2, 1, 3, 3, 3, 1, 1, 1, 2, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 
      0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 
      0, 0, 0, 0, 0, 0, 0)

y = c(0, 0, 18, 49, 81, 114, 97, 83, 82, 96, 183, 352, 441, 646, 
  662, 686, 563, 765, 780, 986, 1122, 1129, 750, 1517, 2699, 1570, 
  945, 722, 571, 372, 319, 160, 175, 160, 231, 206, 78, 77, 91, 
  60, 19, 63, 69, 51, 27, 16, 5, 13, 17, 13, 11, 15, 20, 9, 7, 
  4, 2, 2, 3, 0, 1, 0, 1, 2, 2, 1, 4, 3, 11, 7, 4, 2, 0, 2, 0, 
  2, 3, 1, 1, 1, 1, 2, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 
  0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 
  0, 0, 0, 0, 0)


ar1_check = stan(data = list(T = length(y),
                             N = N,
                             y = y),
                 model_code = ar1_logit,
                 control = list(adapt_delta = 0.9999))

I suspect the issue becomes more apparent with smaller sigmas (in my applied example, sigma is estimated at about 0.05). One obvious answer is to abandon the autoregressive term for this applied example, but this is not the first time I’ve seen this behavior when fitting an AR(1) logistic regression, so I’m curious how general it is and whether there is a reparameterization of the AR(1) logistic regression that avoids it.

Ben Goodrich suggested that I “Make the intercept (alpha) be equal to the unconditional mean of the log-odds (mu) times 1 minus phi.”

I changed my code to match that advice, but had even more divergent transitions and noticed that mu blows up for large values of phi:

data {
  int<lower = 1> T;      // number of days
  int<lower = 0> N[T];   // tagged fish available on day t
  int<lower = 0> y[T];   // tagged fish transported on day t
}

parameters {
  real mu;
  real eps_0;
  vector[T - 1] eps_raw;
  real<lower = -1, upper = 1> phi;
  real<lower=0> sigma;
}

transformed parameters {
  real alpha;                        // intercept implied by the unconditional mean
  vector[T] eps;
  real<lower = 0, upper = 1> p[T];   // daily transport probabilities

  alpha  = mu * (1 - phi);
  eps[1] = eps_0;
  for (t in 2:T)
    eps[t] = phi * eps[t - 1] + sigma * eps_raw[t - 1];

  for (t in 1:T)
    p[t] = inv_logit(alpha + eps[t]);
}

model {
  eps_0 ~ normal(0, sigma * sqrt(1 - square(phi)));
  eps_raw ~ normal(0, 1);

  for (t in 1:T)
    y[t] ~ binomial(N[t], p[t]);
}

Maybe I misinterpreted Ben’s advice? Any other suggestions on how to implement logistic autoregression?

I’m not sure exactly what @bgoodri was suggesting, but you might also try adding some priors for mu and sigma to the model.

The advice we usually give is to simulate data from the model and see whether you can fit it. If you can fit simulated data but not your real data, the problem is usually misspecification: the data don’t match the model. In that case, you want to figure out why.

P.S. I wonder if those Columbia river salmon know just how much stats they motivate!

Thanks @Bob_Carpenter.

I have simulated data from the model; that’s the second chunk of code above. What motivated my post is that with this simple simulation I was able to replicate the behavior I’ve now seen a handful of times in my applied modeling. I’ve got priors in my applied models, and I’ve added mu ~ normal(0, 10); and sigma ~ normal(0, 0.5); as priors to the model for the simulations, but I’m still seeing the funnel and some divergent transitions.
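Concretely, the model block with those priors added reads:

model {
  mu ~ normal(0, 10);
  sigma ~ normal(0, 0.5);
  eps_0 ~ normal(0, sigma * sqrt(1 - square(phi)));
  eps_raw ~ normal(0, 1);

  for (t in 1:T)
    y[t] ~ binomial(N[t], p[t]);
}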

My simulations have given me confidence that I can recover the parameters I care about (daily transport probabilities), but the funnel shape reminded me of Neal’s funnel, and I was wondering if there is an easy reparameterization folks here might have already thought of. I think the issue is more one of computational time and of being sure I can eliminate divergent transitions than a worry about my inference.

As for those Columbia River salmon: having just returned from a meeting with 20+ quantitative fisheries biologists, I can tell you that while they’re probably not aware of it, these are the fish that spawned a thousand PhDs. (Fisheries pun intended.)

I think you have to move the posterior distribution of phi away from the boundary, or maybe have the prior variance of mu depend on 1 - square(phi). Basically, the funnel is telling you that any value of mu is possible when phi = 1.
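For example, one literal reading of that suggestion (a sketch only; the baseline scale of 10 is an arbitrary choice for illustration) is to shrink the prior scale of mu toward zero as phi approaches the boundary, so the prior pins mu down exactly where the likelihood cannot:

model {
  mu ~ normal(0, 10 * sqrt(1 - square(phi)));   // prior scale -> 0 as phi -> +/-1
  eps_0 ~ normal(0, sigma * sqrt(1 - square(phi)));
  eps_raw ~ normal(0, 1);

  for (t in 1:T)
    y[t] ~ binomial(N[t], p[t]);
}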

It’s hard to say without seeing exactly where the divergences are occurring. The fact that you see funnel-like behavior in both the mu/phi and the eps_0/phi/sigma pairs indicates that the problem is particularly subtle. The prior for eps_0 certainly looks like it can cause trouble.
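One thing worth trying, as a sketch rather than a guaranteed fix: non-center eps_0 the same way as the other innovations, so that nothing in the model block has a scale depending on sigma and phi. Note that if the intent is to start the process in its stationary distribution, the stationary standard deviation of this AR(1) process is sigma / sqrt(1 - phi^2), which is what this sketch uses:

data {
  int<lower = 1> T;      // number of days
  int<lower = 0> N[T];
  int<lower = 0> y[T];
}

parameters {
  real mu;
  real eps_0_raw;                    // standard normal, scaled below
  vector[T - 1] eps_raw;
  real<lower = -1, upper = 1> phi;
  real<lower = 0> sigma;
}

transformed parameters {
  vector[T] eps;
  real<lower = 0, upper = 1> p[T];

  // non-centered initial state: sigma / sqrt(1 - phi^2) is the
  // stationary standard deviation of the AR(1) process
  eps[1] = eps_0_raw * sigma / sqrt(1 - square(phi));
  for (t in 2:T)
    eps[t] = phi * eps[t - 1] + sigma * eps_raw[t - 1];

  for (t in 1:T)
    p[t] = inv_logit(mu + eps[t]);
}

model {
  mu ~ normal(0, 10);
  sigma ~ normal(0, 0.5);
  eps_0_raw ~ normal(0, 1);
  eps_raw ~ normal(0, 1);

  for (t in 1:T)
    y[t] ~ binomial(N[t], p[t]);
}

With this parameterization, any remaining divergences point at the mu/phi geometry rather than at the initial-state prior.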