Test: Soft vs Hard sum-to-zero constraint + choosing the right prior for soft constraint

mitzimorris · April 27, 2018, 3:05pm

I did some tests on the scale of the soft-centering using an ICAR prior and a simple Poisson model:

functions {
  // ICAR prior plus a soft sum-to-zero constraint; increments target directly
  void icar_normal_lp(int N, array[] int node1, array[] int node2, real s, vector phi) {
    // pairwise-difference form of the ICAR log density
    target += -0.5 * dot_self(phi[node1] - phi[node2]);
    // soft sum-to-zero constraint on phi;
    // equivalent to, and more efficient than, mean(phi) ~ normal(0, s)
    sum(phi) ~ normal(0, s * N);
  }
}
data {
  real<lower=0, upper=0.1> s;  // scale of the soft constraint, close to zero
  int<lower=0> N;
  int<lower=0> N_edges;
  array[N_edges] int<lower=1, upper=N> node1;  // node1[i] adjacent to node2[i]
  array[N_edges] int<lower=1, upper=N> node2;  // and node1[i] < node2[i]

  array[N] int<lower=0> y;        // count outcomes
  vector<lower=0>[N] E;           // exposure
}
transformed data {
  vector[N] log_E = log(E);
}
parameters {
  real beta0;             // intercept
  real<lower=0> sigma;    // overall standard deviation
  vector[N] phi;          // spatial effects
}
model {
  y ~ poisson_log(log_E + beta0 + phi * sigma);
  beta0 ~ normal(0.0, 1.0);
  sigma ~ normal(0.0, 1.0);  // half-normal, since sigma is constrained positive
  icar_normal_lp(N, node1, node2, s, phi);
}
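To make the two terms in icar_normal_lp concrete, here is a minimal Python sketch (not part of the model) that evaluates them on a made-up 4-node chain graph. The graph, the phi values, and the helper names are assumptions for illustration only. It checks two things: the sum form and the mean form of the soft constraint give the same log-density kernel, and the pairwise-difference term does not change when a constant is added to every phi, which is why some centering constraint is needed at all.

```python
import math

def pairwise_term(phi, node1, node2):
    """-0.5 * sum of squared differences over edges (1-based indices, as in the Stan data)."""
    return -0.5 * sum((phi[i - 1] - phi[j - 1]) ** 2 for i, j in zip(node1, node2))

def soft_penalty_sum(phi, s):
    """Kernel of sum(phi) ~ normal(0, s * N), additive constants dropped."""
    n = len(phi)
    return -0.5 * (sum(phi) / (s * n)) ** 2

def soft_penalty_mean(phi, s):
    """Kernel of mean(phi) ~ normal(0, s), additive constants dropped."""
    n = len(phi)
    return -0.5 * ((sum(phi) / n) / s) ** 2

# Hypothetical toy graph: a chain 1-2-3-4, with made-up effects.
node1 = [1, 2, 3]
node2 = [2, 3, 4]
phi = [0.3, -0.1, 0.2, -0.3]
s = 0.01

# Adding a constant to every element leaves the pairwise term unchanged ...
shifted = [p + 1.0 for p in phi]
same_pairwise = math.isclose(
    pairwise_term(phi, node1, node2), pairwise_term(shifted, node1, node2)
)

# ... but the soft constraint penalizes the shift heavily when s is small.
penalty_original = soft_penalty_sum(phi, s)
penalty_shifted = soft_penalty_sum(shifted, s)

print(same_pairwise, penalty_original, penalty_shifted)
```

The smaller s is, the more steeply the penalty grows as mean(phi) moves away from zero, so the soft constraint acts more and more like a hard one.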

I ran this over the NYC pedestrian traffic data used in the ICAR case study and tried scales of 0.1, 0.01, and 0.001, with 3 chains and 2000 iterations each (the default). When the scale was 0.1, the Rhat values indicated failure to converge. There were no divergences or other warnings, only Rhat values above 1.1 (close, but no cigar).
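For readers who have not computed Rhat by hand, here is a rough sketch of the classic potential scale reduction factor for a single parameter. It is a simplified version: Stan reports a split-chain variant, so the numbers will not match Stan's output exactly. The draws below are made up purely to show one case where the chains agree and one where they sit in different places.

```python
from statistics import mean, variance

def rhat(chains):
    """Classic (non-split) Rhat for one parameter.

    chains: list of equal-length lists of post-warmup draws.
    """
    n = len(chains[0])                                # draws per chain
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])      # W: mean within-chain variance
    between = n * variance(chain_means)               # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n     # pooled variance estimate
    return (var_plus / within) ** 0.5

# Made-up draws: three chains exploring the same region ...
base = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2]
mixed = [base, base[::-1], base[2:] + base[:2]]

# ... versus three chains stuck at different offsets.
stuck = [[x + offset for x in base] for offset in (0.0, 1.0, 2.0)]

print("mixed:", rhat(mixed))
print("stuck:", rhat(stuck))
```

When the chains disagree about where the parameter lives, the between-chain variance dominates and Rhat rises well above the 1.1 threshold used here.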
Because the run with scale 0.1 didn’t really converge, sampling took much longer, and the overall warmup times seemed longer too. There wasn’t much difference between scales 0.01 and 0.001. Here are the times for those two:

NYC: 1921 regions, intercept only, pois + icar
3 parallel chains, elapsed time in seconds

s = 0.01
chain    warm-up    sampling    total
1        505.259    262.570     767.829
2        555.046    250.358     805.404
3        575.119    243.568     818.687

s = 0.001
chain    warm-up    sampling    total
1        528.069    273.678     801.747
2        543.13     268.504     811.634
3        556.438    265.584     822.021

I reran the model with scale 0.1 several times. There, warmup took on the order of 600 to 700 seconds, and so did sampling. Given that warmup and sampling each ran 1000 iterations, the fact that the sampling iterations took as long as the warmup iterations is another indication of failure to converge (in the two runs above, sampling took roughly half as long as warmup).
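As a quick check on that comparison, the sampling-to-warmup ratio can be computed per chain from the times reported above for s = 0.01 and s = 0.001. This is just arithmetic on the posted numbers. The scale 0.1 runs are left out because only approximate times (600 to 700 seconds for each phase) were recorded for them.

```python
# Elapsed times in seconds, copied from the timing output above (3 chains each).
warmup = {
    0.01: [505.259, 555.046, 575.119],
    0.001: [528.069, 543.13, 556.438],
}
sampling = {
    0.01: [262.570, 250.358, 243.568],
    0.001: [273.678, 268.504, 265.584],
}

def sampling_to_warmup(scale):
    """Per-chain ratio of sampling time to warmup time for a given scale."""
    return [samp / warm for samp, warm in zip(sampling[scale], warmup[scale])]

for scale in warmup:
    ratios = sampling_to_warmup(scale)
    print(f"s = {scale}: " + ", ".join(f"{r:.2f}" for r in ratios))
```

For both scales the ratios come out between roughly 0.4 and 0.5, whereas equal warmup and sampling times at scale 0.1 correspond to a ratio near 1.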
