Order of magnitude variation in ESS

I’m wondering if anyone has any tips on how to improve model sampling so that all parameters have similar efficiencies (i.e. ESS values).

> summary(model2)
 Family: bernoulli 
  Links: mu = identity 
Formula: index ~ inv_logit(int + slope * Phi) 
         int ~ 0 + category
         slope ~ 0 + category
         Phi ~ 0 + g_id
   Data: sdata_2c (Number of observations: 80502) 
  Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
         total post-warmup draws = 8000

Regression Coefficients:
                 Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
int_category1        0.26      0.03     0.20     0.31 1.00     9329     5800
int_category2        0.30      0.07     0.16     0.42 1.00     2869     3545
int_category3       -0.05      0.04    -0.11     0.02 1.00     5775     5344
int_category4       -0.22      0.04    -0.30    -0.15 1.00     5331     4111
int_category5        0.31      0.04     0.23     0.40 1.00     3657     3771
int_category6       -0.94      0.05    -1.03    -0.85 1.00     3466     4421
int_category7        0.33      0.05     0.24     0.42 1.00     3388     3927
int_category8       -0.11      0.03    -0.18    -0.04 1.00     7077     5611
int_category9       -0.10      0.05    -0.20    -0.01 1.00     3132     4040
int_category10      -0.41      0.03    -0.48    -0.35 1.00     6692     5429
slope_category1     -0.13      0.02    -0.17    -0.10 1.02      228      573
slope_category2      2.43      0.32     1.85     3.11 1.02      220      477
slope_category3      0.46      0.06     0.35     0.60 1.02      206      460
slope_category4      0.62      0.08     0.47     0.79 1.02      205      483
slope_category5     -1.02      0.14    -1.30    -0.78 1.02      202      465
slope_category6      1.05      0.14     0.80     1.34 1.02      198      439
slope_category7     -1.23      0.16    -1.56    -0.94 1.02      199      424
slope_category8     -0.40      0.05    -0.51    -0.30 1.02      211      453
slope_category9      1.33      0.18     1.02     1.70 1.02      207      478
slope_category10     0.35      0.05     0.26     0.45 1.02      209      451
Phi_g_id1            0.12      0.08     0.01     0.30 1.00     4512     3352
Phi_g_id2            1.67      0.25     1.23     2.22 1.02      282      666
Phi_g_id3            0.23      0.11     0.04     0.45 1.00     2096     2230
Phi_g_id4            1.60      0.25     1.15     2.16 1.02      288      766
Phi_g_id5            9.40      1.77     6.39    13.30 1.01      410     1022
Phi_g_id6            4.10      0.63     3.04     5.47 1.02      264      707
Phi_g_id7            3.30      0.50     2.42     4.38 1.02      273      727
Phi_g_id8            1.55      0.25     1.12     2.09 1.01      317      874
 [ reached getOption("max.print") -- omitted 192 rows ]

Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
>

As you can see, the int_categoryX parameters are sampled much more efficiently than the slope_categoryX parameters and most of the Phi_g_idX parameters, whose ESS is roughly an order of magnitude lower. Are there settings or rescalings I can use to even the sampling out and thus, presumably, make the model run more efficiently?

Intuitively I take this to mean we have a lot more information on the int_categoryX parameters, but ideally I would want that reflected in my posterior intervals, not in my ESS.

For reference, here are some slices of the data:

> sdata_2c[c(1:5,1000:1005, 10000:10005), ]
# A tibble: 17 × 5
   g_id  category    phi    pr index
   <fct> <fct>     <dbl> <dbl> <fct>
 1 1     9        0.0185 0.470 1    
 2 1     2        0.0185 0.596 0    
 3 1     4        0.0185 0.446 0    
 4 1     8        0.0185 0.464 1    
 5 1     1        0.0185 0.572 0    
 6 3     9        0.345  0.561 1    
 7 3     3        0.345  0.518 0    
 8 3     9        0.345  0.561 0    
 9 3     7        0.345  0.502 1    
10 3     3        0.345  0.518 0    
11 3     1        0.345  0.563 1    
12 25    9        0.113  0.496 1    
13 25    5        0.113  0.554 1    
14 25    5        0.113  0.554 0    
15 25    9        0.113  0.496 0    
16 25    3        0.113  0.497 1    
17 25    1        0.113  0.569 1 
  • I guess some of the groups identified by g_id have only a few observations, which leaves both slope and Phi weakly informed by the likelihood because of the interaction between them. More informative priors might help.
  • When Bulk-ESS is lower than Tail-ESS, there may be a multimodality issue, which might also be visible in the posterior marginals (1D marginals or 2D scatter plots).
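
Both suggestions above can be tried with a few lines of R. One thing worth keeping in mind while reading the plots: in this formula slope and Phi enter only through the product slope * Phi, so multiplying every slope by a constant c and dividing every Phi by c leaves the likelihood unchanged, and only the priors fix the scale. That would be consistent with the slopes and most Phi values sharing the same low ESS, though it is only a guess until the draws are inspected. The sketch below is a rough starting point, not a recommended specification: it assumes `model2` is the fitted object from the question, the prior scales are placeholders, `model3` is a made-up name, and the draw names `b_slope_category2` and `b_Phi_g_id5` are assumed from the usual brms naming and should be checked with `variables(model2)`.

```r
library(brms)
library(bayesplot)

# Assumes `model2` is the brmsfit object from the question.

# Priors currently in use for each non-linear parameter.
prior_summary(model2)

# Exact names of the posterior draws (the names used further down
# are assumptions; adjust them to what this prints).
head(variables(model2), 30)

# 2D scatter plot of one slope against one Phi, to look for
# separate clusters (multimodality) or a long curved ridge.
draws <- as.array(model2)
mcmc_scatter(draws, pars = c("b_slope_category2", "b_Phi_g_id5"))

# Pairs plot for a handful of parameters at once.
mcmc_pairs(
  draws,
  pars = c("b_slope_category2", "b_slope_category5",
           "b_Phi_g_id2", "b_Phi_g_id5")
)

# Hypothetical tighter priors. The normal(0, 1) scales are placeholders,
# not values taken from the thread; any bounds used in the original
# model would need to be carried over as well.
priors <- c(
  prior(normal(0, 1), nlpar = "int"),
  prior(normal(0, 1), nlpar = "slope"),
  prior(normal(0, 1), nlpar = "Phi")
)

# Refit with the new priors (this recompiles and resamples).
model3 <- update(model2, prior = priors, cores = 4)
summary(model3)
```

If the scatter plots show a narrow ridge rather than distinct clusters, that points towards the slope/Phi scale trade-off rather than multimodality, and tightening the prior on one of the two (or fixing the scale of Phi) would be the first thing to try.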