Case study on spatial models for areal data - Poisson CAR/IAR

Hi Katherine,

Mitzi’s model assumes that the graph has only one connected component. So it doesn’t work with “islands”. In the event that you do have islands, the scaling will be wrong, but more critically, the constraints are wrong. You need to make it sum to zero on each connected component. INLA does all of this automatically, but for the moment you’ve got to do it manually in Stan.

Those alternate adjacency methods seem like a good idea.

Hi Bob

Thanks for your answer

This is interesting. I usually take a burn-in period of about 10% of the whole simulation. In my opinion this is a safer implementation of Geyer’s advice (Geyer (1992). Practical Markov Chain Monte-Carlo. Statistical Science, 7(4):473-511) who propose to set a burn-in periodo of 1-2% of the total simulation. In my experience with WinBUGS this is a reasonable advice, surely more efficient than taking 50% of the simulations as burn-in. But, is a 10% burn-in period an adequate choice for Stan? Does HMCMC requires/advices for longer burn-in periods? Does Stan tune its HMCMC algorithms during the burn-in period and setting a short warm-up period could harm Stan’s performance?

Apologizes if this is a bit off-topic but I have prefered to ask it here in response to Bob’s response :).

Yes.

With HMC/NUTS the warm-up period includes tuning the mass matrix and stepsize in addition to discovering the main mass of the distribution so yeah, for most interesting models the full warm-up is a good idea. The up side is that you only need a small number of iterations to get the required effective sample size. Even in complex models I often need only 1-2 iterations per effective sample (yeah, sometimes it’s not that good, but those are cases where M-H would fail completely). so if you’re getting a preliminary estimate of a mean you only need 30-100 iterations post-warm-up, and maybe 300 for credible intervals. For final estimates you often need more to get the MCMC error down.

3 Likes

Don’t follow Geyer on this, as he’s making a lot of assumptions that make sense theoretically, but not practically. If you have a chain that you know is going to converge (e.g., it satisfies geometric ergodicity) and you are going to run it for a bajillion iterations, then his advice is OK, but still dangerous. For real world problems, you should be running multiple chains.

Our reasoning was that you’ll lose at most a factor of two in speed by having warmup be half the iterations.

As @sakrejda hints, generally, people run for way more effective samples than they need, too.

Hi Daniel and Mitzi,

Thank you for the clarification. I’m currently looking at the “5 nearest neighbors” approach with NYC Planning’s 2010 Census Tract (clipped to shoreline). I made one manual connection between Brooklyn and Staten Island by the Verrazano Bridge, yielding a fully connected map.

However, in the final step prior to running the model, where the adjacency matrix and scaling are set up, I’m now getting an error message as follows. I’m stumped as to what it means:

This is INLA_17.06.20 built 2017-06-20 03:42:30 UTC.
See R-INLA Project - Contact us for how to get help.
Error in validObject(r) :
invalid class “dsTMatrix” object: all row indices (slot ‘i’) must be between 0 and nrow-1 in a TsparseMatrix
In addition: Warning messages:
1: package ‘INLA’ was built under R version 3.4.0
2: In fun(libname, pkgname) : bytecode version mismatch; using eval

Here is the code I used:

#Run ICAR model using STAN - developed by Mitzi Morris et al
library(rstan);
options(mc.cores = parallel::detectCores());
library(maptools);
library(spdep);
library(tripack);
library(rgdal)

nyckidped_tract<-readRDS(file=“D:/NYC/nyckidped_tract.rds”)
#source(“nyckidped_tract.rds”);
y = nyckidped_tract$count;
x = nyckidped_tract$expected;

nyc_poptractIDs<-nyckidped_tract$ct2010full
head(nyc_poptractIDs)

nyc_all_tracts.shp<-readOGR(“D:/Census/nycTracts10/nyct2010.shp”, layer=“nyct2010”);
print(nyc_all_tracts.shp)
table(nyc_all_tracts.shp$CT2010)
table(nyc_all_tracts.shp$BoroCode, nyc_all_tracts.shp$BoroName)

nyc_all_tracts.shp$BoroFips<-ifelse(nyc_all_tracts.shp$BoroCode==1, “061”, ifelse(nyc_all_tracts.shp$BoroCode==2, “005”, ifelse(nyc_all_tracts.shp$BoroCode==3, “047”, ifelse(nyc_all_tracts.shp$BoroCode==4, “081”, ifelse(nyc_all_tracts.shp$BoroCode==5, “085”, “”)))))
table(nyc_all_tracts.shp$BoroName, nyc_all_tracts.shp$BoroFips)

nyc_all_tracts.shp$CT2010full<-paste0(“36”,nyc_all_tracts.shp$BoroFips,nyc_all_tracts.shp$CT2010)
table(nyc_all_tracts.shp$CT2010full)

pop_tracts ← nyc_all_tracts.shp$CT2010full %in% nyc_poptractIDs;
nyc_poptracts.shp ← nyc_all_tracts.shp[pop_tracts,];
nyc_poptracts.shp ← nyc_poptracts.shp[order(nyc_poptracts.shp$BoroCT2010),];

#nearest neighbors
nb_nyc_5nn ← knn2nb(knearneigh(coords,k=5), row.names=IDs);
plot(nb_nyc_5nn, coords, pch = “.”, col = “red”);

#Manually add connection between Brooklyn and Staten Island
nb_nyc_5nn1n <-edit.nb(nb_nyc_5nn, coords)
#Identifying contiguity for deletion …
#No contiguity between chosen points
#Add contiguity? (y/n) y
#added contiguity between point 733 and 2018
#Options: quit[q] refresh[r] continue[c] r
#Options: quit[q] continue[c]q

plot(nb_nyc_5nn1n, coords, pch = “.”, col = “red”);

#source(“nbdata4stan.R”);
nbdata4stan = function(x) {
N = length(x);
n_links = 0;
for (i in 1:N) {
if (x[[i]][1] != 0) {
n_links = n_links + length(x[[i]]);
}
}
N_edges = n_links / 2;
node1 = vector(mode=“numeric”, length=N_edges);
node2 = vector(mode=“numeric”, length=N_edges);
idx = 0;
for (i in 1:N) {
if (x[[i]][1] != 0) {
for (j in 1:length(x[[i]])) {
n2 = unlist(x[[i]][j]);
if (i < n2) {
idx = idx + 1;
node1[idx] = i;
node2[idx] = n2;
}
}
}
}
return (list(“N”=N,“N_edges”=N_edges,“node1”=node1,“node2”=node2));
}

nbs=nbdata4stan(nb_nyc_5nn1n);
N = nbs$N;
node1 = nbs$node1;
node2 = nbs$node2;
N_edges = nbs$N_edges;

DO_CALC = TRUE;
#Calculate the scaling of the model. Requires the INLA package.

if(DO_CALC) {
library(INLA)
#Build the adjacency matrix
adj.matrix = sparseMatrix(i=nbs$node1,j=nbs$node2,x=1,symmetric=TRUE)
#The ICAR precision matrix (note! This is singular)
Q= Diagonal(nbs$N, rowSums(adj.matrix)) - adj.matrix

#Add a small jitter to the diagonal for numerical stability (optional but recommended)
Q_pert = Q + Diagonal(nbs$N) * max(diag(Q)) * sqrt(.Machine$double.eps)

Compute the diagonal elements of the covariance matrix subject to the

constraint that the entries of the ICAR sum to zero.

#See the function help for further details.
Q_inv = inla.qinv(Q_pert, constr=list(A = matrix(1,1,nbs$N),e=0))

#Compute the geometric mean of the variances, which are on the diagonal of Q.inv
scaling_factor = exp(mean(log(diag(Q_inv))))
} else{
scaling_factor= 0.5;
}

bym2_nyc_stanfit = stan(“D:/NYC/bym2_predictor_only.stan”, data=list(N,N_edges,node1,node2,y,x,scaling_factor), iter=6000);
print(bym2_nyc_stanfit, digits=3, pars=c(“lp__”, “beta0”, “beta1”, “rho”, “sigma”, “mu[1]”, “mu[2]”, “mu[3]”, “mu[2116]”, “mu[2117]”, “mu[2118]”), probs=c(0.025, 0.5, 0.975));

Katie

Can you type traceback() and send the output ?

Edit: also str(Q)

traceback()

3: stop(msg, ": ", errors, domain = NA)
2: validObject(r)
1: sparseMatrix(i = nbs$node1, j = nbs$node2, x = 1, symmetric = TRUE) at #4

str(Q)
Error in str(Q) : object ‘Q’ not found

Ok the Q wasn’t made which means that this bit of the code wasn’t run

It gives the following error, again. Perhaps the output from the nearest neighbor function differs from poly2nb? I’ll have to check more closely at a later time.

Error in validObject(r) :
invalid class “dsTMatrix” object: all row indices (slot ‘i’) must be between 0 and nrow-1 in a TsparseMatrix

If you email me enough to run the code (dp(dot)simpson(at)gmail(dot)com) I’ll look at it for you.

You really need to do this kind of thing with gmail? Is their spam filter that bad?

1 Like

I’ve been told more than ten years ago that email harvesters can decode (dot) and (at)

It wouldn’t be exactly rocket science to do, would it

This might be getting OT and maybe needs its own thread… ¯\༼ ି ~ ି ༽

Ok. I know where the problem is. And I think I know how to fix it.

The problem is that nbdata4stan assumes it gets a list of all the neighbours for each node (excluding the node number itself). It then makes an assumption about symmetry - namely that if node i has neighbour j, then node j has neighbour i and codes everything accordingly. This is not true for output for knn2nb! You can see this by running

is.symmetric.nb( knn2nb(knearneigh(coords,k=5), row.names=IDs)) #Output: FALSE

The fix is to replace

nb_nyc_5nn ← knn2nb(knearneigh(coords,k=5), row.names=IDs); #bad

with

nb_nyc_5nn ← knn2nb(knearneigh(coords,k=5), row.names=IDs, sym=TRUE); #good

With this change, the rest of the function works.

NB: This was tested on nb_nyc_5nn rather than nb_nyc_5nn1 because I couldn’t make the latter work.

You can always use the functions is.symmetric.nb (desired output: TRUE) and make.sym.nb (in the event that the nb is not symmetric) to make sure everythign is still ok with nb_nyc_5nn1.)

EDIT: Here is the “safe” version of nbdata4stan

nbdata4stan = function(x) {

if (! is.symmetric.nb(x)){
stop(“The neihbourhood you provided is not symmetric. Use make.sym.nb to fix this!”)
}

N = length(x);
n_links = 0;
for (i in 1:N) {
if (x[[i]][1] != 0) {
n_links = n_links + length(x[[i]]);
}
}
N_edges = n_links / 2;
node1 = vector(mode=“numeric”, length=N_edges);
node2 = vector(mode=“numeric”, length=N_edges);
idx = 0;
for (i in 1:N) {
if (x[[i]][1] != 0) {
idx_tmp=idx
for (j in 1:length(x[[i]])) {
n2 = unlist(x[[i]][j]);

    if (i < n2) { 
      idx = idx + 1; 
      node1[idx] = i; 
      node2[idx] = n2; 
    } 
  } 
  if ( idx_tmp==idx) { print(i)}
} 

}
return (list(“N”=N,“N_edges”=N_edges,“node1”=node1,“node2”=node2));
}

Ok. I know where the problem is. And I think I know how to fix it.

many thanks for identifying and fixing the problem!
I’ll update the code for this function in the case study accordingly!

It indeed ran successfully, and the speed improved as well (~20 min). The results look much better:

print(bym2_nyc_stanfit, digits=3, pars=c(“lp__”, “beta0”, “beta1”, “rho”, “sigma”, “mu[1]”, “mu[2]”, “mu[3]”, “mu[4]”, “mu[5]”, “mu[1001]”, “mu[1002]”,“mu[1003]”,“mu[1004]”,“mu[1005]”, “mu[2114]”, “mu[2115]”, “mu[2116]”, “mu[2117]”, “mu[2118]”), probs=c(0.025, 0.5, 0.975));
Inference for Stan model: bym2_predictor_only.
4 chains, each with iter=6000; warmup=3000; thin=1;
post-warmup draws per chain=3000, total post-warmup draws=12000.

>              mean se_mean     sd      2.5%       50%     97.5% n_eff  Rhat
> lp__     15069.600   0.985 57.223 14960.410 15069.149 15182.101  3372 1.000
> beta0        1.073   0.001  0.036     1.002     1.074     1.145  3066 1.000
> beta1        0.078   0.000  0.004     0.070     0.078     0.087  2716 1.000
> rho          0.578   0.002  0.055     0.468     0.579     0.682   537 1.005
> sigma        0.877   0.002  0.038     0.807     0.875     0.956   605 1.004
> mu[1]        7.641   0.026  2.807     3.382     7.239    14.229 12000 1.000
> mu[2]       12.552   0.035  3.840     6.442    12.050    21.250 12000 1.000
> mu[3]       15.450   0.040  4.383     8.206    14.980    25.361 12000 1.000
> mu[4]       16.739   0.045  4.975     8.895    16.157    28.099 12000 1.000
> mu[5]       17.434   0.042  4.547     9.881    16.965    27.448 12000 1.000
> mu[1001]     5.174   0.020  2.230     1.971     4.815    10.450 12000 1.000
> mu[1002]    11.701   0.035  3.884     5.523    11.236    20.666 12000 1.000
> mu[1003]    12.255   0.035  3.857     6.135    11.833    21.126 12000 1.000
> mu[1004]     3.047   0.014  1.519     0.940     2.750     6.756 12000 1.000
> mu[1005]     5.419   0.020  2.243     2.092     5.064    10.702 12000 1.000
> mu[2114]     6.920   0.024  2.673     2.877     6.507    13.159 12000 1.000
> mu[2115]    16.810   0.042  4.597     9.260    16.331    27.131 12000 1.000
> mu[2116]     4.596   0.018  1.955     1.707     4.281     9.221 12000 1.000
> mu[2117]     7.212   0.024  2.677     3.132     6.819    13.523 12000 1.000
> mu[2118]     7.993   0.028  3.016     3.408     7.545    15.094 12000 1.000

Correspondingly below, the input count (y) and expected (x) values (i.e., citywide rate applied to census tract population) for the same 15 observations. Data usage note: original crash files were provided by NYC DOT. We assume responsibility for the accuracy of the aggregate data used for analysis, as shown.

print(nyckidped_tract[c(1:5,1001:1005,2114:2118),])
      ct2010full                                        geoname      years agegroup       type count  expected
2    36005000200         Census Tract 2, Bronx County, New York 2011to2014 0to19yrs Pedestrian     7  9.712549
3    36005000400         Census Tract 4, Bronx County, New York 2011to2014 0to19yrs Pedestrian    12 12.694855
4    36005001600        Census Tract 16, Bronx County, New York 2011to2014 0to19yrs Pedestrian    15 12.505271
5    36005001900        Census Tract 19, Bronx County, New York 2011to2014 0to19yrs Pedestrian    16  3.930228
6    36005002000        Census Tract 20, Bronx County, New York 2011to2014 0to19yrs Pedestrian    17 21.619900
1017 36047097000       Census Tract 970, Kings County, New York 2011to2014 0to19yrs Pedestrian     5  3.901061
1018 36047097400       Census Tract 974, Kings County, New York 2011to2014 0to19yrs Pedestrian    11  4.783357
1019 36047098200       Census Tract 982, Kings County, New York 2011to2014 0to19yrs Pedestrian    12 11.149014
1020 36047098400       Census Tract 984, Kings County, New York 2011to2014 0to19yrs Pedestrian     2  3.944811
1021 36047098600       Census Tract 986, Kings County, New York 2011to2014 0to19yrs Pedestrian     5  5.534403
2163 36085030301 Census Tract 303.01, Richmond County, New York 2011to2014 0to19yrs Pedestrian     7  9.406297
2164 36085030302 Census Tract 303.02, Richmond County, New York 2011to2014 0to19yrs Pedestrian    17 14.094862
2165 36085031901 Census Tract 319.01, Richmond County, New York 2011to2014 0to19yrs Pedestrian     4  8.866711
2166 36085031902 Census Tract 319.02, Richmond County, New York 2011to2014 0to19yrs Pedestrian     7 13.022982
2167 36085032300    Census Tract 323, Richmond County, New York 2011to2014 0to19yrs Pedestrian     9  2.501054

It looks like we are well on our way! As a sensitivity analysis I’ll try running with different neighborhood adjacency methods.

Cheers!

1 Like

Great news! Thanks for reporting back with successful run results. It makes us very happy.

2 Likes

I’ve just found this paper on islands: A note on intrinsic Conditional Autoregressive models for disconnected graphs, Anna Freni-Sterrantino et al, [1705.04854] A note on intrinsic Conditional Autoregressive models for disconnected graphs
can the Stan model be coded-up accordingly?