Case study on spatial models for areal data - Poisson CAR/IAR

mitzimorris · May 9, 2017, 8:45pm

I’ve got a writeup with example models checked into stan-dev/example-models repo:
https://github.com/stan-dev/example-models/tree/IAR-case-study

the case study is in directory example-models/knitr/car-iar-poisson.

It contains a short summary of the math for CAR/IAR models, followed by example data taken from Charles DiMaggio. Feedback, comments, suggestions, and above all corrections welcome!

cheers,
Mitzi

flaxter · May 9, 2017, 9:02pm

Hi Mitzi. Cool! For pedagogical purposes, it might be nice to show (excerpts) of the Stan output at the end. How’d you know that the model didn’t fit? How slow is slow? Etc.

flaxter · May 9, 2017, 9:06pm

also is there a way for https://github.com/stan-dev/example-models/blob/IAR-case-study/knitr/car-iar-poisson/IAR_Stan.Rmd to render latex when I view it in github?

mitzimorris · May 9, 2017, 9:40pm

I only checked in the Rmd - if you run it through knitr, you’ll see the Stan plots. also, if you view the Rmd file from within RStudio, the latex formulas will render nicely.

betanalpha · May 10, 2017, 3:15pm

This looks great!

There is some significant overhead it getting to the HTML as the R requires a ton of domain-specific R packages and some of the Stan models take a few minutes to run. Could you check in the generated HTML and any necessary resources so it can be read immediately without rendering?

Possible to have a figure demonstrating a lattice like model highlighting the Markov blanket?

You note that spatio-temporal data is typically 1-4 dimensions, perhaps it’s worth explicitly noting this is due to there being 1-3 spatial dimensions and 1 time dimension?

To avoid confusion I suggest using “random variable” everywhere instead of shortening to RV.

You have two tau’s missing a proceeding slash after describing the “pairwise difference form”.

Maybe note that INLA is an alternative to MCMC with difference speed and accuracy tradeoffs. ;-)

Using {r, comment=NA} to open a block of R code will drop the “##” that prefix each output line.

I recommend adding these two bits to the end:

writeLines(readLines(file.path(Sys.getenv("HOME"), ".R/Makevars")))

devtools::session_info("rstan")

Also, you can get the references printed in the footer if you add

bibliography: divergences_and_bias.bib

to the knitr header.

mitzimorris · May 19, 2017, 6:26pm

Made changes to the case study per Micheal’s suggestions, and merged my branch with master:

As noted, the Rmd document takes a while to build, and it does require a whole bunch of R libraries, so I checked in the generated HTML file as well. But GitHub won’t show it to you, so you’ll have to download it or check out the repo locally.

cheers,
Mitzi

betanalpha · May 19, 2017, 6:34pm

This looks awesome, especially the maps at the end!

If I could make some very minor, pedantic comments: you never define for what the acronyms CAR and IAR stand and the equation for the density of w should use a \exp instead of the raw exp.

mitzimorris · May 19, 2017, 7:33pm

thanks! changed made; html regenerated. should we add this to our list of case studies on the website?

cheers,
Mitzi

betanalpha · May 19, 2017, 9:08pm

Please! You can push directly to the website repo or I can do it later if you’d prefer.

mitzimorris · May 23, 2017, 6:04pm

added and pushed.
http://mc-stan.org/documentation/case-studies/IAR_Stan.html

mitzimorris · June 21, 2017, 7:22am

case study has undergone a complete rewrite. thanks to help from Andrew and Bob, there’s now an correct and efficient implementation of Besag’s IAR spatial component. comparing the Stan fit with the result of running Brad Carlin’s WinBUGS model gets similar but not matching results, so either the two models differ or the samplers differ. pushed all of the models, data, R scripts, and the writeup to GitHub:

feedback and comments very much appreciated.

the earlier versions of the IAR model had problems, so if you were using them, don’t and please try the new version.

mitzimorris · July 5, 2017, 10:25pm

I’ve reworked the case study yet again
executive summary:

complete derivation of “pairwise difference” formula
correctness check of CAR normal prior against WinBUGS car.normal function
mo’ better discussion of the Besag York Mollié model.

I’d love to get some feedback before putting this back up as a Stan case study.

like Bullwinkle says “this time for sure!” https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0ahUKEwiSwa_klvPUAhUBeT4KHcNpCZUQtwIIJDAA&url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3Dpc4IFIXcDcs&usg=AFQjCNGTe3H_TcFWGieWpctuFEwrDYuY6w

anon75146577 · July 6, 2017, 7:58pm

This is fabulous!

Some weird thoughts (that may or may not be useful):

I thought ICAR was the more common terminology (rather than IAR) for spatial models
In the final paragraph of the IAR section, it might be useful to say that the reason it’s non-generative is that it puts a flat prior on the intercept (that’s what mean-invarinace means in this context). So an ICAR model has a Jeffrey’s prior in one dimension (the overall intercept), and an informative prior in all dimensions orthogonal to that. (Writing that down right now, that may be too mathematical to be useful)
I don’t know what the Stan case study standard is, but I prefer my matrices to have ( . ) brackets, as the vertical lines make me think of determinants. In LaTeX that’s \begin{pmatrix}\end{pmatrix}
BYM section: I have very very very strong feelings about priors in this context (mainly that you shouldn’t use this parameterisation of the BYM). I don’t know if these case studies should document historical practice or best practice, but if the latter, there’s a long history of doing better (https://arxiv.org/abs/1601.01180 and refs within)

If you want some INLA comparisons in there I can do that for you fairly quickly.

anon75146577 · July 6, 2017, 8:34pm

Probably my other comment (which is not unique to this case study) is that this is very long and would benefit from a table of contents! Maybe with sections:

What is a conditional autoregressive model
Adding an ICAR component to a Stan model
Example: Disease mapping
so that people can find the bit that they’re looking for.

Bob_Carpenter · July 6, 2017, 9:04pm

Yes, real matrix notation please.

Comparisons with INLA would be fantastic, but that can go in the follow-on case study Mitzi’s going to do comparing to BUGS.

Best practice over historical practice. I’ve gone with some bad priors when trying to exactly replicate another study, but now that this is no longer comparing directly to BUGS, that’s no longer a concern.

Not sure what you meant by “parameterization of the BYM”. You mean the sum-to-zero constraint or the priors? Now that the goal isn’t to reproduce the GeoBUGS results exactly (we now think they are broken), we can just use our own priors.

anon75146577 · July 6, 2017, 9:06pm

I’m doing up something now to explain the parameterisation thing, although I’ve got to go see a depressing play this evening so it may take until tomorrow.

anon75146577 · July 6, 2017, 9:08pm

I’m not sure how to do this in github. Do I make a pull request with the changes? Or is there a different way.

betanalpha · July 6, 2017, 9:28pm

What’s more depressing that bad priors?

With everything that Mitzi has put into place it should be easy to modify/experiment with priors. The site-specific random effect prior is definitely an odd one!

I agree that a table of contents would be awesome.

betanalpha · July 6, 2017, 10:40pm

The matrices are also big enough that I feel like they would be better as not inline equations.

Also, was the line “indexing-fu plus vectorization!” intentional? It’s italicization and location right after a code block makes it someone ambiguous as to whether it was a left over excited comment or an intentional excited comment.

mitzimorris · July 7, 2017, 2:12pm

GitHub is the way to go.
I’m going to make the edits you suggested w/r/t to ToC, ICAR, and matrix formatting.

many thanks for the great feedback. reading the priors paper now.
cheers,
Mitzi

Topic		Replies	Views
BYM2 with unstructured country random effect Modeling	39	2957	February 11, 2022
Convergence / latest version of Stan issues with our covid-19 model? Modeling fitting-issues , epidemiology , covid-19	84	5486	May 9, 2020
Blog: A gentle Stan to INLA comparison Publicity	18	6700	June 5, 2018
My first Stan model - hierarchical logistic regression Modeling cognitive-science	28	8719	June 6, 2017
Stan on computing cluster: strange results CmdStan	11	1730	June 8, 2018

Case study on spatial models for areal data - Poisson CAR/IAR

Related topics