I am tryin to fit a simple spatial GP model - first simulating the field with the R package geoR , function geoR::grf using an exponential covariance function with variance and range parameters (0.8 and 0.5) respectively. Then fitting the model in Stan does not recover the parameters (nor their product) - so I am wondering if I am doing something wrong.
The Stan code generally looks correct, though using the phi_mean ~ normal(0,100) sampling notation is probably preferred. It looks like the posterior credible intervals do contain the true parameter values. I would first suggest retrying with more simulated data. It’s not necessarily guaranteed that the posterior means will exactly recover the generative parameters (only in expectation). If problems persist here are some suggestions for troubleshooting:
Normal(0,100) is a very wide prior for the spatial process mean, implying that it could plausibly be between about -200 and 200. Does this make sense for your simulated data? Does the problem go away if you use e.g. Normal(0,1)?
Try directly simulating the field without relying on geoR or double checking with a different package to verify there isn’t an issue with that package or a mistranslation of the process parameters
Depends who you ask! I like the distribution notation. To help, we renamed it from “sampling statement” to “distribution statement” because it’s normally read “is distributed as”. Many folks, including some of our developers, are lobbying us to remove the ~ notation altogether because they find it’s confusing users because it’s not actually generating phi_mean by sampling a normal(0, 100). The exact equivalent to the sampling statement with target += is
target += normal_lupdf(phi_mean | 0, 100);
where the lupdf has the extra u for unnormalized, which drops constants like the distribution statement.
Like @js592, I’d also recommend tighter priors. The gamma prior on sigma2 has a mean of 8 (2 / 0.25) and a standard deviation of roughly 6 in case you thought we were using a different parameterization of gamma than shape and rate (aka inverse scale).
Otherwise, I’d recommend @js592’s advice of doing the simulation directly or in Stan. At the lowest level, I’d at least check that the different distributions have the same parameterizations.
Ah - indeed I mistakenly thought that the gamma distribution was parametrized with shape and scale. Thanks for catching that.
Thanks both for the recommendations. That helped.