# Recovering lscale in a two-dimensional Gaussian process regression

In order to better work with two-dimensional Gaussian process models in `brms` I would like to generate data with known parameters. Given that `brms` only has the exponential quadratic kernel implemented, in particular I’d like to see if I can recover the `lscale` parameter of this kernel.

In the following code, I generate data on a grid, and attempt to recover `lscale = 0.5`, and `sdx1x2 = 0.25`. Despite a sample size of 100 (the highest I want to go due to speed constraints), I am not able to recover a parameter value close to the `lscale` I used to generate the data, even with the `gp(..., scale=FALSE)` argument set. The `sdx1x2` parameter is also off. The model samples well.

Is this because I am misinterpreting the `lscale` parameter in some way, or have generated the data incorrectly. Alternatively, is it a sample size issue (i.e. more data are required to ensure generated data converges on intended parameters)?

``````library(MASS)
library(brms)

## Generate data with a exponential quadratic kernel
## covariance structure in two dimensions

set.seed(186429)
## Specify parameters
N <- 100
lscale <- 0.5
sdx1x2 <- 0.25
b0 <- 0

## Generate exponential quadratic kernel covariance given lscale
exp_quad <- function(D, lscale, eps=sqrt(.Machine\$double.eps)) {
Sigma <- exp(-D/2*lscale^2) + diag(eps, nrow(D))
}
rows <- seq(0, 1, length.out=trunc(sqrt(N)))
cols <- seq(0, 1, length.out=trunc(sqrt(N)))
df <- expand.grid(x1=rows, x2=cols)
Sigma <- exp_quad(as.matrix(dist(cbind(df\$x1, df\$x2))), lscale)
## Ensure variance reflects sdx1x2
Sigma <- Sigma * sdx1x2^2

## Draw from this Gaussian process
df\$gpx1x2 <- mvrnorm(n=1, mu=rep(0, nrow(Sigma)), Sigma=Sigma)

## Generate data y ~ b0 + gpx1x2
df\$y <- b0 + df\$gpx1x2

## Try to recover parameters using brms
## set gp(..., scale=FALSE) to maximize interpretability
m <- brm(y ~ gp(x1, x2, scale=FALSE), data=df,
iter=2000, cores=4, refresh=1)

summary(m)
Family: gaussian
Links: mu = identity; sigma = identity
Formula: y ~ gp(x1, x2, scale = FALSE)
Data: df (Number of observations: 100)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup samples = 4000

Gaussian Process Terms:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sdgp(gpx1x2)       0.06      0.02     0.04     0.10       1043 1.00
lscale(gpx1x2)     0.25      0.03     0.19     0.33        643 1.00

Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept     0.38      0.03     0.33     0.43       1502 1.00

Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     0.02      0.00     0.02     0.03       1818 1.00

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
is a crude measure of effective sample size, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

``````

That multiplies by `lscale^2`. Take a look at @bbbales2’s version:

``````cov_exp_quad = function(x, y, sigma, l) {
outer(x, y, function(xi, xj) { sigma^2 * exp(-(xi - xj)^2 / (2 * l^2)) })
}``````

This was very helpful @andre.pfeuffer . It demonstrated that the main issue with my code is that I was not squaring the distance, and secondarily I had the brackets in the wrong place.

Replace the `exp_quad()` function with this code:

``````exp_quad <- function(D, lscale, eps=sqrt(.Machine\$double.eps)) {
Sigma <- exp(-D^2/(2*lscale^2)) + diag(eps, nrow(D))
}
``````

``````df\$y <- b0 + df\$gpx1x2 + rnorm(nrow(Sigma), sd=0.1)
I took my formula for exponential quadratic from the `brms::gp()` help. Here is the rendered version: