Workaround using truncated parameters for indexing?

Hi all,

I am trying to translate a method from Jags to Stan that estimates the ‘true’ effective scale of an environmental variable. Among other things, the model estimates the parameter (SS), which indexes an array of environmental variables (E, where each column is the environment at a different scale). In Jags, this was done by truncating the parameter, and then using linear interpolation:

SS.trunc = trunc(SS)
prop.scales = SS-SS.trunc
\boldsymbol{w}[j] = \boldsymbol{E}[j,SS.trunc]*(1-prop.scales) + \boldsymbol{E}[j, SS.trunc + 1)]*prop.scales

Where w is the true effect of the environment and j is the sample site.

Because the method does use linear interpolation and each environmental scale is quite similar to the others, it might still produce a smooth gradient?

But since the output from Stan’s trunc function is a real number and can’t be used to index arrays (according to the Stan Functions Reference), I can’t figure out how to specify this model in Stan.

Any ideas about any alternative ways to specify this model in Stan?

Thank you!

The simplest approach here would be to iteratively lookup the best-matching integer for indexing and define the likelihood using that:

stan_interp = "
functions {
  vector stan_interp(real SS, matrix E) {
    int index = cols(E);

    // If SS is already above the maximum index then
    // we can skip the interpolation
    if (index < SS) {
      return E[:, index];
    }

    while (index > SS) {
      index -= 1;
    }
    
    real prop_scales = SS - index;

    // Assumes same value of SS for every row
    return E[:,index] * (1 - prop_scales) + E[:,index + 1] * prop_scales;
  }
}
"

rstan::expose_stan_functions(rstan::stanc(model_code=stan_interp))

r_interp <- function(SS, E) {
  if (SS > ncol(E)) {
    return(E[, ncol(E)])
  }
  SS_trunc = as.integer(trunc(SS))
  prop_scales = SS - SS_trunc
  E[,SS_trunc] * (1 - prop_scales) + E[,SS_trunc + 1] * prop_scales
}

E <- matrix(1:4, nrow=2, ncol=2)
E
#>      [,1] [,2]
#> [1,]    1    3
#> [2,]    2    4

r_interp(1.4, E)
#> [1] 1.8 2.8
stan_interp(1.4, E)
#> [1] 1.8 2.8

# No interpolation needed
r_interp(2.4, E)
#> [1] 3 4
stan_interp(2.4, E)
#> [1] 3 4

Created on 2022-09-29 with reprex v2.0.2

If you run into inefficient/poor sampling with the above approach, then you’ll likely need to resort to marginalisation or something similar. If you share the likelihood/distribution being used for SS then I can see if there’s an alternative approach

1 Like

Thanks so much! I’ll try this on some simulated data and see if it yields efficient sampling.

1 Like

Your solution works well on simulated data – thanks so much for your help!