Workaround using truncated parameters for indexing?

hneyster · September 28, 2022, 6:58pm

Hi all,

I am trying to translate a method from Jags to Stan that estimates the ‘true’ effective scale of an environmental variable. Among other things, the model estimates the parameter (SS), which indexes an array of environmental variables (E, where each column is the environment at a different scale). In Jags, this was done by truncating the parameter, and then using linear interpolation:

SS.trunc = trunc(SS)
prop.scales = SS-SS.trunc
\boldsymbol{w}[j] = \boldsymbol{E}[j,SS.trunc]*(1-prop.scales) + \boldsymbol{E}[j, SS.trunc + 1)]*prop.scales

Where w is the true effect of the environment and j is the sample site.

Because the method does use linear interpolation and each environmental scale is quite similar to the others, it might still produce a smooth gradient?

But since the output from Stan’s trunc function is a real number and can’t be used to index arrays (according to the Stan Functions Reference), I can’t figure out how to specify this model in Stan.

Any ideas about any alternative ways to specify this model in Stan?

Thank you!

andrjohns · September 29, 2022, 6:30am

The simplest approach here would be to iteratively lookup the best-matching integer for indexing and define the likelihood using that:

stan_interp = "
functions {
  vector stan_interp(real SS, matrix E) {
    int index = cols(E);

    // If SS is already above the maximum index then
    // we can skip the interpolation
    if (index < SS) {
      return E[:, index];
    }

    while (index > SS) {
      index -= 1;
    }
    
    real prop_scales = SS - index;

    // Assumes same value of SS for every row
    return E[:,index] * (1 - prop_scales) + E[:,index + 1] * prop_scales;
  }
}
"

rstan::expose_stan_functions(rstan::stanc(model_code=stan_interp))

r_interp <- function(SS, E) {
  if (SS > ncol(E)) {
    return(E[, ncol(E)])
  }
  SS_trunc = as.integer(trunc(SS))
  prop_scales = SS - SS_trunc
  E[,SS_trunc] * (1 - prop_scales) + E[,SS_trunc + 1] * prop_scales
}

E <- matrix(1:4, nrow=2, ncol=2)
E
#>      [,1] [,2]
#> [1,]    1    3
#> [2,]    2    4

r_interp(1.4, E)
#> [1] 1.8 2.8
stan_interp(1.4, E)
#> [1] 1.8 2.8

# No interpolation needed
r_interp(2.4, E)
#> [1] 3 4
stan_interp(2.4, E)
#> [1] 3 4

^{Created on 2022-09-29 with reprex v2.0.2}

If you run into inefficient/poor sampling with the above approach, then you’ll likely need to resort to marginalisation or something similar. If you share the likelihood/distribution being used for SS then I can see if there’s an alternative approach

hneyster · September 30, 2022, 5:19pm

Thanks so much! I’ll try this on some simulated data and see if it yields efficient sampling.

hneyster · November 8, 2022, 5:49pm

Your solution works well on simulated data – thanks so much for your help!

Topic		Replies	Views
Using estimated variables as index Modeling	4	344	December 5, 2023
How to do matrix indexing in Stan General	0	451	March 21, 2019
Linear interpolation and searchsorted in stan Modeling	5	1787	April 16, 2021
Index variable in multilevel model Modeling	2	223	March 14, 2024
Rounding parameters to define rows in 'look up table' Modeling	3	454	May 24, 2019

Workaround using truncated parameters for indexing?

Related topics