Random walk in RStan

I recently attempted using random walk in the rethinking::ulam interface to RStan and failed. @richard_mcelreath replied that ulam does not currently support random walk.

Now I am looking into achieving this directly in Stan. I found several previous discussions on the topic in this forum, e.g. here and here, but still do not grasp the basics of implementing random walk in Stan. Can someone recommend a textbook, vignette or other detailed guide?

Specifically, I am trying to model the decay rate parameter k of the exponential decay function using random walk to allow it to vary within limits from one step of the predictor to the next. The idea came from this paper and this PhD thesis, but the authors use LibBi instead of Stan. My failed attempt MRE in R is this:

# generate data
set.seed(10)
x <- 1:250
r <- rnorm(x, mean = 0, sd = sqrt(x^-1.5))
y <- 2 * exp(-0.04 * x) + r

require(tidyverse)
df <- tibble(x = x, y = y)
ggplot(df, aes(x, y)) +
  geom_point() +
  theme_minimal()

# prepare data for ulam
l <- as.list(df)
l$N <- length(l$x)

# run model
require(rethinking)
rw.mod <- ulam(
  alist(
    y ~ dnorm(mean = mu, sd = sigma),
    mu <- a * exp(-k * x),
    k <- 0.04,
    for (i in 2:N) {
      k[i] <- k[i - 1] + r
    },
    r ~ dnorm(mean = 0, sd = tau),
    a ~ dlnorm(meanlog = log(2), sdlog = 0.5),
    c(sigma, tau) ~ dexp(rate = 1)
  ), data = l, chains = 8, cores = parallel::detectCores(), iter = 1e4
)

Any help is greatly appreciated!

maybe not really related, but i don’t know what the ulam is for

Thanks for the reply @aakhmetz. rethinking::ulam is an R function that I have used in the past to drive Stan. It takes input in a hierarchical syntax inspired by the way you would write out the model mathematically, i.e. likelihood function > prediction function with parameters > priors for parameters. This is then translated into Stan code internally and rstan does the rest. You can find more here.

Since ulam is not capable of expressing a random walk prior for the parameter k, alongside other limitations, I am learning how to work directly in Stan via rstan. However, since I am still learning how to express models using Stan code, I was after some help with the outlined application. I have pieced together information from previous “random walk prior” questions on this forum (here and here) and it seems that the random walk should be specified in the transformed parameters block. Is this correct?

I have updated my MRE by changing the function to generate data to a double exponential following this paper to test if the random walk of k follows the expected exponential decay trend. I am specifying the random walk in the transformed parameters block which gives me one k parameter per step in x. From trial and error it seems that if the random walk is specified in the model block instead, predictions cannot be derived for each step in x. The way I specify the random walk follows the description in the paper I referenced in my original question:

Briefly, using the Library for Bayesian Inference, LibBi (Murray, 2013), the rate of detrital decay was described by a single first order differential equation ∂W/∂t = −kW in time, where k was the decay rate. However, this rate was formulated to change through time by replacing the simple deterministic model with a random walk stochastic process by setting the decay rate at any time to be a function of the decay rate at the previous time-step. This was done by taking k, previously a constant parameter, and replacing it by K(t) such that K(t + ∆t) = K(t) + r where r was normally distributed with mean 0 and standard deviation σr, and ∆t was the length of discrete time step. Using a Bayesian analysis, σr was treated as a parameter to be inferred and a prior distribution was prescribed. In the interest of informing the model during time-steps when no observations were available, we set a quasi-informative prior on σr to be a normal distribution with mean 0.1 and standard deviation 0.01. This was chosen to allow sufficient flexibility for the observations to influence the posterior while the assumption of Gaussian noise essentially imposes an element of smoothness, disallowing large jumps per time-step. To ensure the decay rate did not fluctuate between positive and negative numbers during the random walk, the solution was computed in log space and transformed back before solving. Length of time step ∆t was set to 1 day.

I am not sure if this is the typical way of specifying random walk and would appreciate some feedback. Here is the code:

# generate data
set.seed(10)
x <- 1:250
r <- rnorm(x, mean = 0, sd = x^-1)
c0 <- 2
a <- 0.01
b <- 0.07
k <- 0.01
y <- c0 * exp(-(a * x - b/k * (exp(-k * x) - 1))) + r
# the underlying function is a double exponential
# where the nested exponential function is integrated
# before being substituted into the first
# (see doi 10.1016/j.geoderma.2009.11.033)

require(tidyverse)
require(magrittr)
df <- tibble(x = x, y = y)

# plot data
ggplot(df, aes(x, y)) +
  geom_point() +
  theme_minimal()

# write Stan code
stan_code <- "
data {
  int<lower=0> n; // Number of observations
  vector[n] y;    // Response variable
  vector[n] x;    // Predictor variable
}

parameters {
  real<lower=0> sigma; // Standard deviation
  real<lower=0> a;  // Intercept
  real<lower=0> ks;  // Initial decay rate
  real r; // Random step
  real<lower=0> tau; // Precision of random step
}

transformed parameters {
  // Random walk
  vector[n] k; // Decay rate
  k[1] = ks; // Set starting value of random walk
  for(i in 2:n){
    k[i] = k[i - 1] + r; // Implement random walk for k
  }
}

model {
  // Priors
  sigma ~ exponential(1);
  a ~ lognormal(log(2), 0.5);
  ks ~ lognormal(log(0.08), 0.5);
  r ~ normal(0, tau);
  tau ~ lognormal(log(0.1), 0.2);
  
  // Prediction function
  vector[n] mu; // Mean prediction for y
  for(i in 1:n){
    mu[i] = a * exp(-k[i] * x[i]); // Calculate mu using updated k
  }
  
  // Likelihood
  y ~ normal(mu, sigma);
}
"

# draw samples using RStan
require(rstan)
require(tidybayes)
rw.mod <- stan(model_code = stan_code,
               data = compose_data(df), iter = 1e4,
               chains = 8, cores = parallel::detectCores())

# check Rhat, n_eff and chains
options(max.print = 1e4)
rw.mod
require(rethinking)
trankplot(rw.mod)

# extract prior and posterior distributions
rw.mod %<>% recover_types(df)
rw.mod.samples <- rw.mod %>% 
  spread_draws(sigma, a, ks, r, tau, k[x]) %>%
  mutate(sigma_prior = rexp(4e4, 1), # add priors
         a_prior = rlnorm(4e4, log(2), 0.5),
         ks_prior = rlnorm(4e4, log(0.08), 0.5),
         tau_prior = rlnorm(4e4, log(0.1), 0.2)) 

# prior-posterior comparison
rw.mod.samples %>%
  ungroup() %>%
  filter(x == 1) %>%
  pivot_longer(cols = c(4:8, 11:14),
               names_to = c("parameter", "distribution"),
               names_sep = "_",
               values_to = "samples") %>%
  mutate(parameter = fct(parameter),
         distribution = fct(ifelse(is.na(distribution), 
                                   "posterior", distribution))) %>%
  ggplot(aes(samples, fill = distribution)) +
    facet_wrap(~parameter, scales = "free", nrow = 1) +
    geom_density(colour = NA, alpha = 0.5) +
    theme_void() +
    theme(legend.position = "top", legend.justification = 0)

# summarise and plot predictions for k as a function of x
rw.mod.samples %>%
  summarise(k.mean = mean(k),
            k.pi_lwr50 = PI(k, 0.5)[1],
            k.pi_upr50 = PI(k, 0.5)[2],
            k.pi_lwr80 = PI(k, 0.8)[1],
            k.pi_upr80 = PI(k, 0.8)[2],
            k.pi_lwr90 = PI(k, 0.9)[1],
            k.pi_upr90 = PI(k, 0.9)[2]) %>%
  pivot_longer(cols = starts_with("k.pi"),
               names_to = c("limit", "percentile"),
               names_pattern = ".*_([a-z]+)([0-9]+)") %>%
  pivot_wider(names_from = "limit", values_from = "value") %>%
  ggplot(aes(x = x, y = k.mean, ymin = lwr, ymax = upr, 
             alpha = percentile)) +
    geom_line() +
    geom_ribbon() +
    scale_alpha_manual(values = c(0.5, 0.4, 0.3), guide = "none") +
    theme_minimal()

# summarise and plot predictions for mu
rw.mod.samples %>%
  mutate(mu = a * exp(-k * x)) %>%
  summarise(mu.mean = mean(mu),
            mu.pi_lwr50 = PI(mu, 0.5)[1],
            mu.pi_upr50 = PI(mu, 0.5)[2],
            mu.pi_lwr80 = PI(mu, 0.8)[1],
            mu.pi_upr80 = PI(mu, 0.8)[2],
            mu.pi_lwr90 = PI(mu, 0.9)[1],
            mu.pi_upr90 = PI(mu, 0.9)[2]) %>%
  pivot_longer(cols = starts_with("mu.pi"),
               names_to = c("limit", "percentile"),
               names_pattern = ".*_([a-z]+)([0-9]+)") %>%
  pivot_wider(names_from = "limit", values_from = "value") %>%
  ggplot(aes(x = x, y = mu.mean)) +
    geom_line() +
    geom_ribbon(aes(ymin = lwr, ymax = upr, alpha = percentile)) +
    scale_alpha_manual(values = c(0.5, 0.4, 0.3), guide = "none") +
    geom_point(data = df, aes(x, y)) +
    theme_minimal()

While the model seems to run fine and produce decent predictions for mu, I am very surprised to find k decrease linearly over x when the function used to simulate the data clearly models k using the integral of the function a + b * exp(-k * x), so the expected random walk should follow exponential decay over x in this example. From viewing plotted results of random walk, I am used to more noise, which is entirely absent from the linear prediction I get for k. Is my tau prior too restrictive?

Just tagging some of the pros I’ve identified from other “random walk prior” posts in the hope that they may have some time to have a look at this: @bbbales2, @Bob_Carpenter, @mike-lawrence, @jgellar.

Update
The authors of the paper have kindly provided their rbi code, rbi being the R interface to LibBi. I can now see that rather than using the exponential decay function, derived by integration, the authors use the ordinary differential equation (ODE) and solve it internally with an RK4 ODE solver, which I know is also possible in Stan.

model decay {
  state W, k, K
  noise r
  param sigma_r
  obs W_obs
  sub parameter {
    sigma_r ~ normal(0.1, 0.01)
  }
  const prop_std = 0.5;
  sub proposal_parameter {
    sigma_r ~ normal(sigma_r, 0.01*prop_std)
  }
  sub initial {
    k ~ normal(log(0.003), 0.8)
    K ~ log_normal(log(0.003), 0.8)
    W ~ normal(1.0, 0.1)
    25
  }
  sub transition(delta = 1.0) {
    r ~ normal(0.0, sigma_r)
    k <- k + r
    K <- exp(k)
    ode(h = 0.01, atoler = 1.0e-6, rtoler = 1.0e-9, alg = 'RK4(3)'){
      dW/dt = -K*W
    }
  }
  sub observation {
    W_obs~log_normal(log(W),0.1)
  }
}

Is there someone fluent in LibBi and Stan, who can translate this into Stan?