Thanks for the reply @aakhmetz. `rethinking::ulam` is an R function that I have used in the past to drive Stan. It takes input in a hierarchical syntax inspired by the way you would write the model out mathematically, i.e. likelihood function > prediction function with parameters > priors for parameters. This is then translated into Stan code internally and `rstan` does the rest. You can find more here.
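For context, here is a toy example of the `ulam()` syntax I mean (illustrative only: a simple exponential decay with made-up data, not the random-walk model below):

```r
library(rethinking)
# toy example of ulam()'s hierarchical syntax (illustrative only)
d <- data.frame(x = 1:50, y = 2 * exp(-0.05 * (1:50)))
fit <- ulam(
  alist(
    y ~ dnorm(mu, sigma),      # likelihood
    mu <- a * exp(-k * x),     # prediction function with parameters
    a ~ dlnorm(log(2), 0.5),   # priors
    k ~ dlnorm(log(0.08), 0.5),
    sigma ~ dexp(1)
  ),
  data = d, chains = 4
)
```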
Since `ulam` cannot express a random walk prior for the parameter k, among other limitations, I am learning to work directly in Stan via `rstan`. However, since I am still learning how to express models in Stan code, I was after some help with the outlined application. I have pieced together information from previous “random walk prior” questions on this forum (here and here), and it seems that the random walk should be specified in the `transformed parameters` block. Is this correct?
I have updated my MRE by changing the data-generating function to a double exponential, following this paper, to test whether the random walk of k recovers the expected exponential decay trend. I specify the random walk in the `transformed parameters` block, which gives me one k parameter per step in x. From trial and error it seems that if the random walk is specified in the `model` block instead, predictions cannot be extracted for each step in x, presumably because local variables declared in the `model` block are not saved in the output. The way I specify the random walk follows the description in the paper I referenced in my original question:
> Briefly, using the Library for Bayesian Inference, LibBi (Murray, 2013), the rate of detrital decay was described by a single first order differential equation ∂W/∂t = −kW in time, where k was the decay rate. However, this rate was formulated to change through time by replacing the simple deterministic model with a random walk stochastic process by setting the decay rate at any time to be a function of the decay rate at the previous time-step. This was done by taking k, previously a constant parameter, and replacing it by K(t) such that K(t + ∆t) = K(t) + r where r was normally distributed with mean 0 and standard deviation σr, and ∆t was the length of discrete time step. Using a Bayesian analysis, σr was treated as a parameter to be inferred and a prior distribution was prescribed. In the interest of informing the model during time-steps when no observations were available, we set a quasi-informative prior on σr to be a normal distribution with mean 0.1 and standard deviation 0.01. This was chosen to allow sufficient flexibility for the observations to influence the posterior while the assumption of Gaussian noise essentially imposes an element of smoothness, disallowing large jumps per time-step. To ensure the decay rate did not fluctuate between positive and negative numbers during the random walk, the solution was computed in log space and transformed back before solving. Length of time step ∆t was set to 1 day.
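If I read that correctly, keeping the rate positive means the walk is effectively taken on log K rather than on K itself. A minimal sketch of that update as I understand it (my own interpretation in R; the placeholder values are mine, not from the paper):

```r
# sketch of my reading of the paper's log-space random walk
# (n_steps, sigma_r and K0 are placeholder values, not from the paper)
n_steps <- 100
sigma_r <- 0.1
K0      <- 0.08
log_K   <- numeric(n_steps)
log_K[1] <- log(K0)
for (t in 2:n_steps) {
  log_K[t] <- log_K[t - 1] + rnorm(1, mean = 0, sd = sigma_r)  # r ~ N(0, sigma_r)
}
K <- exp(log_K)  # transform back before solving dW/dt = -K(t) * W
```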
I am not sure if this is the typical way of specifying a random walk and would appreciate some feedback. Here is the code:
```r
# generate data
set.seed(10)
x <- 1:250
r <- rnorm(length(x), mean = 0, sd = x^-1)
c0 <- 2
a <- 0.01
b <- 0.07
k <- 0.01
y <- c0 * exp(-(a * x - b/k * (exp(-k * x) - 1))) + r
# the underlying function is a double exponential
# where the nested exponential function is integrated
# before being substituted into the first
# (see doi 10.1016/j.geoderma.2009.11.033)
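# spelling out the algebra (my note): the time-varying decay rate is
# K(t) = a + b * exp(-k * t); integrating from 0 to x gives
#   a * x + b/k * (1 - exp(-k * x)) = a * x - b/k * (exp(-k * x) - 1),
# which is the exponent used above, so the true decay rate declines
# exponentially from a + b = 0.08 towards a = 0.01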
require(tidyverse)
require(magrittr)
df <- tibble(x = x, y = y)
# plot data
ggplot(df, aes(x, y)) +
  geom_point() +
  theme_minimal()
# write Stan code
stan_code <- "
data {
  int<lower=0> n;       // Number of observations
  vector[n] y;          // Response variable
  vector[n] x;          // Predictor variable
}
parameters {
  real<lower=0> sigma;  // Residual standard deviation
  real<lower=0> a;      // Initial value (c0 in the simulation)
  real<lower=0> ks;     // Initial decay rate
  real r;               // Random step
  real<lower=0> tau;    // Standard deviation of random step
}
transformed parameters {
  // Random walk
  vector[n] k;          // Decay rate
  k[1] = ks;            // Set starting value of random walk
  for (i in 2:n) {
    k[i] = k[i - 1] + r; // Implement random walk for k
  }
}
model {
  // Priors
  sigma ~ exponential(1);
  a ~ lognormal(log(2), 0.5);
  ks ~ lognormal(log(0.08), 0.5);
  r ~ normal(0, tau);
  tau ~ lognormal(log(0.1), 0.2);
  // Prediction function
  vector[n] mu;         // Mean prediction for y
  for (i in 1:n) {
    mu[i] = a * exp(-k[i] * x[i]); // Calculate mu using updated k
  }
  // Likelihood
  y ~ normal(mu, sigma);
}
"
# draw samples using RStan
require(rstan)
require(tidybayes)
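# note (my understanding): tidybayes::compose_data(df) returns a list with
# elements x, y and n = nrow(df), matching the data block above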
rw.mod <- stan(model_code = stan_code,
               data = compose_data(df), iter = 1e4,
               chains = 8, cores = parallel::detectCores())
# check Rhat, n_eff and chains
options(max.print = 1e4)
rw.mod
require(rethinking)
trankplot(rw.mod)
# extract prior and posterior distributions
rw.mod %<>% recover_types(df)
rw.mod.samples <- rw.mod %>%
  spread_draws(sigma, a, ks, r, tau, k[x]) %>%
  mutate(sigma_prior = rexp(4e4, 1), # add priors
         a_prior = rlnorm(4e4, log(2), 0.5),
         ks_prior = rlnorm(4e4, log(0.08), 0.5),
         tau_prior = rlnorm(4e4, log(0.1), 0.2))
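# (spread_draws() returns the draws grouped by x, so each r*() call above
# generates 4e4 = 8 chains x 5e3 post-warmup iterations values per x group)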
# prior-posterior comparison
rw.mod.samples %>%
  ungroup() %>%
  filter(x == 1) %>%
  pivot_longer(cols = c(4:8, 11:14),
               names_to = c("parameter", "distribution"),
               names_sep = "_",
               values_to = "samples") %>%
  mutate(parameter = fct(parameter),
         distribution = fct(ifelse(is.na(distribution),
                                   "posterior", distribution))) %>%
  ggplot(aes(samples, fill = distribution)) +
  facet_wrap(~parameter, scales = "free", nrow = 1) +
  geom_density(colour = NA, alpha = 0.5) +
  theme_void() +
  theme(legend.position = "top", legend.justification = 0)
# summarise and plot predictions for k as a function of x
rw.mod.samples %>%
  summarise(k.mean = mean(k),
            k.pi_lwr50 = PI(k, 0.5)[1],
            k.pi_upr50 = PI(k, 0.5)[2],
            k.pi_lwr80 = PI(k, 0.8)[1],
            k.pi_upr80 = PI(k, 0.8)[2],
            k.pi_lwr90 = PI(k, 0.9)[1],
            k.pi_upr90 = PI(k, 0.9)[2]) %>%
  pivot_longer(cols = starts_with("k.pi"),
               names_to = c("limit", "percentile"),
               names_pattern = ".*_([a-z]+)([0-9]+)") %>%
  pivot_wider(names_from = "limit", values_from = "value") %>%
  ggplot(aes(x = x, y = k.mean, ymin = lwr, ymax = upr,
             alpha = percentile)) +
  geom_line() +
  geom_ribbon() +
  scale_alpha_manual(values = c(0.5, 0.4, 0.3), guide = "none") +
  theme_minimal()
# summarise and plot predictions for mu
rw.mod.samples %>%
  mutate(mu = a * exp(-k * x)) %>%
  summarise(mu.mean = mean(mu),
            mu.pi_lwr50 = PI(mu, 0.5)[1],
            mu.pi_upr50 = PI(mu, 0.5)[2],
            mu.pi_lwr80 = PI(mu, 0.8)[1],
            mu.pi_upr80 = PI(mu, 0.8)[2],
            mu.pi_lwr90 = PI(mu, 0.9)[1],
            mu.pi_upr90 = PI(mu, 0.9)[2]) %>%
  pivot_longer(cols = starts_with("mu.pi"),
               names_to = c("limit", "percentile"),
               names_pattern = ".*_([a-z]+)([0-9]+)") %>%
  pivot_wider(names_from = "limit", values_from = "value") %>%
  ggplot(aes(x = x, y = mu.mean)) +
  geom_line() +
  geom_ribbon(aes(ymin = lwr, ymax = upr, alpha = percentile)) +
  scale_alpha_manual(values = c(0.5, 0.4, 0.3), guide = "none") +
  geom_point(data = df, aes(x, y)) +
  theme_minimal()
```
While the model seems to run fine and produces decent predictions for mu, I am very surprised to find k decreasing linearly over x, when the function used to simulate the data clearly models k via the integral of `a + b * exp(-k * x)`, so the expected random walk should follow exponential decay over x in this example. From plotted random walks I have seen elsewhere, I am used to much more noise, which is entirely absent from the linear trend I get for k. Is my tau prior too restrictive?
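For reference, the decay rate implied by the simulation is K(x) = a + b * exp(-k * x) = 0.01 + 0.07 * exp(-0.01 * x), which falls exponentially from 0.08 towards 0.01, not linearly. One thing I was unsure about is whether r should be a single parameter or one innovation per step; my (untested) reading of the linked posts is something like the following, which replaces the single r with a vector:

```stan
data {
  int<lower=0> n;
  vector[n] y;
  vector[n] x;
}
parameters {
  real<lower=0> sigma;
  real<lower=0> a;
  real<lower=0> ks;
  vector[n - 1] r;    // one innovation per step (my guess, untested)
  real<lower=0> tau;  // innovation SD
}
transformed parameters {
  vector[n] k;
  k[1] = ks;
  for (i in 2:n) {
    k[i] = k[i - 1] + r[i - 1];  // each step gets its own noise term
  }
}
model {
  sigma ~ exponential(1);
  a ~ lognormal(log(2), 0.5);
  ks ~ lognormal(log(0.08), 0.5);
  r ~ normal(0, tau);            // vectorised: r[j] ~ normal(0, tau)
  tau ~ lognormal(log(0.1), 0.2);
  y ~ normal(a * exp(-k .* x), sigma);
}
```

As written this does not keep k positive, so presumably the walk should be taken on log(k) instead, as in the quoted paper; corrections welcome.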
Just tagging some of the pros I’ve identified from other “random walk prior” posts in the hope that they may have some time to have a look at this: @bbbales2, @Bob_Carpenter, @mike-lawrence, @jgellar.