Hi mc-stan, I’m trying again to improve my biogeochemical inverse modeling using Stan.
There is a phenomenon we can model like so:
\frac{dC}{dt} = A_t + R_t + M_t
A_t = u_t\beta_t
M_t = K_t \frac{(C_t - S_t)}{z}
I have observations of C, u, S and good prior knowledge of K and z.
I want to know R and \beta.
dt is short enough (half-hourly) that I’m not worrying about solving this as an ODE; assuming everything is static between time steps is fine. I have good reason to believe \beta and R change slowly, over a day or so, but this is one of the reasons for the study. Also, R is constrained to be positive by definition.
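In other words, within each half-hour step I just apply a plain forward-Euler update:
C_{t+1} = C_t + \Delta t \, (A_t + R_t + M_t)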
I’ve been doing a lot of reading regarding state-space models, which I believe this falls under given that many of my parameters vary over time, including Jeff Arnold’s page and many of the questions posted here.
Currently my model, which mostly works (and by “works” I mean it reproduces my synthetic data’s parameters), is a kind of multi-level model such that…
C_{t+1} = C_{t} + u_{t}\beta_j + M_{t} - R_i
\hat{C} \sim \mathcal{N}(C, \sigma)
\beta \sim \mathcal{N}(\bar{\beta}, \sigma_\beta)
\bar{\beta} \sim \mathcal{N}(0, 1)
R \sim \mathcal{N}(\bar{R}, \sigma_R)
\bar{R} \sim \mathcal{N}(0, 1)
\sigma \sim \mathrm{Cauchy}(0, 1)
\sigma_\beta \sim \mathrm{Cauchy}(0, 1)
\sigma_R \sim \mathrm{Cauchy}(0, 1)
...
The issues are:
This model’s ability to reproduce R from a known dataset is quite dependent on the sizes of the j and i groupings, that is to say the time windows over which \beta and R are held constant.
It’s very slow.
I’ve non-centred the multi-level components, which did help.
I’m still pretty new to this sort of modeling, and the Bayesian state-space resources tend to be mostly about prediction rather than uncovering the “true” hidden variables.
I feel like a Kalman-type model would perform much better, but having read through https://jrnold.github.io/ssmodels-in-stan/ I don’t really understand which bits I need to implement to get what I want.
I see there is now a gaussian_dlm_obs sampling statement. Is this what I want? I’m struggling to find an example which fits with what I’m trying to do.
I also just saw http://www.juhokokkala.fi/blog/posts/kalman-filter-style-recursion-to-marginalize-state-variables-to-speed-up-stan-inference/. Where would this sit compared to gaussian_dlm_obs?
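For what it’s worth, my (quite possibly wrong) reading of the gaussian_dlm_obs interface is something like the local-level sketch below (all names here are just placeholders, not from my model). The part I trip over is that F and G are single constant matrices, so I don’t see where my time-varying u_t regressor or the positivity constraint on R would go:

data {
  int<lower=1> N;       // number of time points
  matrix[1, N] y;       // one row per observed series, one column per time point (my reading of the docs)
}
transformed data {
  matrix[1, 1] F;       // observation matrix, constant over time
  matrix[1, 1] G;       // state transition, constant over time (random walk)
  vector[1] m0;         // prior mean of the initial state
  matrix[1, 1] C0;      // prior variance of the initial state
  F[1, 1] = 1;
  G[1, 1] = 1;
  m0[1] = 0;
  C0[1, 1] = 10;
}
parameters {
  cov_matrix[1] V;      // observation noise variance
  cov_matrix[1] W;      // state evolution variance
}
model {
  V[1, 1] ~ cauchy(0, 1);
  W[1, 1] ~ cauchy(0, 1);
  y ~ gaussian_dlm_obs(F, G, V, W, m0, C0);
}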
Similarly, I’ve tried to fit this with walker, using a simplified version of the DGF with just \frac{dC}{dt} = A_t + R_t, substituting y for C and x1 for u, and using rw2. But while it’s very quick to run, the parameters it returned do not look at all like my DGF.
I’m looking for examples closer to my use case, or perhaps an explanation of why my case is actually similar to X.
Plus, if anyone wants to weigh in on whether what I’m attempting is reasonable, please do.
Oh, and any coding tips for anything I’ve done stupidly in the multi-level version are also much appreciated.
In particular, I’d be very interested to hear from @jrnold and @james_savage
// demo version
data {
  int<lower=1> N;                      // number of samples
  int<lower=1> N_day;                  // number of days
  int<lower=1> N_filter;               // number of R windows
  int<lower=1> day[N];                 // day index of each sample (groups beta)
  int<lower=1> filter[N];              // window index of each sample (groups R)
  vector[N] C_hat;                     // observed O2
  vector[N] Cs;                        // observed O2 at surf
  vector[N] u;                         // east-west velocity
  real z_hat;                          // thermocline thickness
  real dt;                             // time step
}
transformed data {
  vector[N-1] u_avg;                   // u is semi-sinusoidal, improved estimate by taking average
  for (i in 1:(N-1)) {
    u_avg[i] = (u[i] + u[i+1]) * 0.5;
  }
}
parameters {
  real<lower=0> sigma;                 // observation noise
  real<lower=0> R_mu;                  // hierarchical location of R
  real<lower=0> R_tau;                 // hierarchical scale of R
  vector<lower=0>[N_filter] R_tilde;   // non-centred R deviates
  real<lower=1> z;                     // thermocline thickness
  real<lower=0> kz;                    // diffusivity (scaled)
  real u_mu;                           // hierarchical location of beta
  real<lower=0> u_tau;                 // hierarchical scale of beta
  vector[N_day] u_tilde;               // non-centred beta deviates
  real init;                           // offset of true C[1] from first observation
}
transformed parameters {
  vector[N_filter] R_raw;
  vector[N_day] beta_u;
  real C[N];                           // true oxygen
  C[1] = C_hat[1] + init;
  // non-centering
  beta_u = u_mu + u_tau * u_tilde;
  R_raw = (R_mu + R_tau * R_tilde) * 0.0001;
  for (i in 1:(N-1)) {
    C[i+1] = C[i] +
      (u_avg[i] * dt * beta_u[day[i]] * 0.0001) +
      (0.00001 * kz * dt * ((Cs[i] - C[i]) / z)) -
      (R_raw[filter[i]] * dt);
  }
}
model {
  sigma ~ cauchy(0, 1);
  init ~ normal(0, 0.25);
  R_mu ~ normal(0, 1);
  R_tau ~ normal(0, 1);
  R_tilde ~ normal(0, 1);
  u_mu ~ normal(0, 1);
  u_tau ~ cauchy(0, 1);
  u_tilde ~ normal(0, 1);
  z_hat ~ normal(z, 2);                // observed thickness informs z
  kz ~ normal(4.5, 1);
  C_hat ~ normal(C, sigma);
}
generated quantities {
  vector[N] R;                         // mmol m-2 s-1
  vector[N-1] dC;                      // mmol m-3
  vector[N] Mz;                        // mmol m-2 s-1
  vector[N] adv_u;                     // mmol m-2 d-1
  for (i in 1:(N-1)) {
    dC[i] = C[i+1] - C[i];
  }
  for (i in 1:N) {
    R[i] = R_raw[filter[i]];                      // R flux = mmol m-2 s-1
    Mz[i] = 0.00001 * kz * ((Cs[i] - C[i]) / z);  // flux = mmol m-3 s-1
    adv_u[i] = u[i] * beta_u[day[i]] * 0.0001;    // mmol m-3 s-1
  }
}
and the DGF (data-generating function) in R:
# test of adv_fast.stan
require(ggplot2)
require(data.table)
require(rstan)  # for stan_model() and sampling() below
out = list()
ti = 0:((60*60*24*30)+1) # month of data
dat = data.table(
  time = ti,
  hour = ti / 60 / 60
)
freq = (max(ti) / (6*60*60)) / max(ti)
dat[, spring_neap := (max(ti) / (14*24*60*60)) / max(ti)]
dat[, spring_neap := 0.66 + (0.35 * sin(spring_neap * pi * ti))]
# east-west
dat[, u := 0.7 * sin(freq * pi * ti + (8*60*60))]
dat[, u := u * spring_neap]
dat[, u := u + spline(c(0, 928001, 1100000, 2500000), c(-0.02, 0.01, 0.02, -0.02), xout=time)$y]
dat[, dateTime := as.POSIXct(time, origin = "2015-06-01", tz = "UTC")]
R_spline = spline(as.POSIXct(c("2015-06-01", "2015-06-07", "2015-06-09", "2015-06-15", "2015-06-30"), tz="UTC"),
                  c(0.5, 0.5, 0.1, 0.5, 0.8)/(60*60*24), n=nrow(dat))
dat[, R := R_spline$y]
dat[R < 0, R := 0] # R can't be negative
# gradients, [mmol m-4]
dat[, beta_u := spline(as.POSIXct(c("2015-06-01", "2015-06-10", "2015-06-20"), tz="UTC"), c(2e-04, 1e-04, 2e-04), n=nrow(dat))$y]
dat$Cs = 320
initial_DO2 = 300
#### x
# x = dat[second(dateTime) == 0]
# x = dat[second(dateTime) == 0 & minute(dateTime) %in% c(0,10, 20, 30, 40, 50)]
x = dat[second(dateTime) == 0 & minute(dateTime) %in% c(0, 30)] # subsample to 30-minute intervals
x[, dt := unique(diff(time))]
true = vector(length=nrow(x))
true[1] = initial_DO2
x$k = 4.5e-5 # m^2 s-1 (0.45 cm^2 s-1 from palmer2008)
x$z = 15
# loop to generate true C
for (i in 2:nrow(x)) {
  u_avg = (x$u[i] + x$u[i-1]) / 2                                # m s-1
  A_u = u_avg * x$dt[i] * ((x$beta_u[i-1] + x$beta_u[i]) * 0.5)  # average of adjacent beta values (cf. u_avg)
  Mz = x$k[i-1] * x$dt[i-1] * ((x$Cs[i-1] - true[i-1]) / x$z[i-1])
  R_ = x$R[i-1] * x$dt[i]                                        # mmol m-3
  true[i] = true[i-1] + (A_u + Mz) - R_
}
x[, true := true]
x[, C := rnorm(.N, true, 0.25)]
x[, day := 1 + yday(dateTime) - min(yday(dateTime))]
x = x[1:(.N - 1)] # drop the final sample (start of an extra day)
ggplot(x) +
  geom_line(aes(dateTime, true), color="red", alpha=0.5, size=1) +
  geom_line(aes(dateTime, C), color="blue")
ggplot(x) + geom_line(aes(dateTime, u), color="red", alpha=0.5, size=1)
ggplot(x) + geom_line(aes(dateTime, R), color="red", alpha=0.5, size=1)
ggplot(x) + geom_line(aes(dateTime, beta_u), color="red", alpha=0.5, size=1)
x[, filter := cut(day, max(day)/3, labels=FALSE)] # 3-day windows for R
dx = list(
  N = nrow(x),
  day = x$day,
  N_day = max(x$day),
  filter = x$filter,
  N_filter = max(x$filter),
  C_hat = x$C, # measured C
  Cs = x$Cs,
  z_hat = 15,
  u = x$u,
  dt = 1800
)
adv_fast = stan_model("adv_fast.stan")
fit = sampling(adv_fast, data=dx, chains=1, iter=2000, control=list(max_treedepth=12, adapt_delta=0.90))
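For reference, this is roughly how I’ve been checking the recovery of R against the DGF (post_R and R_hat are just local names I’m using here, nothing special):
post_R = rstan::extract(fit, pars = "R")$R   # draws x N matrix from the generated quantities
x[, R_hat := colMeans(post_R)]               # posterior mean of R at each time step
ggplot(x) +
  geom_line(aes(dateTime, R), color = "red") +      # true R from the DGF
  geom_line(aes(dateTime, R_hat), color = "blue")   # recovered R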