Hey all!
Problem Description
I have a dataset of measurements of biomass within an aggregation (of birds, but this doesn’t matter all that much) at multiple occasions of such aggregations throughout a season (a couple of months every year). I want to model the seasonal pattern of biomass change through time (measured in days) using an hierarchical Bayesian model.
The real biomass x_{ij} of each aggregation i will be a latent variable obtained from the measurement y_i (this conversion is done by using priors from a log-normal calibration model). I use j to index the days.
I need to obtain the daily sum, say X_{j} of real biomass (the latent x_{ij}), in order to fit a model (I’m going to fit the derivative of a logistic curve) to the daily sum of biomass and get the seasonal curve.
The first issue I’m facing is that I have a varying number of number of aggregations each day. On some days I might have one aggregation, and on others I might have ten.
The second issue is that my calibration model is a log-normal. I’m fitting a linear regression between \log(x_i) and \log(y_i). This means that I need to take the power before making the sum.
I tried to implement this using the code below (see end of post), but it returns the following error:
> DIAGNOSTIC(S) FROM PARSER
> Info:
> Let-hand side of sampling statement may contain a non-linear
> transform of a parameter or local variable. If it does, you need
> to include a target += statement with the log of absolute
> determinant of the Jacobian of the transform.
> Left-hand-side of sampling statement:
> X[j] ~ normal(...)
The Questions
I think this has something to do with a change of variables, and needing correct it with a Jacobian. But is this the correct way to do model the sum of latent variables? Can anyone think of another solution to this? If I’m on the right track, would you have a suggestion of where I can learn more about how to implement the variable change? I tried reading the User Guide, but I might need something a little more in-depth.
The Code
data {
int<lower=0> n_obs;
int<lower=0> n_days;
vector[n_obs] log_y;
// Vector containing the julian day of each observation:
int<lower=0> julian_day[n_obs];
// This vector has the label. If the label is 1, it will use the calibration model to fix the
// measurements. If the label is 0, no correction is needed.
int<lower=0> label[n_obs];
real alpha0_prior_mean;
real<lower=0> alpha0_prior_sd;
real alpha1_prior_mean;
real<lower=0> alpha1_prior_sd;
real<lower=0> sigma_prior_a;
real<lower=0> sigma_prior_b;
} // end data block
parameters {
// Parameters for correction of measurements:
real <lower=0> sigma_obs;
real beta11;
real beta12;
real<lower=0> sigma_process;
// Log-scale latent variable:
vector[n_obs] log_x;
// Logistic parameters:
real L;
real k;
real mid_point;
} // end parameters block
transformed parameters{
vector[n_obs] x;
vector[n_days] X;
// Transform latent variable to linear scale:
for(i in 1:n_obs){
x[i] = 10^log_x[i];
}
// This is my attempt at summing the biomasses on each day.
// It doesn't work (see text).
for(j in 1:n_days){
for(i in 1:n_obs){
if(julian_day[i] == j){
X[j] = X[j] + x[i];
}
}
}
}
model {
matrix[2,2] betas;
// Inherit priors from calibration model:
sigma_obs ~ gamma(sigma_prior_a, sigma_prior_b);
beta11 ~ normal(alpha0_prior_mean, alpha0_prior_sd);
beta12 ~ normal(alpha1_prior_mean, alpha1_prior_sd);
// Dummy prior for now, will change to something better at some point:
sigma_process ~ uniform(0.001, 100000);
betas[1,1] = beta11;
betas[1,2] = beta12;
betas[2,1] = 0;
betas[2,2] = 1;
for(i in 1:n_obs){
log_y[i] ~ normal(betas[label[i],]*log_x[i], sigma_obs);
}
L ~ normal(10^6, 10^4);
mid_point ~ normal(75, 20);
k ~ normal(1, 10);
//Likelihood
for(j in 1:n_days){
X[j] ~ normal( L*k * exp(k*(julian_day[j] - mid_point)) / (1 + exp(k*(julian_day[j] - mid_point)))^2, sigma_process);
}
} // end model block