Hi, I am attempting to implement a custom error model based on the model from
Rüger, N., Berger, U., Hubbell, S. P., Vieilledent, G., & Condit, R. (2011). Growth strategies of tropical tree species: disentangling light and size effects. PloS one, 6(9), e25330.
The underlying distribution is
y_i \sim \mathcal{N} (\hat{y}_i, \sigma_i),
where \hat{y} is the model’s predicted size,
\sigma_i = \sigma_{1a} + \sigma_{1b}\hat{y}_i + I_i\sigma_2
is a combination of size-dependent and size-independent error, and I_i\sim Bern(p_e) is the indicator function of seeing a ‘large’ error.
The (hopefully minimal) stan code I have built is
functions{
// Custom likelihood
real ruger_error_lpdf(real y,
real y_hat,
real sigma_a,
real sigma_b,
real sigma_large,
real p_large){
real sigma_small;
real lpdf;
sigma_small = sigma_a + y_hat * sigma_b;
lpdf = normal_lpdf(y | y_hat, sigma_small) * (1.0 - p_large) +
normal_lpdf(y | y_hat, sigma_large) * p_large;
return lpdf;
}
}
// Data structure.
data {
int N;
real y_0_obs;
real y[N];
int index[N];
}
parameters {
//Global level
real<lower=0> global_sigma_a;
real<lower=0> global_sigma_b;
real<lower=0> global_sigma_large;
real<lower=0, upper=1> global_p_large;
real y_0;
real beta_1;
}
// Modeling a single variable that is changing over time with constant rate.
model {
real y_hat[N];
for(i in 1:N-1){
if(index[i]==1){//Fits the first size
y_hat[i] = y_0;
}
y_hat[i+1] = y_hat[i] + beta_1;
}
// Likelihood
y ~ ruger_error(y_hat,
global_sigma_a,
global_sigma_b,
global_sigma_large,
global_p_large);
//Priors
y_0 ~ ruger_error(y_0_obs,
global_sigma_a,
global_sigma_b,
global_sigma_large,
global_p_large);
beta_1 ~lognormal(0, 1);
global_sigma_a ~lognormal(exp(0.0927), 0.03);
global_sigma_b ~lognormal(exp(0.00038), 0.03);
global_sigma_large ~lognormal(exp(2.56), 2.5);
global__p_large ~beta(1/2, 1/2);
}
I can generate data and apply a using the following R code:
library(rstan)
#Generate data and apply Ruger error model
beta_1_true <- 1
y_0_true <- 10
y_true <- seq(from=y_0_true, to=25, by=beta_1_true)
#Addint error
y <- y_true + rnorm(length(y_true), mean=0, sd=0.927) +
rnorm(length(y_true), mean=0, sd=0.0038) * y_true +
rbinom(length(y_true), size=1, prob=0.0276) * rnorm(length(y_true), mean=0, sd=2.56)
rstan_data <- list(
N = length(y),
y_0_obs = y[1],
y = y,
index = c(1:length(y))
)
#Fit model
model <- stan_model(file="MinExample.stan")
fit <- sampling(model, data=rstan_data,
iter=100,
chains=2,
cores=2)
I am getting the following error when I attempt to use this:
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
No matches for:
real[ ] ~ ruger_error(real[ ], real, real, real, real)
Available argument signatures for ruger_error:
real ~ ruger_error(real, real, real, real, real)
Real return type required for probability function.
error in 'model12155b8b7879_MinExample' at line 56, column 34
-------------------------------------------------
54: global_sigma_b,
55: global_sigma_large,
56: global_p_large);
^
57:
-------------------------------------------------
Error in stanc(file = file, model_code = model_code, model_name = model_name, :
failed to parse Stan model 'MinExample' due to the above error.
and where implemented in a stan file like 56 indicates an error.
I have attempted to match the custom pdf format from 20.5 User-defined probability functions | Stan User’s Guide,
and it is not clear to me how to have a single real value as the first argument rather than a vector for likelihood, when other likelihoods are implemented by passing the argument for a vector-valued data type.