Hi All,
A stan newbie here. I am trying to fit a linear model with measurement error in the response variable which includes some data transformation (radiocarbon calibration). This is just an initial test for something more complex that I am planning to work on, but I am already struggling with performance issues at this stage…
I have a model written in nimble (see attached nimbleVersion.R
) and wanted to start developing its stan version (see below, also the attached stanVersion.R
). The nimble model works fine and takes five minutes for getting a decent mixing, and a correct inference of the true parameters. The stan version is instead stuck forever at
Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
Stan model:
// Interpolation Function
functions{
real linear_interpolation_v(real x, vector x_pred, vector y_pred){
int K = rows(x_pred);
vector[K] deltas = x - x_pred;
real ans;
real t;
real w;
int i;
real x1;
real x2;
real y1;
real y2;
if(x>x_pred[1] || x<x_pred[K]) reject("x is outside of the x_pred grid!");
if(rows(y_pred) != K) reject("x_pred and y_pred aren not of the same size");
//this is which.min()
i = sort_indices_asc(fabs(deltas))[1];
if(deltas[i]<=0) i -= 1;
ans = y_pred[i];
x1 = x_pred[i];
x2 = x_pred[i + 1];
y1 = y_pred[i];
y2 = y_pred[i + 1];
ans = y1 + (y2-y1) * (x-x1)/(x2-x1);
t = (x-x1)/(x2-x1);
w = 1-t;
ans = w*y1 + (1-w)*y2;
return ans;
}
} // end functions block
data {
int<lower=1> N; //Number of 14C dates
vector[N] cra; //Observed CRA
vector[N] cra_error; // Observed Calibration Error
vector[N] d; //independent variable (distance from source)
int<lower=0> tau; // Number of raw in the calibration curve
vector[tau] calBP; //Calibration curve
vector[tau] c14BP; //Calibration curve
vector[tau] c14_err; //Calibration curve
real prior_intercept_mean;
real prior_intercept_sd;
}
parameters {
vector<lower=0,upper=55000>[N] theta;
real alpha; // Intercept
real beta; // Slope
real <lower=0> sigma_model; //Regression Error
}
model {
// Declare temporary variables:
vector[N] sigma_curve;
vector[N] mu_date;
vector[N] sigma_date;
// Priors:
alpha ~ normal(prior_intercept_mean,prior_intercept_sd);
beta ~ normal(0,0.5);
sigma_model ~ exponential(0.001);
// Core Model:
for (i in 1:N)
{
theta[i] ~ normal(alpha+beta*d[i],sigma_model);
sigma_curve[i] = linear_interpolation_v(theta[i],calBP,c14_err);
mu_date[i] = linear_interpolation_v(theta[i],calBP,c14BP);
sigma_date[i] = sqrt (cra_error[i]^2+sigma_curve[i]^2);
cra[i] ~ normal(mu_date[i],sigma_date[i]);
}
}
Some simulated data:
library(rcarbon)
library(nimbleCarbon)
set.seed(123)
alpha = 3000
beta = -2
sigma = 200
N = 100
dist = runif(N,0,1000)
calendar_date = round(rnorm(N,alpha+beta*dist,sd=sigma))
cra = round(uncalibrate(calendar_date)$ccCRA)
cra_error = rep(20,N)
data(intcal20)
x.rcarbon = calibrate(cra,cra_error)
m=medCal(x.rcarbon)
d = list(N=N,
cra = cra,
cra_error=cra_error,
d = dist,
tau=nrow(intcal20),
calBP=intcal20$CalBP,
c14BP=intcal20$C14Age,
c14_err=intcal20$C14Age.sigma,
a=min(intcal20$CalBP),
b=max(intcal20$CalBP),
prior_intercept_mean=3000,
prior_intercept_sd=200)
And the model call + init:
init=function(...){list(alpha=3000,beta=0,sigma_model=200,theta=m)}
fit <- stan(file = "model.stan",data=d,init=init,chain=2)
Clearly there is something wrong - I know for loops tend to slow things down, but the difference in performance between nimble and stan seems to indicate that there is something else that I am missing… Any help or tips are greatly appreciated!
Enrico
stanVersion.R (2.9 KB)
nimbleVersion.R (1.8 KB)