# RStan Error: "Exception: variable does not exist... failed to create the sampler; sampling not done"

Hey everyone––I’m having trouble fitting a Stan model using RStan. I feel like there’s an issue with how I’ve formatted my data. My data was originally in a data.frame, as it was generated using a mechanistic model. I re-formatted my data as a list, but I’m new to this, so there’s a fair chance I messed up. Here’s the story so far (P.S. I’m sorry if this post isn’t formatted properly!):

``````library(rstan)
library(gdata)
library(bayesplot)

write("// Stan Model; code the actual model in Stan
functions{
real[] dZ_dt(real t, // Time
real[] Z, // System state {Parasite, Host}
real[] alpha, // Parameters
real[] x_r, // Unused data (?)
int[] x_i){
real P = Z[1]; // System state coded as an array, such that Z = (P,H)
real H = Z[2];

real r = alpha[1]; // Parameters of the system, in order they appear
real O = alpha[2];
real h = alpha[3];
real b = alpha[4];
real c = alpha[5];
real u = alpha[6];

real dP_dt = P*r - H*(O*P/1 + O*P*h); // Deterministic mechanistic model
real dH_dt = b + H*(c*(O*P/1 + O*P*h)-u);
return{dP_dt,dH_dt}; // Return the system state
}
}

data{
int<lower=0>N; // Define N as non-negative integer
real ts[N]; // Assigns time points to N (I think?)
real y_init[2];
real<lower=0>y[N,2]; // Define y as real and non-negative
}

parameters{
real<lower=0>alpha[6]; // Make all items in alpha non-neg
real<lower=0>Z_init[2]; // Initial population size non-neg
real<lower=0>sigma[2]; // Error term non-negative
}

transformed parameters{
real Z[N,2]
= integrate_ode_rk45(dZ_dt,Z_init,0,ts,alpha,rep_array(0.0,0),rep_array(0,0),1e-6,1e-5,2000);
}

model{
alpha[{1}]~uniform(0,10);
alpha[{2}]~uniform(0,1);
alpha[{3}]~uniform(0,60);
alpha[{4}]~uniform(0,100);
alpha[{5}]~uniform(0,1);
alpha[{6}]~uniform(0,1);
sigma~lognormal(-1,1);
Z_init~lognormal(log(10),1);
for (k in 1:2){
y_init[k]~lognormal(log(Z_init[k]),sigma[k]);
y[ ,k]~lognormal(log(Z[ ,k]),sigma[k]);
}
}",
"Stan_Model_TypeII.stan")

stanc("Stan_Model_TypeII.stan") # To check that we wrote a file (I think we did?)

Stan_Model_TypeII <- stan_model("Stan_Model_TypeII.stan")

# Squeezing the data into a form that Stan gets
N <- length(Stoch_Data_TypeII\$t)-1 # N is 1952 which makes sense bc length of DF is 1953
ts <- 1:N
y_init <- c(Stoch_Data_TypeII\$P[1], Stoch_Data_TypeII\$H[1]) # Initial states, P = 1; H = 18
y <- as.matrix(Stoch_Data_TypeII[2:(N+1),2:3])
y <- cbind(y[,2],y[,1]); # This worked, sick; where y[,1] is H, and y[,2] is P
Stan_StochData_TypeII <- list(N,ts,y_init,y)

# Fitting the data to the model
fit <- stan(file = "Stan_Model_TypeII.stan",
data = Stan_StochData_TypeII,
warmup = 500, iter = 1000, chains = 2, cores = 1, thin = 1,
algorithm = "HMC",
diagnostic_file = "TypeII_Fitting_Output.R",
seed = 1996, verbose = TRUE)
``````

And here’s the progress made on the model:

``````TRANSLATING MODEL 'Stan_Model_TypeII' FROM Stan CODE TO C++ CODE NOW.
successful in parsing the Stan model 'Stan_Model_TypeII'.

CHECKING DATA AND PREPROCESSING FOR MODEL 'Stan_Model_TypeII' NOW.

COMPILING MODEL 'Stan_Model_TypeII' NOW.

STARTING SAMPLER FOR MODEL 'Stan_Model_TypeII' NOW.
``````

And here’s what the error code reads:

``````Error in new_CppObject_xp(fields\$.module, fields\$.pointer, ...) :
Exception: variable does not exist; processing stage=data initialization; variable name=N; base type=int  (in 'model1b35f2189f6_Stan_Model_TypeII' at line 25)

failed to create the sampler; sampling not done``````

Take a look at this section:

``````data{
int<lower=0>N; // Define N as non-negative integer
real ts[N]; // Assigns time points to N (I think?)
real y_init[2];
real<lower=0>y[N,2]; // Define y as real and non-negative
}
``````

The data section tells you what Stan is expecting as named elements in the list. So, you probably have a data frame that I’ll call `dat` with three columns, `ts` and two other columns I’ll call `y1` and `y2`. I’m also guessing that `y_init` refers to the first entry in `y1` and `y2`. In that case, you’d want to construct a list as:

``````data_for_stan = list(
N = nrow(dat) - 1 #minus 1 bc we use the first row for y_init
, ts = dat\$ts[2:nrow(dat)]
, y_init = c(dat\$y1[1], dat\$y2[1])
, y  = as.matrix( cbind(dat\$y1,dat\$y2))[2:nrow(dat)]
)
``````

The elements of the data list should be named, such as in `list(N=N, ts=ts, etc...)`.

This worked! Thank you so much!

1 Like