Using external data as input into STAN?

Hello,

I am very new to STAN and I have been trying to use it to estimate parameters for my system of ODEs (three variables). I am using it via R with rstan package, where I give the model data and initial estimates for paramters.

One of my parameters, pM, depends on two others via a fixed value (not a parameter to be estimated), called Lm. At the moment I have added Lm to the transformed data section as it is not to be estimated, I have my other two paramters to be estimated in the paramters section, and then the dependent paramter in the transformed parameters section. I would like to be able to input the Lm as an input, rather than set it manually. This is because Lm appears in my r code more than once, so if I vary it in R I need to remeber to then edit it manually in STAN too, and that could lead to forgetting and mistakes.

The STAN code currently looks something like this (some bits omitted):

// The input data is a vector ‘y’ of length ‘N’.
data {
int<lower=0> T;
array[T] real t;
vector[3] y0; //initial conditions
array[T] vector[3] y_obs;
}

transformed data {
real t0 = 0;
real L_m = 20;
}

// The parameters accepted by the model.
parameters {
//
real<lower=0> sigma; //2
real kappa;
real<lower=0> p_Amax;
}

transformed parameters {

//dependencies
real p_M = kappa*p_Amax/L_m;
}

model {
sigma ~ normal(0, 1);// 2
kappa ~ beta(3, 2); // 0.86
p_Amax ~ normal(0, 1); // 282

for (n in 1 : T) {
y_obs[n,1] ~ normal(y[n,1],sigma);
y_obs[n,2] ~ normal(y[n,2],sigma);
y_obs[n,3] ~ normal(y[n,3],sigma);
}
}

Hello @rhei17,
you could add it to the data block and provide it as you do for the other inputs. Just shift Lm from the transformed data to the data block and it should work as you intended.
Unless I missed a crucial part in the setup.
Good luck!

1 Like

Hi @danielparthier. Thank you for your suggestion. I tried doing that and I get a message of “Lm not in scope” when saving the .stan file.

The new version:

// The input data is a vector ‘y’ of length ‘N’.
data {
int<lower=0> T;
array[T] real t;
vector[3] y0; //initial conditions
array[T] vector[3] y_obs;
real Lm;
}

transformed data {
real t0 = 0;
}

// The parameters accepted by the model.
parameters {
//
real<lower=0> sigma; //2
real kappa;
real<lower=0> p_Amax;
}

transformed parameters {

//dependencies
real p_M = kappa*p_Amax/L_m;
}

model {
sigma ~ normal(0, 1);// 2
kappa ~ beta(3, 2); // 0.86
p_Amax ~ normal(0, 1); // 282

for (n in 1 : T) {
y_obs[n,1] ~ normal(y[n,1],sigma);
y_obs[n,2] ~ normal(y[n,2],sigma);
y_obs[n,3] ~ normal(y[n,3],sigma);
}
}

Not sure why that is the case…

You have declared a real data variable called Lm in the data block, but reference a different variable L_m in your p_M statement.

1 Like

I think you copied my variable declaration from the post and I took yours from your text without validating what you used in the code. As mentioned by @cgoold you should be fine by just changing it to be the same.

Hi, yes. That is a typo in the post actually. The code in STAN has the correct L_m. This is what I should have posted in the reply above.

// The input data is a vector ‘y’ of length ‘N’.
data {
int<lower=0> T;
array[T] real t;
vector[3] y0; //initial conditions
array[T] vector[3] y_obs;
real L_m;
}

transformed data {
real t0 = 0;
}

// The parameters accepted by the model.
parameters {
//
real<lower=0> sigma; //2
real kappa;
real<lower=0> p_Amax;
}

transformed parameters {

//dependencies
real p_M = kappa*p_Amax/L_m;
}

model {
sigma ~ normal(0, 1);// 2
kappa ~ beta(3, 2); // 0.86
p_Amax ~ normal(0, 1); // 282

for (n in 1 : T) {
y_obs[n,1] ~ normal(y[n,1],sigma);
y_obs[n,2] ~ normal(y[n,2],sigma);
y_obs[n,3] ~ normal(y[n,3],sigma);
}
}

And that is when I get a “not in scope” message.

I tried to compile your code and it fails due to a missing y in the data block. Once I added it it was fine. Is that what is also missing in your version?