# Incorporating pre-computed grid of models into Stan model

I’m working on a model where the general procedure is looking for the parameters that best minimize the difference between a model based on those parameters and data.

My code is the following:

``````       real get_model(real logage, real zh) {
``````
``````            return -1.0;
}

real chi2_lpdf(matrix obs, real logage, real zh) {
vector[num_elements(obs[:,1])] tmp;
vector[num_elements(obs[:,1])] holder;
real ln_like;

for (i in 1:num_elements(obs[:,1])) {
holder[i] = 1;
}

for (i in 1:num_elements(obs[:,1])) {
tmp[i] = (obs[i,1]-holder[i])^2/obs[i,2];
}

ln_like = -0.5*sum(tmp);
return ln_like;
}
}

data {
int num;
matrix[num, 2] obs;
}

parameters {
real<lower=1.0, upper=1.1> logage;
real<lower=-0.02, upper=0.00> zh;
}

model {
real mflux;
logage ~ normal(0.5, 1);
zh ~ normal(0.5, 1);
obs ~ chi2(logage, zh);
}
``````

As you can see, `get_model` is just a dummy function right now. What it eventually needs to be is a function that can take the `zh` and `logage` parameters, find and return the model associated with those parameters from a pre-computed grid so that it can be used to compare to the observations (`obs` variable). I’ve searched through the manual looking for any information on how I might go about this but I didn’t find anything. Has anyone dealt with a similar problem? I’d appreciate any suggestions or if there’s a tutorial you can point me to. Thank you!

If the difference is measured in log density, you’re talking about maximum likelihood estimation.

To answer your question, you can pass in any kind of data you want as data then pass it to a function as one or more arguments.

Usually, we just build a statistical model, which is why there’s nothing in the manual. I don’t quite see what your’e doing with that function.

Thanks for replying, Bob. I’m new to both Stan and C++ so I appreciate any help as I get started.

To answer your question, you can pass in any kind of data you want as data then pass it to a function as one or more arguments.

Usually, we just build a statistical model, which is why there’s nothing in the manual. I don’t quite see what your’e doing with that function.

I’ve worked more on the model I’m trying to implement and so hopefully it’s clearer what I’m trying to do –

First I generate a lookup table, where the parameters `zh` and `logage` vary and for each combination of `zh` and `logage` there’s a physically motivated model called `ssp`

def generate_lookup_table():

``````logzgrid   = [-1.0, -0.1]
logagegrid = np.log10([1.0, 11.0])

sspgrid    = [ssp1[:,1], ssp2[:,1], ssp3[:,1], ssp4[:,1]]

return [logzgrid, logagegrid, sspgrid]
``````

I pass those grids and the data I’m trying to fit to Stan,

``````mock = np.loadtxt('mock.dat')
lookup = generate_lookup_table()

norm_data = {
'num':        len(mock[:,0]),
'obs':        mock[:,1:3],
'logzgrid':   lookup[0],
'logagegrid': lookup[1],
'sspgrid':    lookup[2]
}
``````

I’m trying to find the `logage' and`zh` that best describes the data,

``````    parameters {
real<lower=1.0, upper=1.1> logage;
real<lower=-0.02, upper=0.01> zh;
}

model {
logage ~ normal(0.5, 1);
zh ~ normal(0.5, 1);
obs ~ chi2(logzgrid, logagegrid,
sspgrid, num, logage, zh);
}
``````

To find the best values for `logage` and `zh` I need to retrieve the `ssp` that most closely matches the `logage` and `zh` at any given step.

``````    functions {

vector get_ssp(vector logzgrid, vector logagegrid,
matrix sspgrid, int num,
real logage, real zh) {

vector[num] ssp;

/*
Find the ssp for a given logage and zh -
look up which elements in the logzgrid
and logagegrid are the closest match to
the given logage and zh and return the
SSP associated with those values.
*/

ssp =
return ssp;
}

real chi2_lpdf(matrix obs, vector logzgrid,
vector logagegrid, matrix sspgrid,
int num, real logage, real zh) {

vector[num_elements(obs[:,1])] tmp;
real ln_like;
vector[num] ssp;

ssp = get_ssp(logzgrid, logagegrid,
sspgrid, num, logage, zh);

for (i in 1:num) {
tmp[i] = (obs[i,1]-ssp[i])^2/obs[i,2];
}

ln_like = -0.5*sum(tmp);
return ln_like;
}
}
``````

What I really need help with, is how to do the lookup in the `get_ssp` function. Do you have any suggestions about how I can retrieve the indices in the `logzgrid` and `logagegrid` arrays for the input `logage` and `zh` values such that I can use them to find the `ssp` associated with those values?

I’m still just not sure why you’re trying to do this kind of optiization within Stan. Usually, stan programs define parametric log density function (the parameter block defines the parameters). We can then optimize the parameters w.r.t. the data.

I’m not sure how you define closest. If there’s a real value, you can use the built-in sorting functions, or if you only need the best, just search. If I want the index of the max element of an array, I’d use:

``````// assumes size(y) > 0
int index_of_max(real[] y) {
int max_idx = 1;
for (n in 2:size(y))
if (y[n] > y[max_idx])
max_idx = n;
return max_idx;
}
``````

Works the same way for other comparisons than `>`. You may want to check that there’s at least one element in the array if you want to reuse this.