# Using estimated variables as index

We are trying to fit a model which requires using estimated variables as indices. We fit our model to ratings of different stimuli where the rating for each stimulus depends on the inferred value of the stimulus and some noise (sigma_U). As stimuli are very similar, we include a degree of uncertainty about which stimulus was actually seen. This uncertainty is known and defined as a categorical distribution over all stimuli, with highest probability for the stimulus that was actually shown. The inferred value on a given trial depends on the stimulus the participant believes he/she is seeing (pidx) and not the one shown (sidx).

From the STAN documentation and forum we understand that â€śparameters or transformed parameters cannot be integer or integer arrayâ€ť and therefore resorted to approximate the categorical distribution with a truncated normal distribution, N(sidx, sds).
The sampled value however needs to be used to index the stimulus value, for which we would have to transform the sampled parameter to an integer. According to the forum and manual this is not possible for parameters and in general not advisable.

To allow for indexing we use if else statements as a workaround to approximate an integer i.e. if estimated parameter >1.5 and <2.5 then select the value corresponding to stimulus 2. This allows us to run the model but results in serious fitting issues. Is there another way to implement this?

Here a minimal example

``````data {
int<lower=1> nStim; // number of stimuli
int<lower=1> nTrial; // number of trials
real<lower=0,upper=1> rating[nTrial]; // participant ratings
int<lower=1, upper=9> sidx[nTrial]; //stimulus shown;
real sds; //approximated SD of normal distr.
}

parameters {
real<lower=0> sigma_U; //updating noise
real<lower=0> sigma_C; //choice noise
real<lower=0, upper=1> value[nTrial, nStim]; // estimated value for each trial based on inferred stimulus
real<lower=1,upper=9> pidx[nTrial]; // stimulus participant thinks he/she sees
real<lower=0> vPred[nTrial]; //predicted rating on each trial
}

model {
for (n in 1:nTrial-1) {
pidx[n] ~ normal(sidx[n],sds) T[1,9]; // draw from possible stimuli
if (pidx[n]<= 1.5) {
value[n, 1] ~ normal((value[n, 1], sigma_U)
vPred[n] ~ normal(value[n,1],sigma_C)
} else if (pidx[n] <= 2.5) {
value[n, 2] ~ normal((value[n, 2], sigma_U)
vPred[n] ~ normal(value[n,2],sigma_C)
â€¦ // continue for all stimuli
}
}
}

``````
1 Like

Bob Carpenter has provided a workaround for promoting reals to integers:

``````functions {
int real_to_int(p) {
int k;
k = 1;
while ((k+1) < floor(p)) {
k = k+1;
}
return k
}
}
``````

But as stated in the thread, this may cause incorrect gradient calculations (the reason Stan doesnâ€™t promote reals to integers), impacting sampling and run times

And, although I donâ€™t know how `data` only containers restrict user-defined functions, you can use `to_int` inside of user-defined functions as long as the container type is declared as `data` only.

How many possible stimuli are there?

There are 9 possible stimuli i.e nStim = 9

1 Like

With 9 possible stimuli, you should estimate this model by marginalizing over the possible stimuli for each observation.