# 'Observing the observer' models

Hello!

I’d like to use an ‘observe the observer’ model, in which a subject has an internal model of how stimuli might be generated (the perceptual model), which they invert; then, based on the resulting posterior p(latent variables | perceptual data), they respond according to a response model.
From the experimenter’s perspective, given the stimuli, the subject’s responses, and candidate perceptual and response models that the subject might be using (with their perceptual and response parameters), I want to obtain the posterior over the subject’s perceptual and response parameters.

Does Stan have a way of specifying this nested inversion? Or do I need to provide an analytical formula for the subject’s posterior p(latent | perceived stimuli)?

Thank you!
Elena

I am not familiar with this kind of model, but IMHO it should be possible in Stan. In general, as long as you can write a simulator that generates synthetic data according to your model, you should be able to write the model down in Stan. The only exception is discrete parameters (unobserved discrete variables), which Stan cannot handle directly (though there are tricks, namely marginalizing them out, to get around this).
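To make the simulator point concrete, here is a minimal Python sketch (not Stan) of one hypothetical observe-the-observer trial, assuming a conjugate-normal perceptual model and a noisy report of the posterior mean; all function names and parameter values here are invented for illustration:

```python
import random

def simulate_trial(stimulus, obs_noise_sd, prior_mean, prior_sd,
                   response_noise_sd, rng):
    """One hypothetical trial: noisy percept -> conjugate-normal
    inversion of the perceptual model -> noisy report of the
    resulting posterior mean (the response model)."""
    percept = rng.gauss(stimulus, obs_noise_sd)      # noisy perception
    prior_prec = 1.0 / prior_sd ** 2
    like_prec = 1.0 / obs_noise_sd ** 2
    # Subject's posterior mean over the latent stimulus value
    post_mean = (prior_prec * prior_mean + like_prec * percept) / (prior_prec + like_prec)
    return rng.gauss(post_mean, response_noise_sd)   # noisy report

rng = random.Random(1)
responses = [simulate_trial(s, 1.0, 0.0, 2.0, 0.5, rng) for s in [5, 7, 10, 3]]
```

If you can write this forward simulation down, the corresponding Stan model is the same generative story written as an (unnormalized) log density.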

Not sure if I understood completely, but I think it is not possible in Stan unless you have an explicit formula for the posterior distribution of the inner model (since the inner model’s normalizing constant would probably depend on the parameters).

I understood the problem to be something like this (since this is probably wrong, you might get better answers if you explicate your model in equations):

Inner model:

$$
p(\text{latent variables} \mid \text{perceptual data}, \text{parameters}) \\
\propto p(\text{perceptual data} \mid \text{latent variables}, \text{parameters})\, p(\text{latent variables})
$$

Outer model (contains the inner model as nested):

$$
p(\text{parameters}, \text{latent variables} \mid \text{response}, \text{perceptual data}) \\
\propto p(\text{parameters})\, p(\text{response} \mid \text{latent variables}, \text{parameters}) \\
\;\;\times\, p(\text{latent variables} \mid \text{perceptual data}, \text{parameters})
$$

Stan requires you to be able to write down the unnormalized log probability of the outer model explicitly (where the unknown normalizing constant cannot depend on any of the parameters to be inferred).
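To illustrate why a closed-form inner posterior makes this tractable: if the subject’s perceptual model is conjugate (here, a normal prior and normal likelihood; all numbers are arbitrary), the inversion is an explicit formula that could be written directly into the outer Stan model. A hypothetical Python sketch, cross-checked against a brute-force grid inversion of the same inner model:

```python
import math

# Hypothetical inner model: latent ~ Normal(m0, s0),
# percept | latent ~ Normal(latent, s). With conjugacy the
# subject's posterior is available in closed form.
def inner_posterior(percept, m0, s0, s):
    prec = 1 / s0**2 + 1 / s**2
    mean = (m0 / s0**2 + percept / s**2) / prec
    return mean, math.sqrt(1 / prec)

# Brute-force check: numerically invert the same inner model on a grid
def grid_posterior_mean(percept, m0, s0, s, lo=-20.0, hi=20.0, n=4001):
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    w = [math.exp(-0.5 * ((x - m0) / s0)**2 - 0.5 * ((percept - x) / s)**2)
         for x in xs]
    z = sum(w)
    return sum(x * wi for x, wi in zip(xs, w)) / z

m_closed, _ = inner_posterior(2.0, 0.0, 1.0, 1.0)
m_grid = grid_posterior_mean(2.0, 0.0, 1.0, 1.0)
```

Without such a closed form, the inner normalizing constant depends on the parameters, which is exactly what Stan’s unnormalized-density requirement rules out.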

I’m not sure if there is any robust probabilistic programming language that can handle this well (Anglican and Church might be able to do some… I haven’t tried). This might be a useful reference:
Tom Rainforth, Nesting Probabilistic Programs, UAI2018, http://auai.org/uai2018/proceedings/papers/92.pdf

I just came across this thread. Let’s say the experiment is that a person sees a series of values (e.g. 5, 7, 10, 3, …) and after every example they tell me what they think the mean of the series is. If I wrote Stan code for a ‘Bayesian observer’ model, I would put:

```stan
data {
  int<lower=1> nsamp;
  real y[nsamp];  // 'data' is a reserved word in Stan, so the samples are called y
}
parameters {
  real mu;
  real<lower=0> stdev;
}
model {
  mu ~ normal(0, 10);
  stdev ~ normal(0, 10);
  y ~ normal(mu, stdev);
}
```


And to get at the internal state of the observer after each sample (i.e. their estimates of mu and stdev), one could run that model repeatedly on growing prefixes of the data (sample 1 only, then samples 1 and 2, then samples 1 to 3, etc.), or give mu and stdev one entry per sample seen and change the model code accordingly:

```stan
data {
  int<lower=1> nsamp;
  real y[nsamp];
}
parameters {
  real mu[nsamp];
  real<lower=0> stdev[nsamp];
}
model {
  mu ~ normal(0, 10);
  stdev ~ normal(0, 10);
  for (imaxSamp in 1:nsamp) {
    for (isamp in 1:imaxSamp) {
      y[isamp] ~ normal(mu[imaxSamp], stdev[imaxSamp]);
    }
  }
}
```
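As a sanity check outside Stan, the ‘growing prefixes’ idea can be traced in closed form if one additionally assumes the observation sd is known (a simplification relative to the model above, where stdev is inferred). A hypothetical Python sketch with a Normal(0, 10) prior on mu:

```python
# Conjugate posterior mean of mu after seeing the first k samples,
# assuming known observation sd (illustrative simplification).
def posterior_after(samples, prior_mean=0.0, prior_sd=10.0, obs_sd=1.0):
    prec = 1 / prior_sd**2 + len(samples) / obs_sd**2
    return (prior_mean / prior_sd**2 + sum(samples) / obs_sd**2) / prec

data = [5, 7, 10, 3]
# One posterior per prefix: sample 1, samples 1-2, samples 1-3, ...
trajectory = [posterior_after(data[:k]) for k in range(1, len(data) + 1)]
```

Each entry of `trajectory` is what the per-prefix (or per-entry) Stan fit above would estimate for the corresponding mu.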


In contrast, another common model (a ‘reinforcement learning’ model) would express the learning, i.e. combining the information across all the samples, as:

```stan
mu[1] = 0.5;
for (isamp in 2:nsamp) {
  mu[isamp] = mu[isamp-1] + learning_rate * (data[isamp] - mu[isamp-1]);
}
```


So the conceptual difference between the models is that learning_rate does not scale with the number of samples already observed, whereas in the Bayesian observer model later samples produce a smaller shift in the estimated mu than earlier samples.
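This difference can be made explicit: with a flat prior, the Bayesian running-mean update is itself a delta rule whose learning rate shrinks as 1/n, while the reinforcement-learning update keeps it constant. A Python sketch (values invented for illustration):

```python
def bayesian_mean_updates(xs):
    # Flat-prior running mean: effective learning rate is 1/n,
    # so later samples move the estimate less and less.
    mu, out = 0.0, []
    for n, x in enumerate(xs, start=1):
        mu = mu + (1.0 / n) * (x - mu)
        out.append(mu)
    return out

def rl_updates(xs, lr=0.3, mu0=0.0):
    # Delta rule: the learning rate stays constant regardless of n.
    mu, out = mu0, []
    for x in xs:
        mu = mu + lr * (x - mu)
        out.append(mu)
    return out

xs = [5, 7, 10, 3]
bayes = bayesian_mean_updates(xs)  # converges to the sample mean
rl = rl_updates(xs)                # keeps chasing recent samples
```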

For the ‘Reinforcement learning’ model I would know how to fit it in Stan: learning_rate and some kind of ‘noise’ in the rating ability would be the only free parameters:

```stan
data {
  int<lower=1> nsamp;
  real y[nsamp];  // 'data' is a reserved word in Stan
  real participant_rating[nsamp];
}
parameters {
  real<lower=0, upper=1> learning_rate;
  real<lower=0> rating_noise;  // participants report their true belief with some noise
}
model {
  real mu[nsamp];
  mu[1] = 0.5;
  for (isamp in 2:nsamp) {
    mu[isamp] = mu[isamp-1] + learning_rate * (y[isamp] - mu[isamp-1]);
  }
  participant_rating ~ normal(mu, rating_noise);
}
```
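Following the earlier point that being able to simulate a model is a good test of whether it can be written in Stan, here is a hypothetical Python simulator for that generative process (fixed initial belief of 0.5, delta-rule updates from the second sample onward, noisy reports):

```python
import random

def simulate_rl_ratings(data, learning_rate, rating_noise, rng):
    # Belief trajectory: mu[1] = 0.5 fixed, then delta-rule updates.
    mus = [0.5]
    for x in data[1:]:
        mus.append(mus[-1] + learning_rate * (x - mus[-1]))
    # Each rating is a noisy report of the current belief.
    return [rng.gauss(mu, rating_noise) for mu in mus]

rng = random.Random(0)
ratings = simulate_rl_ratings([5, 7, 10, 3], learning_rate=0.3,
                              rating_noise=0.1, rng=rng)
```

Simulating with known parameters and then recovering them with the Stan model above is a useful check before fitting real participants.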


But how would I fit the ‘Bayesian observer’ model? I was wondering what would actually be a parameter that could differ between participants and produce something similar to the learning_rate. I think that would be the prior belief about stdev (just focusing on this as an example; it could also be the other parameters). Here is an attempt at a model, but I think it’s not right (basically I don’t think I should have ‘stdev’ as a parameter, and I’m not sure that ‘participant_rating’ is in the right place):

```stan
data {
  int<lower=1> nsamp;
  real y[nsamp];  // 'data' is a reserved word in Stan
  real participant_rating[nsamp];
}
parameters {
  real<lower=0> stdev[nsamp];
  real<lower=0> stdev_prior;
}
model {
  stdev ~ normal(0, stdev_prior);
  for (imaxSamp in 1:nsamp) {
    for (isamp in 1:imaxSamp) {
      y[isamp] ~ normal(participant_rating[isamp], stdev[imaxSamp]);
    }
  }
}
```
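On the question of what plays the role of learning_rate in a Bayesian observer: in a conjugate-normal model with known observation sd, the prior acts like a number of pseudo-observations, so a tighter (more confident) prior means slower updating. This supports the intuition that a prior-width parameter (like stdev_prior above) is the natural per-participant analogue of learning_rate. A hypothetical Python sketch:

```python
def bayes_trajectory(xs, prior_mean=0.0, prior_strength=2.0):
    # Conjugate normal with known observation sd: the prior is worth
    # 'prior_strength' pseudo-observations, so the effective learning
    # rate on trial n is 1 / (n + prior_strength).
    mu, out = prior_mean, []
    for n, x in enumerate(xs, start=1):
        mu = mu + (x - mu) / (n + prior_strength)
        out.append(mu)
    return out

xs = [5, 7, 10, 3]
confident = bayes_trajectory(xs, prior_strength=10.0)  # slow learner
vague = bayes_trajectory(xs, prior_strength=0.1)       # fast learner
```

Two participants who differ only in prior confidence will thus show different apparent learning rates, which is the kind of between-participant parameter the question is after.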