Hi everyone,

I am a fairly new Stan convert and would like some help specifying a model. I feel this one should be textbook, as it is a fairly standard analysis in neuroscience, but while I've found several bits of information here and there, I am still struggling to make all of the pieces fit together.

# Background

I am analysing the results of an EEG experiment, during which 30 subjects performed ~100 trials (that lasted 10 seconds each) of a behavioural task.

For each trial and time point, I collect a dependent variable (`score`).

The experimental conditions were:

- subject: `1, ..., 30`
- trial_difficulty: `EASY`, `HARD`
- time of measurement within the trial: `0s`, `2s`, `4s`, `6s`

My data looks like this:

```
subject condition time_point trial_number score
1 EASY 0s 1 +0.53
1 EASY 2s 1 +0.32
1 EASY 4s 1 +0.12
1 EASY 6s 1 +0.05
1 HARD 0s 2 +0.26
1 HARD 2s 2 -0.19
...
```
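To pass rows like these to Stan, the categorical columns need numeric codes. Here is a minimal Python sketch of one possible encoding. It assumes ±1 coding for difficulty (`EASY` = -1, `HARD` = +1), 1-based window indices (`0s` → 1, ..., `6s` → 4) because Stan indexes from 1, and subject ids that already run contiguously from 1. The helper name `encode_rows` and the two lookup tables are hypothetical, chosen only for illustration.

```python
# Hypothetical encoding of the long-format table into Stan-style inputs.
DIFFICULTY_CODE = {"EASY": -1.0, "HARD": 1.0}        # assumed +/-1 coding
WINDOW_INDEX = {"0s": 1, "2s": 2, "4s": 3, "6s": 4}  # 1-based window index


def encode_rows(rows):
    """rows: iterable of (subject, condition, time_point, trial_number, score)."""
    rows = list(rows)
    return {
        "N": len(rows),                                  # number of data points
        "J": len({r[0] for r in rows}),                  # number of distinct subjects
        "K": len(WINDOW_INDEX),                          # number of time windows
        "sub": [r[0] for r in rows],                     # subject id per row
        "win": [WINDOW_INDEX[r[2]] for r in rows],       # window index per row
        "difficulty": [DIFFICULTY_CODE[r[1]] for r in rows],
        "scores": [r[4] for r in rows],
    }


# The six example rows shown above.
sample = [
    (1, "EASY", "0s", 1, 0.53),
    (1, "EASY", "2s", 1, 0.32),
    (1, "EASY", "4s", 1, 0.12),
    (1, "EASY", "6s", 1, 0.05),
    (1, "HARD", "0s", 2, 0.26),
    (1, "HARD", "2s", 2, -0.19),
]
stan_data = encode_rows(sample)
```

Note that `trial_number` is carried in each row but not passed on, since the model attempt further down does not use it.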

### Notes

- Conditions were randomized for each subject across trials.
- Each subject performed about 100 trials, but the dataset is not fully balanced as there were more EASY trials than HARD trials.
- Similarity scores vary in the [-2;+2] interval.

# Scientific questions

I would like to answer the following two scientific questions:

- Is there an effect of trial difficulty (EASY vs HARD)?
- Is this effect present for all time points?

# Model specification questions

I have tried to find inspiration in this excellent tutorial paper by Sorensen and colleagues [1], whose worked example is somewhat similar to my paradigm.

However, I am struggling to properly take into account the `time` variable, and specifically that there will be a serial correlation between consecutive time windows for a given trial.

# Stan model specification attempt

Here is what I have attempted so far, based on the example in [1].

I am afraid that this model neglects the serial correlation between time points.

```
data {
  int<lower=1> N;                          //number of data points
  real<lower=-2, upper=2> scores[N];       //scores
  real<lower=-1, upper=1> difficulty[N];   //predictor
  int<lower=1> J;                          //number of subjects
  int<lower=1> K;                          //number of time points per trial
  int<lower=1, upper=J> sub[N];            //subject id
  int<lower=1, upper=K> win[N];            //time window index (1..K)
}
parameters {
  vector[2] beta;                          //fixed intercept, slope
  real<lower=0> sigma_e;                   //residual sd
  matrix[2, J] u;                          //subj intercepts, slopes
  vector<lower=0>[2] sigma_u;              //subj sds
  matrix[2, K] w;                          //window intercepts, slopes
  vector<lower=0>[2] sigma_w;              //window sds
}
model {
  real mu;
  //priors
  u[1] ~ normal(0, sigma_u[1]);            //subj intercepts
  u[2] ~ normal(0, sigma_u[2]);            //subj slopes
  w[1] ~ normal(0, sigma_w[1]);            //time intercepts
  w[2] ~ normal(0, sigma_w[2]);            //time slopes
  //likelihood
  for (i in 1:N) {
    mu = beta[1] + u[1, sub[i]] + w[1, win[i]] +
         (beta[2] + u[2, sub[i]] + w[2, win[i]]) * difficulty[i];
    scores[i] ~ normal(mu, sigma_e);
  }
}
```
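Written out as equations, my reading of the code above is the following. The notation is my own rather than taken from [1]: $j[i]$ is the subject and $k[i]$ the time window of observation $i$, $d_i$ is the difficulty predictor, and the second argument of each normal distribution is a standard deviation, as in Stan.

$$
\begin{aligned}
\text{score}_i &\sim \mathcal{N}(\mu_i, \sigma_e) \\
\mu_i &= \beta_1 + u_{1,j[i]} + w_{1,k[i]} + \left(\beta_2 + u_{2,j[i]} + w_{2,k[i]}\right) d_i \\
u_{1,j} &\sim \mathcal{N}(0, \sigma_{u,1}), \quad u_{2,j} \sim \mathcal{N}(0, \sigma_{u,2}) \\
w_{1,k} &\sim \mathcal{N}(0, \sigma_{w,1}), \quad w_{2,k} \sim \mathcal{N}(0, \sigma_{w,2})
\end{aligned}
$$

In this form the residuals are independent given the subject and window effects, and the trial number does not appear anywhere, so nothing links neighbouring time windows within the same trial.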

Any help or pointers you can give to steer this in the right direction will be greatly appreciated.

Best,

Marco

# References

[1] Sorensen, T., Hohenstein, S., & Vasishth, S. (2016). Bayesian linear mixed models using Stan: A tutorial for psychologists, linguists, and cognitive scientists. *The Quantitative Methods for Psychology*, *12*(3), 175–200. doi:10.20982/tqmp.12.3.p175

[2] https://www.r-bloggers.com/fitting-bayesian-linear-mixed-models-for-continuous-and-binary-data-using-stan-a-quick-tutorial/