I have observations of a quantity, with each observation labelled by “agent” and by “time”. Enough data that I have many samples for each agent, and many samples for each time. But most times do not contain a sample for all agents, and the subsets of players sampled at each time are unique to each time.
My belief is that the data from each agent is characterized by a normal distribution whose mean and standard deviation are fixed through time, but that these characteristics vary across players. (I’m comfortable setting priors on the distribution of means across agents as well as the distribution of standard deviations across agents.)
I also believe that the correlation structure between agents is fixed through time and remains valid for any given time regardless of which subset of agents I actually have data about for that time. I’m also comfortable setting a prior on the full correlation matrix. (At least if implemented directly, but could use a poke in the right direction if the implantation should really be done with Cholesky factors.)
My question is how to reasonably code this model.
Essentially, for each time, my data is multivariate normal and characterized by the mean and std parameters for each “observed” agent as well as the portion of the whole correlation matrix that applies to the “observed” agents for that time.
Is the best way to handle this to simply code up a for loop (over times) in the model section and (within the loop) declaring a “small” correlation matrix that drills down into the full correlation matrix that is being sampled (filling in the elements one at a time using more for loops over the two dimensions of the matrix)? Is there a good way to incorporate cholesky factors into this discussion to speed up the execution?
For rough numbers: maybe 1,000 to 10,000 total data points representing 20-200 agents and 50-500 times.