Modelling a likelihood that changes based on auxiliary variables

Balexanderstats · January 21, 2021, 6:18pm

I am using Stan to fit a multilevel regression model to repeated-measures survey data. The catch is that there are two auxiliary variables and not everyone was measured at every time point. Basically, I’m combining two public opinion surveys with repeated measurements. The first auxiliary variable denotes the time points at which an observation was measured. No respondent is measured at every time point, but there are observations at every time point. The second variable denotes a demographic group. If an individual is measured more than once, that respondent’s observations are correlated, and the correlation needs to be accounted for in the likelihood. For a fixed value of both auxiliary variables, the likelihood is normal or multivariate normal. The (co)variance of the error term in the regression depends on demographic characteristics.
Basically, my likelihood looks like a mixture of normals and multivariate normals, except that the distribution each observation belongs to is known. What is the best way to specify this likelihood in Stan? I was considering writing a custom likelihood function, but is there some way to avoid that by using indicator variables and if statements, since the likelihood is a standard distribution conditional on the auxiliary variables?
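
Because the distribution each respondent belongs to is known, the log-likelihood is a plain sum over respondents rather than a mixture, and each term can branch on how many time points were observed. Below is a minimal sketch of that structure in plain Python (not Stan). It rests on assumptions that are not in the post: one hypothetical full covariance matrix per demographic group (`cov_by_group`), indexed by time point, from which the sub-matrix for the observed times is taken, and a hypothetical per-time mean vector `mean` standing in for the regression prediction. All names and numbers are illustrative.

```python
import math


def normal_logpdf(y, mu, sigma):
    """Log density of a univariate normal with standard deviation sigma."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def mvn_logpdf(y, mu, cov):
    """Log density of a multivariate normal, computed via the Cholesky factor."""
    n = len(y)
    L = cholesky(cov)
    # Forward substitution: solve L z = (y - mu).
    z = []
    for i in range(n):
        s = (y[i] - mu[i]) - sum(L[i][k] * z[k] for k in range(i))
        z.append(s / L[i][i])
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    quad = sum(v * v for v in z)
    return -0.5 * quad - 0.5 * log_det - 0.5 * n * math.log(2.0 * math.pi)


def log_likelihood(respondents, mean, cov_by_group):
    """Sum per-respondent terms, branching on the known observation pattern."""
    total = 0.0
    for r in respondents:
        t = r["times"]                   # indices of the time points observed
        cov = cov_by_group[r["group"]]   # assumed: full covariance for this group
        mu = [mean[k] for k in t]
        if len(t) == 1:
            # Measured once: univariate normal with that time point's variance.
            total += normal_logpdf(r["y"][0], mu[0], math.sqrt(cov[t[0]][t[0]]))
        else:
            # Measured repeatedly: multivariate normal on the observed sub-block.
            sub = [[cov[a][b] for b in t] for a in t]
            total += mvn_logpdf(r["y"], mu, sub)
    return total


# Hypothetical data: three time points, two demographic groups.
mean = [0.0, 0.5, 1.0]
cov_by_group = {
    "a": [[1.0, 0.3, 0.1], [0.3, 1.0, 0.3], [0.1, 0.3, 1.0]],
    "b": [[2.0, 0.8, 0.4], [0.8, 2.0, 0.8], [0.4, 0.8, 2.0]],
}
respondents = [
    {"group": "a", "times": [0], "y": [0.2]},
    {"group": "b", "times": [0, 2], "y": [-0.4, 1.3]},
]
print(log_likelihood(respondents, mean, cov_by_group))
```

In Stan the same branching would presumably be a loop over respondents that adds `normal_lpdf` or `multi_normal_lpdf` terms to `target`, which avoids writing a custom density.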

mike-lawrence · January 22, 2021, 4:58pm

If I understand correctly, this sounds like a scenario of repeated measures with missing data. Check out this lecture/tutorial on how I’d approach that scenario.

Balexanderstats · January 29, 2021, 8:36pm

This scenario isn’t exactly repeated measures with missing data, although your approach would work in that scenario. The catch with this data set is that the gap between waves differs across the surveys: one survey has two waves six months apart and the other has six waves one month apart. Also, the focus is not on the individuals so much as on the population values. And the missing data isn’t random: it is missing by design, because the person wasn’t in the other survey.
