Combining data from multiple sources to model the same parameter

Hi,
My question is not just related to Stan; it is also a statistical problem that I don’t fully understand. I know that some of you modelers, who are much more talented than me, often write up detailed examples of modeling procedures and provide accompanying code in Stan (example). These examples have helped me out a lot, and I’m hoping that someone is interested in taking on this problem in a similar manner.
My problem is that I have two different datasets that provide information about a response variable, and I have no reason to believe that either dataset is better than the other (at least not in all instances). Here is a paper that deals with this issue. They outline a modeling procedure for dealing with multiple data sources, but if you look at the section METHODS: Shared model, they define their latent Z parameter using two different data sources (two different Y values). It seems to me like there is an identifiability issue here, but I have not found many other relevant examples of combining data sources to evaluate the same latent variable in the literature.
If anyone has the desire to write this up as a modeling/Stan example I would greatly appreciate it. I’m sure that others have a similar issue and they would appreciate this as well.
Thank you,
Mikey

Defining one latent variable from two data sources is both fine and often a good idea, especially if the two measurements differ in how well they inform different parameters. If the two datasets are somehow interdependent, you might need to model that dependence, but that doesn’t sound like your situation.
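
To make the intuition concrete, here is a minimal sketch in plain Python rather than Stan. It is not the model from the paper you linked. It assumes each source observes the same latent value z with independent Gaussian noise, that the noise standard deviations are known, and that the prior on z is flat. The function name `combine` and the numbers are made up for illustration. Under those assumptions the two likelihoods multiply, and the posterior for z is a normal distribution centered on the precision-weighted average of the two observations.

```python
import math


def combine(y1, sd1, y2, sd2):
    """Posterior mean and sd of a shared latent z under a flat prior.

    Assumes y1 ~ Normal(z, sd1) and y2 ~ Normal(z, sd2), independent,
    with sd1 and sd2 known (an assumption made for this sketch).
    """
    w1 = 1.0 / sd1 ** 2  # precision of source 1
    w2 = 1.0 / sd2 ** 2  # precision of source 2
    post_mean = (w1 * y1 + w2 * y2) / (w1 + w2)
    post_sd = 1.0 / math.sqrt(w1 + w2)
    return post_mean, post_sd


# Hypothetical observations of the same quantity from two sources.
post_mean, post_sd = combine(y1=10.0, sd1=1.0, y2=12.0, sd2=2.0)
print(post_mean, post_sd)
```

The combined estimate sits between the two observations, closer to the less noisy one, and its posterior sd is smaller than either source's sd alone. That is the sense in which a second data source adds information about z rather than creating an identifiability problem. In Stan, the analogous structure is two sampling statements that share the same latent parameter.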