I am trying to figure out if it is possible, and acceptable, to fit two models using the same outcome variable, with one shared parameter.
If I understand correctly, the problem would be that keeping the two likelihoods independent of each other would bias the log likelihood, is that right?
If the assumption of the model is that some of the likelihood is explained by model A and some by model B, I think some degree of mixing is required. Implementing both as if they were independent likelihoods but using the same data twice would only tell you how well each one individually explains the same data – that is, there wouldn’t be anything joint about it. However, in your second formulation it looks like there is a hard assumption that they contribute equally to explaining the data, which may not be the case.

I am not an expert by any means in this kind of model weighting, but I would think that the degree to which each model individually contributes to explaining the same data would itself be a free parameter of the model, over which you have some prior assumption. In that case you might use a Uniform(0, 1) prior, or a beta distribution with whatever shape you assume the prior should have (such as alpha = beta > 1 for an expected value of 0.5). I am hopefully drawing some attention to your question so that someone more senior than myself can answer, but this is my intuition about your question.
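To make that concrete, here is a minimal sketch in Python/SciPy. The two normal components (and the names `mu_a`, `mu_b`, `sigma`) and the Beta(2, 2) shape are placeholders of mine, not your models; the point is just that the weight `w` is a free parameter with a prior over (0, 1):

```python
import numpy as np
from scipy import stats
from scipy.special import logsumexp

def mixture_logpost(y, mu_a, mu_b, sigma, w):
    """Log density of a two-component mixture where a free weight w in
    (0, 1) encodes how much of the data each model explains.
    Both components are placeholder normals, not the actual models."""
    log_a = stats.norm.logpdf(y, mu_a, sigma)  # model A's per-point log density
    log_b = stats.norm.logpdf(y, mu_b, sigma)  # model B's per-point log density
    # Per observation: log(w * p_A(y_i) + (1 - w) * p_B(y_i)),
    # computed stably on the log scale.
    loglik = logsumexp(
        np.stack([np.log(w) + log_a, np.log1p(-w) + log_b]), axis=0).sum()
    # Beta(2, 2) prior: alpha = beta > 1, so the prior mean of w is 0.5.
    # With a Uniform(0, 1) prior you would simply drop this term.
    return loglik + stats.beta.logpdf(w, 2.0, 2.0)
```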
What you’re saying about the mixture model is what I was also thinking, and depending on the goal of the model I agree it would typically be good to treat the mixture ratio as a parameter.
I’m interested in what you’re saying about using the two models independently.
That would indeed not make them joint, but that is actually what I need for the model I am working on.
The a parameter is a correction for false-negative results, and theta is the “true proportion” that must be used to estimate a correctly. a is then used to get an unbiased estimate of mu (a regression component in my model). The problem is that when I do this with a mixture-model approach, a is not estimated correctly because of non-identifiability with mu.
For that reason I kept the two models separate, but I was told that this violates the basic statistical assumption that a variable should not be used more than once as an outcome variable in the same model, or even in two separately fit models.
But based on what you’re saying, am I understanding correctly that you believe this is actually not problematic to do, if that is what is required to model the data correctly?
I think it uses your evidence twice, which is not strictly kosher, and will probably make model comparison impossible, but since the two likelihoods don’t interact in any way, I don’t know that it should harm the recoverability of either parameter. After all, you’re just adding their log densities together to drive the model forward. But if you need them fitted separately, I don’t really see a reason, prima facie, to run them in one model rather than as two completely separate programs.
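To illustrate what I mean, here is a minimal hypothetical sketch (the normal likelihoods and priors are placeholders, not your model). Because the two blocks share no parameters, the joint density factorizes, so the marginal posterior of each parameter is exactly what a separate fit would give:

```python
import numpy as np
from scipy import stats

def joint_logpost(y, theta, mu):
    """Two sub-models over the same outcome y, with no shared parameters.
    The joint density factorizes as p(y|theta)p(theta) * p(y|mu)p(mu),
    so the marginals for theta and mu match two completely separate fits."""
    lp_a = (stats.norm.logpdf(y, loc=theta, scale=1.0).sum()
            + stats.norm.logpdf(theta, loc=0.0, scale=5.0))  # sub-model A
    lp_b = (stats.norm.logpdf(y, loc=mu, scale=1.0).sum()
            + stats.norm.logpdf(mu, loc=0.0, scale=5.0))     # sub-model B
    return lp_a + lp_b  # the only interaction is addition on the log scale
```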
That’s what I was thinking, yes. The problem is that I am having trouble convincing a reviewer of this, and I’m looking for a way to argue it properly, if it is indeed OK to do.
Brief addition/clarification: the model includes another variable that informs theta. Theta should be treated as a well-informed parameter, while mu is a regression component with coefficients that need to be estimated.
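To make the setup concrete, here is a rough sketch of the structure in Python/SciPy. Everything beyond the roles of theta, a, and mu is an assumption of mine for illustration – the binary outcome, the functional forms, the way a is derived from theta, and names like `z`, `gamma`, and `beta` – and it uses maximum likelihood only to keep the sketch dependency-free; the same two-fit structure applies to two separately run Bayesian programs:

```python
import numpy as np
from scipy import stats, optimize
from scipy.special import expit

# Toy data; all shapes and functional forms are invented placeholders.
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=n)            # the extra variable that informs theta
X = rng.normal(size=(n, 2))       # design matrix for the mu regression
y = rng.binomial(1, 0.4, size=n)  # the shared outcome variable

# Fit 1: theta, the "true proportion", informed by z.
def negloglik_1(gamma):
    theta = np.clip(expit(gamma[0] + gamma[1] * z), 1e-9, 1 - 1e-9)
    return -stats.bernoulli.logpmf(y, theta).sum()  # first use of y

gamma_hat = optimize.minimize(negloglik_1, np.zeros(2)).x
theta_hat = expit(gamma_hat[0] + gamma_hat[1] * z)
a_hat = 1.0 / theta_hat.mean()  # placeholder false-negative correction

# Fit 2: the same y again; the regression component mu is corrected
# by the now-fixed a_hat.
def negloglik_2(beta):
    mu = np.clip(expit(beta[0] + X @ beta[1:]) ** a_hat, 1e-9, 1 - 1e-9)
    return -stats.bernoulli.logpmf(y, mu).sum()     # second use of y

beta_hat = optimize.minimize(negloglik_2, np.zeros(3)).x
# Because a_hat is frozen before fit 2, nothing in fit 2 can pull theta
# (and hence a) around, which is what sidesteps the a/mu identifiability
# problem of the joint mixture formulation.
```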