I am trying to figure out if it is possible, and acceptable, to fit two models using the same outcome variable, with one shared parameter.
If I understand correctly, the problem would be that keeping the two likelihoods independent of each other would bias the log likelihood, is that right?
If the assumption of the model is that some of the likelihood is explained by model A and some by model B, I think some degree of mixing is required. Implementing both as if they were independent likelihoods but using the same data twice would only tell you how well each one individually explains the same data – that is, there wouldn’t be anything joint about it. However, in your second formulation it looks like there is a hard assumption that they contribute equally to explaining the data, which may not be the case.

I am not an expert by any means in this kind of model weighting, but I would think that the degree to which each model individually contributes to explaining the same data would itself be a free parameter of the model, over which you have some prior assumption. In that case you might use a Uniform(0, 1) prior, or a beta distribution with whatever shape you assume the prior should have (such as alpha = beta > 1 for an expected value of 0.5). I am hopefully drawing some attention to your question so that someone more senior than myself can answer, but this is my intuition about your question.
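To make that concrete, here is a minimal sketch in Python/SciPy. The two normal components (and the names `mu_a`, `mu_b`, `sigma`) and the Beta(2, 2) shape are placeholders of mine, not your models; the point is just that the weight `w` is a free parameter with a prior over (0, 1):

```python
import numpy as np
from scipy import stats
from scipy.special import logsumexp

def mixture_logpost(y, mu_a, mu_b, sigma, w):
    """Log density of a two-component mixture where a free weight w in
    (0, 1) encodes how much of the data each model explains.
    Both components are placeholder normals, not the actual models."""
    log_a = stats.norm.logpdf(y, mu_a, sigma)  # model A's per-point log density
    log_b = stats.norm.logpdf(y, mu_b, sigma)  # model B's per-point log density
    # Per observation: log(w * p_A(y_i) + (1 - w) * p_B(y_i)),
    # computed stably on the log scale.
    loglik = logsumexp(
        np.stack([np.log(w) + log_a, np.log1p(-w) + log_b]), axis=0).sum()
    # Beta(2, 2) prior: alpha = beta > 1, so the prior mean of w is 0.5.
    # With a Uniform(0, 1) prior you would simply drop this term.
    return loglik + stats.beta.logpdf(w, 2.0, 2.0)
```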
What you’re saying about the mixture model is what I was also thinking, and depending on the goal of the model I agree it would typically be good to treat the mixture ratio as a parameter.
I’m interested in what you’re saying about using the two models independently.
That would indeed not make them joint, but that is actually what I need for the model I am working on.
The a parameter is a correction for false-negative results, and theta is the “true proportion” that must be used to estimate a correctly. a is then used to get an unbiased estimate of mu (a regression component in my model). The problem is that when I do this with a mixture-model approach, a is not estimated correctly because of non-identifiability with mu.
For that reason I kept the two models separate, but I was told that this violates the basic statistical assumption that a variable should not be used more than once as an outcome variable in the same model, or even in two separately fit models.
But based on what you’re saying, am I understanding correctly that you believe this is actually not problematic to do, if that is what is required to model the data correctly?
I think it uses your evidence twice, which is not strictly kosher, and will probably make model comparison impossible, but since the two likelihoods don’t interact in any way, I don’t know that it should harm the recoverability of either parameter. After all, you’re just adding their log densities together to drive the model forward. But if you need them fitted separately, I don’t really see a reason, prima facie, to run them in one model rather than as two completely separate programs.
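To illustrate what I mean, here is a minimal hypothetical sketch (the normal likelihoods and priors are placeholders, not your model). Because the two blocks share no parameters, the joint density factorizes, so the marginal posterior of each parameter is exactly what a separate fit would give:

```python
import numpy as np
from scipy import stats

def joint_logpost(y, theta, mu):
    """Two sub-models over the same outcome y, with no shared parameters.
    The joint density factorizes as p(y|theta)p(theta) * p(y|mu)p(mu),
    so the marginals for theta and mu match two completely separate fits."""
    lp_a = (stats.norm.logpdf(y, loc=theta, scale=1.0).sum()
            + stats.norm.logpdf(theta, loc=0.0, scale=5.0))  # sub-model A
    lp_b = (stats.norm.logpdf(y, loc=mu, scale=1.0).sum()
            + stats.norm.logpdf(mu, loc=0.0, scale=5.0))     # sub-model B
    return lp_a + lp_b  # the only interaction is addition on the log scale
```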
That’s what I was thinking, yes. The problem is that I am having trouble convincing a reviewer of this, and I’m looking for a way to argue it properly, if it is indeed OK to do.
Brief addition/clarification: the model includes another variable that informs theta. Theta should be treated as a well-informed parameter, while mu is a regression component with coefficients that need to be estimated.
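To make the setup concrete, here is a rough sketch of the structure in Python/SciPy. Everything beyond the roles of theta, a, and mu is an assumption of mine for illustration – the binary outcome, the functional forms, the way a is derived from theta, and names like `z`, `gamma`, and `beta` – and it uses maximum likelihood only to keep the sketch dependency-free; the same two-fit structure applies to two separately run Bayesian programs:

```python
import numpy as np
from scipy import stats, optimize
from scipy.special import expit

# Toy data; all shapes and functional forms are invented placeholders.
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=n)            # the extra variable that informs theta
X = rng.normal(size=(n, 2))       # design matrix for the mu regression
y = rng.binomial(1, 0.4, size=n)  # the shared outcome variable

# Fit 1: theta, the "true proportion", informed by z.
def negloglik_1(gamma):
    theta = np.clip(expit(gamma[0] + gamma[1] * z), 1e-9, 1 - 1e-9)
    return -stats.bernoulli.logpmf(y, theta).sum()  # first use of y

gamma_hat = optimize.minimize(negloglik_1, np.zeros(2)).x
theta_hat = expit(gamma_hat[0] + gamma_hat[1] * z)
a_hat = 1.0 / theta_hat.mean()  # placeholder false-negative correction

# Fit 2: the same y again; the regression component mu is corrected
# by the now-fixed a_hat.
def negloglik_2(beta):
    mu = np.clip(expit(beta[0] + X @ beta[1:]) ** a_hat, 1e-9, 1 - 1e-9)
    return -stats.bernoulli.logpmf(y, mu).sum()     # second use of y

beta_hat = optimize.minimize(negloglik_2, np.zeros(3)).x
# Because a_hat is frozen before fit 2, nothing in fit 2 can pull theta
# (and hence a) around, which is what sidesteps the a/mu identifiability
# problem of the joint mixture formulation.
```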