Hello,
I’m having some conceptual trouble defining the likelihood. Assume that I want my likelihood to be a measure of the closeness of my data to the parameters; that is, I have a distance metric that computes the distance between them. Is there a way to represent this as a likelihood in Stan?
functions {
  // Y is assumed to be a vector here; its type was not given
  real likelihood(vector Y, real param) {
    real D = dist(Y, param); // assume a black-box distance function
    // how do I encode this as a distribution?
  }
}
I am not sure I understand the problem you are having, and I may be explaining something you already know, but in a sense the likelihood is indeed a measure of “closeness” between model and data. Unlike a least-squares distance or a loss function used in some machine learning methods, the likelihood is a self-contained concept that is always tied to a statistical distribution. In other words, the likelihood is the probability (or probability density) of the observed data under a statistical distribution with some parameters (from some model), viewed as a function of those parameters.
The easiest approach is usually to find a common distribution that can be used with your model and data (that’s what most models do). Can’t you do that? If that’s not possible or desirable, you’d have to see whether your problem fits a likelihood-based framework or whether you need something else (if you are just minimizing some arbitrary function, you may not be able to use a Bayesian framework to sample the posterior distribution).
To your specific problem: it seems you have some distance function that is not a distribution and you would like to use it. You can use anything as a likelihood, as long as it can be written as a distribution (by normalizing the density so that it integrates to one over its support, etc.). Without further details about the model, it’s hard to know what kind of issue you really need to solve.
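As a rough sketch of what “writing the distance as a distribution” could look like, here is one possible construction, not the only one. It assumes the black-box dist from the question is the Euclidean distance between the data and a vector filled with param, and it assumes a Gaussian kernel exp(-D^2 / (2 sigma^2)). The names dist_lpdf and sigma, and both priors, are hypothetical choices made only for illustration.

```stan
functions {
  // Hypothetical stand-in for the black-box distance: Euclidean distance
  // between the data and a constant vector filled with param.
  real dist(vector Y, real param) {
    return distance(Y, rep_vector(param, num_elements(Y)));
  }

  // Log density built from the distance, assuming a Gaussian kernel.
  // The -N * log(sigma) term is the part of the normalizing constant
  // that depends on a parameter, so it cannot be dropped.
  real dist_lpdf(vector Y, real param, real sigma) {
    real D = dist(Y, param);
    return -0.5 * square(D / sigma) - num_elements(Y) * log(sigma);
  }
}
data {
  int<lower=1> N;
  vector[N] Y;
}
parameters {
  real param;
  real<lower=0> sigma; // assumed scale parameter, not in the original post
}
model {
  // Placeholder priors, chosen only so the sketch is complete.
  param ~ normal(0, 10);
  sigma ~ normal(0, 5);
  target += dist_lpdf(Y | param, sigma);
}
```

With these particular assumptions the result is the same as Y ~ normal(param, sigma) up to an additive constant, which illustrates the point above: a squared Euclidean distance already corresponds to a common distribution. For a different distance, the hard part is working out any normalizing term that depends on the parameters; terms that are constant in the parameters can be left out, since Stan only needs the log density up to an additive constant.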