I’m modeling the percentage change in blood oxygen levels from a particular experiment. My prior, before seeing the data, was an inverse Gaussian distribution. But my data (the response variable) contain some negative values, and the Inverse.gaussian family doesn’t accept negative values. How should I go about this?
My values range from -23 to 40, with a mean of 4 and a median of 3.5. (These are also repeated-measures data.)
Just to clarify things, the distribution of the data conditional on the parameters is not the prior. What you are looking for is a sampling distribution or likelihood. Now, on to the question proper: have you tried a Gaussian (normal)? Can you show a histogram of the data?
Yes, I tried it with a Gaussian (normal) and it works.
Maybe my understanding of the family function is incorrect… So, my prior for this data (before seeing the data) was inverse Gaussian. @maxbiostat
Conceptually, there’s no such thing as a prior for data*. In a Bayesian analysis, we have the prior on the parameters \pi(\theta) and the likelihood f(x \mid \theta), which is the conditional distribution of the data x given the parameters \theta. The posterior distribution of \theta given x and your choice of likelihood/sampling distribution is
p(\theta \mid x) = \frac{f(x \mid \theta)\,\pi(\theta)}{\int f(x \mid \theta')\,\pi(\theta')\,d\theta'} \propto f(x \mid \theta)\,\pi(\theta).
So, in your case you could say that your likelihood was an inverse-Gaussian with unknown parameters \mu and \lambda. But as you note, this likelihood is a poor choice because it gives probability zero to data that were actually observed.
*Sometimes we talk about a prior predictive distribution for x, but let’s leave that aside for the sake of clarity.
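To make the "probability zero" point concrete, here is a minimal sketch in plain Python. The observations, the parameter values (`mu`, `lam`, `sigma`), and the helper function names are all made up for illustration; only the range (-23, 40) comes from the thread. It compares the log-likelihood under an inverse Gaussian with that under a normal.

```python
import math

def inverse_gaussian_logpdf(x, mu, lam):
    """Log density of the inverse Gaussian; its support is x > 0 only."""
    if x <= 0:
        return -math.inf  # zero density outside the support
    return (0.5 * math.log(lam / (2 * math.pi * x**3))
            - lam * (x - mu) ** 2 / (2 * mu**2 * x))

def normal_logpdf(x, mu, sigma):
    """Log density of the normal; its support is the whole real line."""
    return (-0.5 * math.log(2 * math.pi * sigma**2)
            - (x - mu) ** 2 / (2 * sigma**2))

# Hypothetical observations spanning the reported range (-23, 40).
y = [-23.0, -5.0, 3.5, 4.0, 12.0, 40.0]

# Parameter values are arbitrary; the conclusion holds for any mu, lam > 0.
ig_loglik = sum(inverse_gaussian_logpdf(v, mu=4.0, lam=1.0) for v in y)
normal_loglik = sum(normal_logpdf(v, mu=4.0, sigma=10.0) for v in y)

print(ig_loglik)      # -inf: a single negative value is enough
print(normal_loglik)  # a finite number
```

Whatever values of \mu and \lambda are tried, the inverse-Gaussian log-likelihood stays at minus infinity as long as one observation is non-positive, so no posterior can be formed from it.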
@maxbiostat some doubts (and thank you for giving your time, I’m now getting more clarity):
1. How I understood it was that we provide a prior over the parameters, which is usually specified as a distribution? That is, the parameter values are drawn from that distribution… so what we specify as priors are, in the end, distributions?
2. I don’t understand how my likelihood is inverse Gaussian instead of, maybe, Gaussian, while keeping the prior inverse Gaussian?
3. So, is it that for modelling we only need to look at the likelihood… and what happens to the prior?
Nevertheless, I will try to answer them here, for completeness. I invite @martinmodrak, @betanalpha, @andrewgelman and others to complement/correct my statements.
1. The prior is a probability distribution over the parameters, \theta, which are by definition not observed.
2. The likelihood is the distribution of the data x conditional on the parameters. Here the x are observed, and are shown in your histogram. So you need to choose a distribution that is compatible with the observed data. If you have negative values, you cannot use an inverse Gaussian because that distribution does not admit negative values.
3. No, we need to look at both prior and likelihood if we want to do a Bayesian analysis, but first we need to have a clear idea of what are data and what are parameters so we can specify the model correctly.
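As a rough illustration of how the prior and the likelihood both enter the analysis, here is a minimal grid-approximation sketch in plain Python. Everything in it is assumed for the example: the four data values, a normal likelihood with known `sigma = 10`, and a Normal(0, 20) prior on the mean `mu`. It is not the model for the oxygen data, just the mechanics of "prior times likelihood, then normalize".

```python
import math

def normal_logpdf(x, mu, sigma):
    """Log density of a normal distribution."""
    return (-0.5 * math.log(2 * math.pi * sigma**2)
            - (x - mu) ** 2 / (2 * sigma**2))

y = [-5.0, 3.5, 4.0, 12.0]  # hypothetical observations
sigma = 10.0                # assumed known, to keep the sketch one-dimensional

# Candidate values for the unknown parameter mu, from -30 to 30.
grid = [i * 0.1 for i in range(-300, 301)]

# Unnormalized log posterior = log prior (on mu) + log likelihood (of y given mu).
log_post = [
    normal_logpdf(mu, 0.0, 20.0) + sum(normal_logpdf(v, mu, sigma) for v in y)
    for mu in grid
]

# Normalize on the grid, subtracting the maximum for numerical stability.
m = max(log_post)
weights = [math.exp(lp - m) for lp in log_post]
total = sum(weights)
posterior = [w / total for w in weights]

post_mean = sum(mu * p for mu, p in zip(grid, posterior))
print(round(post_mean, 2))
```

The prior is placed on `mu`, which is never observed, while the likelihood is evaluated at the observed `y`. Swapping in a likelihood that assigns zero density to any observed value would make every entry of `log_post` minus infinity.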
What you need to do right now is to take a step back, state your problem clearly and then we can proceed with model building. What is the scientific question you want to answer? What exactly did you measure? How many measurements were made? How many per unit/patient?
Hi–strictly speaking, p(y|theta) is the distribution of the data given the parameters. The likelihood is a function of theta that is proportional to the data distribution. Different data distributions can have the same likelihood, as we discuss in chapter 6 of BDA.
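A standard textbook illustration of the last point (not necessarily the example used in BDA) is binomial versus negative-binomial sampling. The sketch below assumes 3 successes in 12 trials, observed either with the number of trials fixed in advance or by sampling until the third success; the function names are just for this example.

```python
import math

def binomial_pmf(y, n, theta):
    """P(y successes in n trials), with n fixed in advance."""
    return math.comb(n, y) * theta**y * (1 - theta) ** (n - y)

def negbinomial_pmf(n, y, theta):
    """P(the y-th success occurs on trial n), with y fixed in advance."""
    return math.comb(n - 1, y - 1) * theta**y * (1 - theta) ** (n - y)

# Same observed outcome under two different data distributions.
thetas = [0.1, 0.25, 0.5, 0.75]
ratios = [binomial_pmf(3, 12, t) / negbinomial_pmf(12, 3, t) for t in thetas]

# The ratio does not depend on theta: it equals C(12,3) / C(11,2) = 4,
# up to floating-point error.
print(ratios)
```

As functions of theta the two expressions differ only by a constant factor, so they define the same likelihood even though they are different distributions over the data.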