Recently, I’ve been working on estimating a mixed logit model. In this model, the intercepts are treated as fixed parameters.

I’m wondering how to set priors for fixed parameters.

Also, if priors are set on fixed parameters, the estimated posteriors of those parameters follow distributions. In this respect, what is the difference between fixed parameters and random coefficients, since both of them have posteriors that follow some distribution?

For example, assume that the model is y=\beta_0+\beta_1 x_1+\beta_2 x_2+\varepsilon. The intercept \beta_0 is assumed to be a fixed parameter, while \beta_1 and \beta_2 are assumed to be random coefficients. In this case, how should I set a prior for \beta_0?
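To make the setup concrete, here is a minimal simulation sketch of such a model. All names and values (the sample sizes, the value 1.5 for \beta_0, the mixing distributions for \beta_1 and \beta_2) are illustrative assumptions, not part of the question:

```python
import numpy as np

rng = np.random.default_rng(0)

n_people, n_obs = 200, 10  # illustrative sizes

# Fixed parameter: one single value shared by everyone.
beta0 = 1.5

# Random coefficients: one draw per person from a mixing distribution.
beta1 = rng.normal(0.8, 0.3, size=n_people)
beta2 = rng.normal(-0.5, 0.2, size=n_people)

x1 = rng.normal(size=(n_people, n_obs))
x2 = rng.normal(size=(n_people, n_obs))
eps = rng.normal(scale=0.1, size=(n_people, n_obs))

# y_ij = beta0 + beta1_i * x1_ij + beta2_i * x2_ij + eps_ij
y = beta0 + beta1[:, None] * x1 + beta2[:, None] * x2 + eps
```

The contrast is visible in the shapes: `beta0` is a scalar, while `beta1` and `beta2` are vectors with one entry per decision maker.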

If the prior for \beta_0 is assumed to follow a normal distribution with a large variance, is the estimated posterior mean of \beta_0 regarded as the point estimate of \beta_0?

My biggest concern is that in Bayesian methods there are no fixed parameters, because all parameters to be estimated have posteriors. In this case, how can we interpret and estimate fixed parameters?

If a model parameter (I’m going to call it a coefficient in this context to avoid confusion with the Stan block definitions) is fixed, putting a prior on it will not change the results (the prior probability p(\beta_0) will be constant whatever the values of the other coefficients, since \beta_0 is constant), so it serves no actual purpose.
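A quick numeric sketch of this point (the data and the constant 0.3 are arbitrary choices for illustration): if \beta_0 is fixed, its prior density adds the same constant to the log posterior at every value of the free coefficient, so the shape of the posterior surface, and its maximizer, are unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(2.0, 1.0, size=50)  # toy data

def log_lik(mu):
    # Normal log likelihood with sigma = 1, up to an additive constant.
    return -0.5 * np.sum((y - mu) ** 2)

log_prior_const = np.log(0.3)  # any constant prior density for the fixed beta0

mus = np.linspace(1.0, 3.0, 5)
lp_without = np.array([log_lik(m) for m in mus])
lp_with = lp_without + log_prior_const  # the prior shifts every value equally

# Same argmax, surfaces differ only by a constant.
diff = lp_with - lp_without
```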

As @stemangiola mentioned, from a Stan perspective it is not possible to fix a variable in the parameters block, because by definition the parameters listed there are allowed to change in value. If you want to use a fixed coefficient you can pass it in through the data block in Stan, and there’s no need to specify a prior for it.

The book “Discrete Choice Methods with Simulation”, 2nd ed., by Kenneth E. Train, says something about estimating fixed coefficients using bayesian methods. See page 308, section 12.7.3, “Fixed coefficients for some variables”. It sets priors for fixed coefficients.

Also, if we set the prior density for a fixed coefficient to a positive constant, we implicitly assume a uniform prior density.

I think you need to clarify for us what you mean by fixed effects. Do you mean what @hhau described, which distinguishes “fixed” from “random” effects, or do you actually mean a coefficient that is fixed at a value, as seemed to be the case from your model description and as we initially understood?

I am not familiar with that book, so I’m not sure what the context or their exact description is, but I’m sure you’ll be able to clarify this point for us.

Assume that we are the God, and we create a model with the true values of all coefficients known to us. We know that one of the coefficients, \beta_0, is fixed. In this case, we generate fake data using the model, and then we estimate the coefficients from the generated data to see whether the estimates are close enough to their true values, which we know.

That is, the true value of \beta_0 is fixed, and it is set by us. How can we estimate it using Stan? How should we treat this fixed coefficient?

What I mean is that the coefficient is fixed at its true value, which we set ourselves and therefore know.

Ok, I understand what you are saying (although you don’t really need “the God” to do this, 10 lines of Python should do it). And from what I understand it’s neither of the options from all of our replies, but a more philosophical question that comes up often when you are doing bayesian statistics.

The short answer is: in bayesian inference, we never worry about what the “one true value” of the parameter is, because what you get is a distribution (the posterior), and that’s a good thing.
You can always compute a point estimate from it (the mean, median, or MAP estimate) that is more similar to the output of frequentist approaches, though doing so throws out the rest of the information available in the posterior.
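For instance, given draws from the posterior (simulated here as a stand-in; the location 1.5 and spread 0.1 are arbitrary), the usual point and interval summaries are one-liners:

```python
import numpy as np

rng = np.random.default_rng(2)
# Stand-in for MCMC draws of beta0 from a fitted model.
draws = rng.normal(1.5, 0.1, size=4000)

post_mean = draws.mean()            # posterior mean point estimate
post_median = np.median(draws)      # posterior median point estimate
ci = np.percentile(draws, [2.5, 97.5])  # interval summary a point estimate discards
```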

However, it is important not to mistake point estimates for certainty, and especially not for “truth”. Whether bayesian or frequentist, they are approximations, and even in a simulation setting where the true value is known there is never a guarantee that the inference method will recover it – that is true even with an exact method like least squares, because even when all assumptions hold the noise is random.

Thanks for your explanation. Now it’s clear to me that even if the true value of a coefficient is a constant, we can still set a vague prior for it and estimate its posterior, from which we can obtain a point estimate if we like. Thanks!

That’s right. In the best case scenario – if your model is well posed and the data is very informative – you will start with a vague prior and the posterior will be very precise around the “true” value.
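This best-case behaviour can be checked in closed form for a toy example (all numbers are illustrative): a vague normal prior on a fixed \beta_0, combined with many observations via the conjugate normal–normal update, gives a posterior that concentrates tightly around the true value.

```python
import numpy as np

rng = np.random.default_rng(3)
true_beta0 = 1.5          # the "true" fixed value we set ourselves
sigma = 1.0               # known observation noise
n = 10_000
y = rng.normal(true_beta0, sigma, size=n)

# Vague normal prior on beta0.
prior_mean, prior_sd = 0.0, 100.0

# Conjugate normal-normal update with known sigma:
# posterior precision = prior precision + data precision.
post_prec = 1 / prior_sd**2 + n / sigma**2
post_var = 1 / post_prec
post_mean = post_var * (prior_mean / prior_sd**2 + y.sum() / sigma**2)
post_sd = np.sqrt(post_var)
```

With informative data the posterior standard deviation shrinks toward sigma/sqrt(n), so the vague prior ends up contributing essentially nothing.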