Bayesian ordinal regression model (Empirical Bayes ordinal regression model)

Dear all,
I have the 2 sets of data of Family Well-being Survey.
The first data is survey data in year 2011, while the second is survey data in year 2016. The list of variables involved in this study are :
Dependent variable : Satisfaction level of family well-being
Independent variable : Strata, Ethnic, Family Type, Education level, Family Relationship, Family Economy, Family Health, Family Safety, Family and Community, Family and Religiosity, Family and Housing and Environment.

I have applied classical ordinal regression for may data. Now, I would like to apply Bayesian ordinal regression to do a comparison between the two approaches.

I want to apply Bayesian ordinal regression (empirical Bayes ordinal regression ) using the two data set in the same time, where want to used the first data as prior (empirical Bayes) and second data to estimate the posterior.

I have applied rstanarm package for my data but i got results with an error.

So, could any one assists me how can we define the first data as prior and how can I use it to estimate the posterior. since in rstanarm package is not clear or maybe i could not understand very well?

I would be grateful if any one could reply me to share with him the data and see the reason of error.

Your cooperation is highly appreciated

1 Like

Just fit both datasets at the same time!

Breaking this up into two inferences would probably require you to get some sort of approximation on the first posterior anyway (and probably quite a bit of fancy coding).

Thank you Dr. Ben for your response.

Could you please guide me how to do that? I need an example to understand and apply my data accordingly.

I did this but I gut error. Also, I could not calculate the AIC and BIC of this model.

data=BayesOrdinal1

fit3 <- stan_polr(FWS1~ Strata1+Ethnic1+FamType1+Edu1+Income1+Fam1+Eco1+Health1+Safety1
+Community1+Religios1+Housing1+Edu_R+Income1_R+ Fam1_R+Health1_R+ Safety_R
+Community1_R+Religious1_R+Eco1_R+Housing1_R+Ethnic1_R,
data = as.data.frame (BayesOrdinal1),
method = “probit”,
prior = R2(0.2, “mean”),
init_r = 0.1,
seed = 12345,
algorithm = “sampling”) # for speed only

summary(fit3)
print (fit3, digits=3)
plot (fit3)
fit3[25]

Regarding to prior = R2(0.2, “mean”), what is this prior ( is Gaussian prior or uniform or another informative prior) would like to know what is this prior.

Because I want to apply two priors (non informative prior, and prior using the bast data).
Need your help, please.

Best,

I don’t know. I assume it’s an r-squared prior: https://www.tandfonline.com/doi/full/10.1080/00031305.2018.1549100, Regression and Other Stories: Bayesian R^2

If you have a model that works with both your 2011 and 2016 data by themselves, just combine the two datasets into one (whatever that means in your case), and fit the model.

Start with something as simple as possible that works and add complexity. Like maybe:

fit3 <- stan_polr(FWS1 ~ 1, data = some_data)

And add complexity incrementally.

In general, don’t fix the seed (that’s for debuggin). Better to have your inferences start in random places. They should be robust to this.

I don’t need to combine them together.
I want to use the data of 2011 to estimate the prior the used 2016 to estimate the posterior.

If p(\theta | X_{2011}, X_{2016}) is the goal, then it’ll be easier to do it in one swing (by just writing down p(X_{2011}, X_{2016} | \theta)p(\theta)).

It is not easier to do the two inferences separately and use the 2011 inference as a prior for the second calculation.

This is because if you do the first inference, you don’t actually get p(\theta | X_{2011}), you get samples from that distribution. If you wanted to pass that in as a prior for the next inference, you’d need to get that distribution in closed form somehow. You don’t have that problem if you do both things together.

1 Like

OK, we can do p(\theta | X_{2011}, X_{2016}) , but which prior we will use? can we use non informative prior for the first run. and informative prior as a second run to do comparison.
Do we will use rstanarm package?

The same prior you would have used if you only had the 2011 data. The prior should represent your knowledge before you had the 2011 and 2016 data.

It’s probably easiest to do this wherever you run your 2011 regression.

That is sounds good.
Could please write an example code if it possible, to apply my data. I am still confusing.

Or let me know to share the data to do the first run.
Many thanks

Here’s a vignette on ordinal regression that should walk you through how to do this: https://cran.r-project.org/web/packages/rstanarm/vignettes/polr.html

1 Like

Thank you Dr. Ben

I have previously done this as follows

fit5 <- stan_polr(FWS1 ~ Strata1 + Ethnic1 + FamType1 + Edu1 +
Fam1 + Eco1 + Health1 + Safety1 + Community1 + Religios1 +
Housing1, data = as.data.frame(Ordinal1),
method = “logistic”, prior = R2(0.2, “mean”),
prior_counts = dirichlet(1), init_r = 0.1, algorithm = “sampling”)
fit5

However, my questions that need answers are:

How to calculate AIC for “fit5”?
How can we specify the prior or used uninformative prior.
How to allow the prior to be estimated through use of past data (2011) to estimate the posterior using data of 2016?

Highly appreciated

How to calculate AIC for Bayesian ordinal regression model in R (rstanarm packages)?

How can we specify the prior or used uninformative prior for Bayesian ordinal regression model in R (rstanarm packages)?

How to allow the prior to be estimated through use of past data (2011) to estimate the posterior using data of 2016 for Bayesian ordinal regression model in R (rstanarm packages)?