Modeling housing sales time series data

I am using brms to model housing sales data, but I am not sure how to model discrete time-series inventory data. Any feedback is greatly appreciated.

The data frame contains:

  1. project_name: string, a real estate development project
  2. sales: integer, the number of units sold for the project (e.g., 0, 1, 2, …, 20)
  3. sale_month: date, e.g., 2019-12
  4. remaining_units: integer, the number of remaining units in the inventory for the project

Usually sales data are treated as binomial or Poisson distributed. I may use zero_inflated_poisson or zero_inflated_negbinomial, as there are quite a few zeros (no sales in certain months). For simplicity, let’s say the data are Poisson or binomial distributed.

My question is how to model the autocorrelation of sales. brm has an autocor argument to set p and q for autocorrelation, but I am not sure how to set p and q correctly. Is there a method like auto.arima in brms to detect p and q for a multilevel Bayesian model? A tricky part is that different projects start and end in different months.

I am also not sure whether remaining_units (monotonically decreasing) affects the binomial model. For example, next month's remaining units = this month's remaining units - this month's sales.
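To make the inventory recursion concrete, here is a small base-R sketch with made-up numbers. The 20-unit total, the monthly sales, the start date, and the name dt_one are all hypothetical and not taken from my data; it also assumes remaining_units means the inventory at the start of each month.

```r
# Hypothetical project: 20 units at launch, made-up monthly sales.
total_units <- 20
sales <- c(3, 0, 5, 2, 0, 4)

# Units available at the start of each month:
# remaining[t + 1] = remaining[t] - sales[t]
remaining_units <- total_units - c(0, cumsum(sales)[-length(sales)])

dt_one <- data.frame(
  sale_month = seq(as.Date("2019-07-01"), by = "month", length.out = length(sales)),
  sales = sales,
  remaining_units = remaining_units
)

# Sanity checks: inventory never increases, and sales never exceed it.
stopifnot(all(diff(dt_one$remaining_units) <= 0))
stopifnot(all(dt_one$sales <= dt_one$remaining_units))
```

If remaining_units were instead recorded after the month's sales, the second check could fail, and the binomial trials would need to be remaining_units + sales.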

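library(brms)

# Multilevel binomial model: monthly sales out of the units still in inventory.
# Assumes remaining_units is the start-of-month inventory (trials must be >= sales).
b1 <-
  brm(data = dt, family = binomial,
      sales | trials(remaining_units) ~ 1 + (1 | project_name),
      prior = c(prior(normal(0, 1), class = Intercept),
                prior(cauchy(0, 1), class = sd)),
      iter = 10000, warmup = 1000)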

  • Operating System: macOS Catalina 10.15
  • brms Version:

Thanks for your time.

I suggest having a look at prophet, which produces Stan models and can do fancy footwork. If brms is a must, then this thread may be useful to you:

You could also consider a log-link Poisson model with an autoregressive component and Fourier terms for seasonality. The manual should get you started if you’re happy to use Stan:
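As a rough illustration of that idea, here is a base-R sketch using plain glm rather than Stan or brms, fitted to simulated data. The series, the 12-month period, the single harmonic, and the lag-1 term on log(sales + 1) are all assumptions chosen for the example, not a recommendation for the real data.

```r
set.seed(1)

# Simulated monthly counts with a yearly cycle (made-up data).
n <- 48
month_idx <- seq_len(n)
sales <- rpois(n, lambda = exp(1 + 0.5 * sin(2 * pi * month_idx / 12)))

# One pair of Fourier terms for a 12-month period.
s1 <- sin(2 * pi * month_idx / 12)
c1 <- cos(2 * pi * month_idx / 12)

# Autoregressive component: last month's log(sales + 1) as a predictor.
lag_log_sales <- c(NA, log(sales[-n] + 1))

# Drop the first row, which has no lagged value.
d <- data.frame(sales, s1, c1, lag_log_sales)[-1, ]

# Poisson regression with a log link on the lag and the Fourier terms.
fit <- glm(sales ~ lag_log_sales + s1 + c1,
           family = poisson(link = "log"), data = d)
coef(fit)
```

With several projects, the lag would have to be computed within each project, and a project-level intercept would take the place of the single intercept here.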

Thanks emiruz. Your suggestion is helpful.

The tricky part of my current project is that there are multiple time series: there are multiple real estate projects across the city, and the housing sales data differ from project to project.

I was planning to use brms because of its multilevel modelling support for non-time-series data. However, I am not sure whether brms or prophet supports multilevel time series modelling, such as using project as a random effect.

I don’t use brms myself, so perhaps others will be able to give you more of a sense of its limits. I thought brms gave up on explicit time series models, but I could be wrong.

You can do all the things you mentioned in Stan directly, if that’s possible for you. Stan is easy and quick to learn.

Thanks for your suggestion. I am learning Stan. It is not easy for me :), but I really like it.

I just found a package, forecastML, that works on modeling multiple time series. It seems it can address the issue I mentioned, but I have to check.
https://cran.r-project.org/web/packages/forecastML/vignettes/grouped_forecast.html

Thanks for your time.

That’s interesting. I’m new here and, TBH, I don’t understand what this is for. Can it help someone sell or buy real estate?