Lunch tomorrow/chat + white paper about scaling models

Lendable, the firm I work for, is currently scoping some work that will involve fitting models to much larger datasets than we currently handle. As part of this, I’d like to put together a public white paper listing best practices and model design ideas to take into account when building production-scale models for transaction-level data.

If anyone is keen, a couple of us are going to head to lunch tomorrow after the dev meeting to discuss these things. I’ll take notes and post them here afterwards. Anyone else interested should feel free to join.

Jim

Thanks. I’ll be there.

Do you have any sort of agenda? Or can you think about a use case?

Daniel

Yes, this is of interest to me as well.

Eric

I would have been really keen to attend, but October is a pretty full month for me and for the days I can be at work, I can’t afford to take time off.

Will you update us with your initial thoughts/findings?

Sure thing. I wanted to cover:

  • Which functions scale poorly and which scale well? For instance, which likelihoods use analytical derivatives. This information is probably available elsewhere.
  • Modeling techniques to avoid when modeling large datasets. Observation-level transformed parameters, for example, eat a lot of memory.
  • Encouraged techniques. Scale-free parameters, informative priors, non-centered parameters.
  • When does VB “work”? Guidelines on model building if you want to estimate with VB.
  • GMO timeline?
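
To make the "non-centered parameters" point concrete, here is a minimal sketch in plain NumPy (not Stan code). The function name `non_centered_draws` and the values of `mu` and `tau` are made up for illustration. The idea is that instead of sampling a group effect directly from Normal(mu, tau), you sample a scale-free raw variable z from Normal(0, 1) and build the effect as mu + tau * z, which has the same distribution.

```python
import numpy as np


def non_centered_draws(mu, tau, n_groups, rng):
    """Build group effects as mu + tau * z, with z ~ Normal(0, 1).

    Hypothetical helper for illustration only: z plays the role of the
    scale-free "raw" parameter that a sampler would explore.
    """
    z = rng.standard_normal(n_groups)  # unit-scale raw parameters
    return mu + tau * z                # shift and scale afterwards


rng = np.random.default_rng(0)
theta = non_centered_draws(mu=2.0, tau=0.5, n_groups=100_000, rng=rng)

# The sample mean and standard deviation should be close to mu and tau.
print(theta.mean(), theta.std())
```

This only shows that the two forms give the same distribution. Whether the non-centered form actually samples better depends on the model and on how informative the data are for each group.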

Happy to include any other points that you might consider useful.

Jim

Wish I could’ve joined! Let me know how it goes. Also happy to chat another time.

re: when VB works. I think the short summary is that it’s open research, and it depends on the specific VI method, in the same way that the set of problems Monte Carlo methods work on depends on the specific method.

re: GMO timeline. Whenever I finish the experiments and Andrew and Aki greenlight putting the paper on arXiv. (Hard to give a specific date.)

Dustin