Lunch tomorrow/chat + white paper about scaling models

James_Savage · October 5, 2016, 3:20pm

Lendable, the firm I work for, is currently scoping some work that will involve fitting models to much larger datasets than we currently work with. As part of this, I’d like to put together a public white paper listing best practices and model design ideas to take into account when building production-scale models for transaction-level data.

If anyone is keen, a couple of us are going to head to lunch tomorrow after the dev meeting to discuss these things. I’ll take notes and post them here afterwards. Anyone else interested should feel free to join.

Jim

syclik · October 5, 2016, 3:40pm

Thanks. I’ll be there.

Do you have any sort of agenda? Or can you think about a use case?

Daniel

ericnovik · October 5, 2016, 5:04pm

Yes, this is of interest to me as well.

Eric

Avraham · October 5, 2016, 5:55pm

I would have been really keen to attend, but October is a pretty full month for me and for the days I can be at work, I can’t afford to take time off.

Will you update us with your initial thoughts/findings?

James_Savage · October 6, 2016, 2:55pm

Sure thing. I wanted to cover:

Which functions scale poorly and which scale well? For instance, which likelihoods use analytical derivatives? This is probably information available elsewhere.
Modeling techniques to avoid when modeling large datasets. Observation-level transformed parameters, for example, eat a lot of memory.
Encouraged techniques. Scale-free parameters, informative priors, non-centered parameters.
When does VB “work”? Guidelines on model building if you want to be estimating with VB.
GMO timeline?
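
To make the "non-centered parameters" item concrete, here is a minimal Python sketch, assuming a single normal hierarchical term. The function names and the values of mu and tau are hypothetical and chosen only for illustration. In Stan the same idea would be written inside the model itself. The point is that sampling a standard normal z and computing mu + tau * z gives the same distribution as sampling Normal(mu, tau) directly, while the sampled quantity no longer depends on mu or tau.

```python
import random
import statistics


def draw_centered(mu, tau, n, rng):
    # Centered form: sample theta directly from Normal(mu, tau).
    return [rng.gauss(mu, tau) for _ in range(n)]


def draw_non_centered(mu, tau, n, rng):
    # Non-centered form: sample z ~ Normal(0, 1), then shift and scale.
    # theta = mu + tau * z is still Normal(mu, tau), but the quantity
    # being sampled (z) is independent of mu and tau.
    return [mu + tau * rng.gauss(0.0, 1.0) for _ in range(n)]


# Hypothetical values, for illustration only.
mu, tau, n = 1.5, 2.0, 20000
rng = random.Random(0)

centered = draw_centered(mu, tau, n, rng)
non_centered = draw_non_centered(mu, tau, n, rng)

# Both sets of draws should have mean close to mu and sd close to tau.
print(statistics.mean(centered), statistics.stdev(centered))
print(statistics.mean(non_centered), statistics.stdev(non_centered))
```

The two sets of draws agree in distribution. The benefit of the non-centered form shows up in the sampler's geometry when tau is itself a parameter, which this sketch does not attempt to demonstrate.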

Happy to include any other points that you might consider useful.

Jim

dustin · October 6, 2016, 5:56pm

Wish I could’ve joined! Let me know how it goes. Also happy to chat another time.

re:when VB works. I think the short summary is that it’s open research, and it depends on the specific VI method, in the same way that the set of problems Monte Carlo methods work on depends on the specific method.

re:GMO timeline. Whenever I get to finishing the experiments and Andrew and Aki greenlight putting the paper on arxiv. (Hard to put a specific time.)

Dustin
