Starting next week, I will be helping to set up a new internal consulting analytics group, and I would like to get good practices established from day one (previous experience tells me that if you don’t do this on day one, it never happens).
I would therefore be interested to know whether anyone has pointers to detailed discussions of best practice for this sort of thing.
In particular, I’m interested in the extent to which general software engineering best practices (e.g. model management, regression testing, continuous integration) apply to analytics work, or can be modified to apply.
Yes, this is good (as are the other workflow papers produced by the Stan group), but it covers workflow from an analytic point of view. What I’m currently searching for is anything on workflow from a software development point of view (if you like, what is the optimal way to use Git in an analytics project). There does not seem to be much of that around.
If we are talking about basic Git strategies, it really depends on the structure of the group and the pace of the project. Personally, I like the approach we use in Stan Math and other stan-dev repositories (some more than others): every change, regardless of how small, goes through a PR, and someone needs to review and understand it. That makes a project much less dependent on a single person. The process is time-consuming, and it occasionally takes a while to find a reviewer, but it is definitely worth it. Stan Math is a software project, but I don’t see a big difference in Git strategy for an analytics project.
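To make the regression-testing idea from the original question a bit more concrete, here is a minimal sketch of the kind of check a CI job could run on every PR in an analytics repository. Everything in it is an assumption made for illustration: `fit_line` stands in for whatever analysis step the project actually has, and the fixture data and baseline values are made up. It is not how Stan Math tests anything, just one possible shape for such a test.

```python
import math


def fit_line(xs, ys):
    """Hypothetical analysis step: ordinary least-squares fit of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return {"slope": slope, "intercept": intercept}


# Small fixed input committed alongside the code (made-up values).
FIXTURE_X = [0.0, 1.0, 2.0, 3.0, 4.0]
FIXTURE_Y = [1.0, 3.0, 5.0, 7.0, 9.0]

# Baseline results that a reviewer has previously checked and approved.
BASELINE = {"slope": 2.0, "intercept": 1.0}


def check_against_baseline(result, baseline, rel_tol=1e-8):
    """Return the names of quantities that differ from the baseline."""
    return [
        name
        for name, expected in baseline.items()
        if not math.isclose(result[name], expected, rel_tol=rel_tol, abs_tol=1e-12)
    ]


def test_fit_matches_baseline():
    """Fails if a code change silently alters the analysis output."""
    drifted = check_against_baseline(fit_line(FIXTURE_X, FIXTURE_Y), BASELINE)
    assert drifted == [], f"results drifted from baseline: {drifted}"
```

A test runner such as pytest would pick up the `test_` function if this lives in a file named like `test_*.py`. The point of the design is that a deliberate change to the results has to show up in the PR as an edit to `BASELINE`, so the reviewer sees it and has to agree with it.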