"ServerStan" implementation language poll

Bob, I’ll send you some direct messages so we can keep this on topic.

HEY wait, it looks like direct messages have been turned off on this Discourse? I can’t find a way to send one.

Weird, direct messages are definitely enabled. (Just sent you one to test.)

lol so it sounds like you need python which doesn’t exactly fulfill the criteria we are looking for. Is py2exe one of those things that looks really nice at face value but is actually bad irl?

If we take one step back and look at the situation, requiring a python3
interpreter doesn’t seem much worse than requiring that make be
available (for compiling model-specific binaries). (Or does
CmdStan{Py,R} ship a copy of make?)

If getting rid of the python3 interpreter was a goal, however, I’m
persuaded it could be accomplished. The executable-maker ecosystem is
reasonably mature. I think PyInstaller is the current leader.

I still think the discussion of language is a red herring. If the
model-compilation story is bad, nobody is going to use the thing.

I agree with you that we want to get the process of using this thing to be completely seamless from a user’s perspective. It’s not obvious to me that Python’s build tools are easier to distribute and use than make; I’m on the other end of the spectrum where I’ve mostly used make and have little recent experience with the Python story there. In the absence of other information I’m happy to adopt a slightly watered-down version of your preference there.

A side note - I think we can solve @avehtari’s original complaint about CmdStanR lacking log_prob+gradient access with something like a “server mode” for CmdStan itself - @rok_cesnovar, do you have something about that written up somewhere? The basic idea is that there would be a new command-line option to a CmdStan-built model that reads the data and starts up a simple server listening for RPCs like sample and log_prob.

I think in my head the value-add of something like ServerStan on top of that would be the goals of HTTPStan (code modularity, basically?) along with completely encapsulating the toolchain required to build and run a Stan model end to end. So I think @ariddell and others here are right that the language doesn’t matter very much; there won’t be much code in it. What will matter is the install story - if it requires someone to download and install other packages manually then I think we’ve lost.

One reason the HTTPStan architecture isn’t attractive to me is that it still links the Stan model into the running Python interpreter. I’m not an expert on that process but I think that means that the user needs to correctly install and configure the exact(ish?) C/C++ compiler that was used to compile their current Python interpreter. Given that many users have multiple python interpreters and C compilers, this will never be easy. If @ariddell or others have a more precise statement of how easy or difficult that problem actually is I’d love to read it, maybe it’s not as bad as it sounds, but in my experience with Stan workshops it’s been a real pain in the ass on Windows at least. We’ve had Python users who don’t know R switch to R during the workshop for that reason alone.

Not true - I think it only needs a compatible compiler, not the exact one (and I’m not sure even that is required).

Currently, only the conda workflow is “supported” on Windows (https://pystan.readthedocs.io/en/latest/windows.html), but yes, I agree - for new conda users this might feel very confusing.

This depends how we “compile” the “binary”.

https://www.pyinstaller.org/
PyInstaller (also py2exe etc., I think) packs the whole Python interpreter + code + support libraries into one compressed file/folder. With the onefile option a single .exe file is created, but it is not a compiled exe: when executed, it first unpacks the whole bundle into a temporary folder and then runs the wanted Python script.
The minimal file size is around 10 MB, and with some numerical libraries added it grows to 50+ MB (and this is without any Stan files). Some antivirus programs might block that file.
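For concreteness, the basic workflow is just two commands. This is a sketch rather than a tested recipe: the entry-point name serverstan.py is hypothetical, though --onefile is a real PyInstaller flag.

```shell
# Install PyInstaller, then bundle interpreter + script + libraries into one file.
pip install pyinstaller
pyinstaller --onefile serverstan.py   # serverstan.py is a hypothetical entry point

# The result lands in dist/ (dist/serverstan, or dist/serverstan.exe on Windows):
# a self-extracting archive that unpacks to a temp dir on each launch.
```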

http://nuitka.net/pages/overview.html
There is also Nuitka, which does some magic to translate Python code to C; the result can then be compiled and packed into an executable.

Given that Stanc3 is currently developed in OCaml, I think it could be a good option for the ServerStan language.


@seantalts Yeah, I do have some ideas & drafts for what I think would be a nice gradual approach to “server stan”. Will post it later today - I just need to go over it again and add some details to clear up some of the misunderstandings. I haven’t posted it anywhere yet, because I already have too many open proposals that haven’t been closed, and I don’t want it to seem like I’m trying to be a smart ass everywhere and then not deliver on my promises :)

EDIT: didn’t have time to finish this yesterday, hopefully over the weekend.


Hey now, CSV I/O has been a bottleneck and has been updated over time by various people!


Ack, that comment was about our R-dump format and where I found it to be a bottleneck.

I’ve personally never found the CSV format in CmdStan to be a bottleneck. Where are you finding it to be?

make gets used for CmdStan. The specific things done by make for CmdStan could probably be replaced by something else.

My main concern is that we not tie ourselves down to binary interoperability with Python or R, which is where we’ve run into problems before. I want to move to C++17 and use all the Eigen and templating I want on the C++ side.

Mac OS X machines ship with Python 2.7.16.

@mitzimorris should know how hard it is to install Python 3.

I’d check the available OCaml networking and threading tools and that the people on the project who know OCaml have some interest in building web servers.


Some examples:

There’s more; nothing you can’t work around (see linked threads), but I wanted to be pedantic about the past issues because they have, at times, been a time sink for various people (me included)!

answering question:

does CmdStan{Py,R} ship a copy of make

thanks to @ahartikainen, CmdStanPy has a function install_cxx_toolchain, which has been implemented for Windows users.

as to how hard installing Python 3 is - as annoying as any other software install, and confusing because maybe it’s a conda thing, maybe not.


Thanks! That first case sounds like there’s room for improvement in our reader, but even 50s is a long time to read 2GB of data. The second case is presumably an R CSV reader issue (which @bgoodri said was very slow to deal with comment lines).

I think the cost to convert from high-precision ASCII is something like a factor of 20 or more over binary, so there’s definitely a lot of room for improvement. I can read 2GB of binary on my machine in under 1s.
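To see the gap on your own machine, here is a small stdlib-only Python sketch (illustrative, not a benchmark of CmdStan itself) that stores the same doubles two ways and reads both back; wrap the two read paths in timeit to measure the actual factor:

```python
import struct

# Illustrative data: the same doubles stored two ways.
values = [0.1 * i for i in range(100_000)]

# High-precision ASCII, one value per line, as a text reader would see it.
ascii_blob = "\n".join(repr(v) for v in values)

# Raw little-endian binary, 8 bytes per double.
binary_blob = struct.pack(f"<{len(values)}d", *values)

# Reading back: ASCII needs a float parse per value...
ascii_read = [float(line) for line in ascii_blob.split("\n")]

# ...while binary is a single bulk unpack.
binary_read = list(struct.unpack(f"<{len(values)}d", binary_blob))

assert ascii_read == binary_read  # same data, very different parse cost
print(len(binary_blob))  # 800000 bytes: exactly 8 bytes per double
```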

Networked file systems tend to be even slower than local spinny drives, so that’s another area for concern.

I didn’t take that to be pedantic at all. I was really asking about where the bottleneck was because I have never experienced it.

The CSV itself is not the problem, and neither are the comments per se. The problem for all fast R CSV reader packages (and I believe Python packages as well - Mitzi or Ari would know more) is the “# Adaptation terminated”, inverse mass matrix, and step size comments that are reported between the header and the sampling values. Those are difficult for fast readers. The comments before the header (the metadata) and the comments after the samples (timings) are not a problem at all.
So this:

lp__,accept_stat__,stepsize__,treedepth__,n_leapfrog__,divergent__,energy__,theta
# Adaptation terminated
# Step size = 0.822884
# Diagonal elements of inverse mass matrix:
# 0.417943
-7.1053,0.953597,0.822884,2,3,0,7.96837,0.364082
-7.27881,0.989561,0.822884,3,7,0,7.30662,0.390752
-7.11597,1,0.822884,1,1,0,7.27771,0.365875

If this info were presented in some other form, the output would be universally readable.

rstan uses utils::read.csv, which has no problem parsing this but is really slow. In cmdstanr we use a package called vroom, which is a lot faster (see Lightweight interfaces - keeping it light - #9 by rok_cesnovar) but not so lightweight a package. vroom is able to parse this format, but requires a lot of info about the incoming CSV to read it correctly (ballpark number of lines, initial number of commented lines, etc.). The fastest CSV-reading R package, data.table, which is also lightweight, cannot read this format.
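To make the problem concrete, here is a stdlib-only Python sketch (using the sample rows quoted above) of the line-by-line filtering that this format forces on a reader; it works, but it is exactly the per-line scan that fast columnar readers try to avoid:

```python
import csv

# The CmdStan output quoted above: comments land between the header and the draws.
raw = """lp__,accept_stat__,stepsize__,treedepth__,n_leapfrog__,divergent__,energy__,theta
# Adaptation terminated
# Step size = 0.822884
# Diagonal elements of inverse mass matrix:
# 0.417943
-7.1053,0.953597,0.822884,2,3,0,7.96837,0.364082
-7.27881,0.989561,0.822884,3,7,0,7.30662,0.390752
-7.11597,1,0.822884,1,1,0,7.27771,0.365875
"""

# Line by line: keep data rows, drop comment lines wherever they appear.
lines = [ln for ln in raw.splitlines() if not ln.startswith("#")]
rows = list(csv.reader(lines))
header, draws = rows[0], rows[1:]
print(len(header), len(draws))  # 8 3
```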

The first thread has links to a huge range of solutions by various people and, yeah, there’s definitely room for improvement!

Are the mass matrix elements not embedded as comments? That was the original intent, as bad an idea as that was.

I think everyone agrees that mass matrix and step size should be taken out of this format. Until then, we could use a fast reader for the draws and a separate reader to just fish out the step size and inverse mass matrix.

the problem is that they’re embedded as comments in the middle of the goddamn data rows - you have the CSV header row, then, if save_warmup is True, the warmup draws, then comments, then the sampling draws. this imposes a line-by-line processing strategy.

even if the warmup draws aren’t saved, it seems that some readers don’t like comments anywhere except at the beginning of the file - @rok_cesnovar can correct me here


Yes, comments before or after the data are fine for any reader we tried. The ones between the header and the data, or between data rows, cause issues for almost all fast readers (at least in the R ecosystem).

On the Python side, pandas can skip comments even between the samples.

But collecting the samples means that the file needs to be iterated through a second time.

E.g. here is the latest implementation in ArviZ.

The C++ implementation of a reader I wrote in the thread above just collects everything in one pass and was basically as fast as the fastest other lib-based solutions. It’s hard to beat plowing through the file in a single pass. That code could have been shared across Python/R/etc., and we wouldn’t have inconsistencies across interfaces. It could be rewritten to be pretty easily maintainable, since the only libraries it uses are standard ones and the only touchy parts were the iostream bits. I’m a big fan of having a single implementation for simple things.
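The one-pass idea is easy to sketch in Python (this is an illustration of the approach, not the C++ code from the thread; the comment-matching strings follow the CmdStan output quoted earlier): a single scan that routes comment lines to a metadata parser and data lines to the draws, so nothing is read twice.

```python
import csv
import io

def read_stan_csv(fileobj):
    """One pass over a CmdStan-style CSV: split comment metadata from draw rows."""
    header, draws, step_size, inv_mass = None, [], None, []
    in_mass_block = False
    for line in fileobj:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            body = line.lstrip("#").strip()
            if body.startswith("Step size"):
                step_size = float(body.split("=")[1])
                in_mass_block = False
            elif "inverse mass matrix" in body:
                in_mass_block = True  # numeric comment lines follow
            elif in_mass_block:
                try:
                    nums = [float(x) for x in body.split(",")]
                except ValueError:
                    in_mass_block = False  # non-numeric comment ends the block
                else:
                    inv_mass.extend(nums)
            continue
        fields = next(csv.reader([line]))
        if header is None:
            header = fields
        else:
            draws.append([float(x) for x in fields])
    return header, draws, step_size, inv_mass

sample = """lp__,accept_stat__,stepsize__,theta
# Adaptation terminated
# Step size = 0.822884
# Diagonal elements of inverse mass matrix:
# 0.417943
-7.1053,0.953597,0.822884,0.364082
-7.27881,0.989561,0.822884,0.390752
"""
header, draws, step, mass = read_stan_csv(io.StringIO(sample))
print(step, mass, len(draws))  # 0.822884 [0.417943] 2
```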