Translating stan model to C++ code (how to inline "everything") - and related efforts using em++ to get Stan running in the browser

Hi,

I just did this :

and got something like this :

Is it possible to create a “FAT” bernoulli.hpp which is completely self contained ? Let’s call it bernoulliFAT.cpp.

Something like : gcc bernoulliFAT.cpp would produce the same binary output as this compiling / linking command :


?

So, there are two commands that lead to the bernoulli executable:

Compiling

g++ \
  -std=c++1y \
  -pthread \
  -Wno-sign-compare \
  -O3 \
  -I src \
  -I stan/src \
  -I stan/lib/stan_math/ \
  -I stan/lib/stan_math/lib/eigen_3.3.3 \
  -I stan/lib/stan_math/lib/boost_1.69.0 \
  -I stan/lib/stan_math/lib/sundials_4.1.0/include \
  -DBOOST_RESULT_OF_USE_TR1 \
  -DBOOST_NO_DECLTYPE \
  -DBOOST_DISABLE_ASSERTS \
  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION \
  -c \
  -x c++ \
  -o examples/bernoulli/bernoulli.o \
  examples/bernoulli/bernoulli.hpp

Linking :

g++ \
  -std=c++1y \
  -pthread -Wno-sign-compare \
  -O3 \
  -I src \
  -I stan/src \
  -I stan/lib/stan_math/ \
  -I stan/lib/stan_math/lib/eigen_3.3.3 \
  -I stan/lib/stan_math/lib/boost_1.69.0 \
  -I stan/lib/stan_math/lib/sundials_4.1.0/include \
  -DBOOST_RESULT_OF_USE_TR1 \
  -DBOOST_NO_DECLTYPE \
  -DBOOST_DISABLE_ASSERTS \
  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION \
  examples/bernoulli/bernoulli.o \
  src/cmdstan/main.o \
  stan/lib/stan_math/lib/sundials_4.1.0/lib/libsundials_nvecserial.a \
  stan/lib/stan_math/lib/sundials_4.1.0/lib/libsundials_cvodes.a \
  stan/lib/stan_math/lib/sundials_4.1.0/lib/libsundials_idas.a \
  -o examples/bernoulli/bernoulli

So, instead of the two commands above, I would like to produce a BIG BIG BIG C++ file (called bernoulliFAT.cpp) that contains EVERYTHING as C++ source code ONLY and which can be compiled and linked by a single command:

g++ bernoulliFAT.cpp -o bernoulli_from_fat

and then bernoulli_from_fat will be the same binary as examples/bernoulli/bernoulli

and will output this when executed by itself :

and if I type examples/bernoulli/bernoulli_from_fat sample data file=examples/bernoulli/bernoulli.data.R

then it outputs something like :

Is there an easy way to create such an inlined file ?

I was reading this : https://www.toptal.com/c-plus-plus/c-plus-plus-understanding-compilation

but … it’s not simple…

also, it seems that I can make a .js (or LLVM) runnable stan library (AFAIU) :

if I set this into the local file :

it even gives me a stanc file !

so, i have a stanc compiled by em++

and one compiled by g++ :

and, this also works :

but I just have no idea what to do with that bin/stanc

now, I am trying this :

which gave me this :

but somehow the resulting .js file cannot be run with node

it errors out…

anyway … i might come back to this at some point … still does not work :(

if you want to play around with this, here is the docker image that you can pull :

Cheers,

Jozsef

In general, taking a subset of the generated C++ code, putting it into a separate text file, and trying to compile it with a C++ compiler without the proper headers and includes won’t work.

If you really need to use a distribution outside of Stan, you might consider writing your own sampler in C++. Just copying and pasting a subset of a generated C++ file won’t compile, because it has so many dependencies in the Stan math library.
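To make the dependency problem concrete, here is a minimal sketch of what "inlining everything" amounts to at the source level: recursively replacing each #include line with the contents of the file it names. The file names and contents are made-up stand-ins held in memory, not real Stan headers, and the function name inlineIncludes is hypothetical. The compiler's own preprocessor (g++ -E) does the real version of this.

```javascript
// Toy "amalgamation": expand #include lines recursively from an in-memory file map.
// All file names and contents here are hypothetical stand-ins, not real Stan headers.
const files = {
  "model.hpp": '#include "math.hpp"\n#include <vector>\nint model() { return add(1, 2); }',
  "math.hpp": '#include "core.hpp"\nint add(int a, int b) { return a + b + zero(); }',
  "core.hpp": "int zero() { return 0; }",
};

// Matches both #include "x" and #include <x>.
const INCLUDE_RE = /^\s*#\s*include\s*[<"]([^>"]+)[>"]\s*$/;

// Returns the source of `name` with every include found in fileMap replaced by
// that file's content. Headers not in the map (e.g. <vector>) are left in place.
function inlineIncludes(name, fileMap, seen = new Set()) {
  if (seen.has(name)) return ""; // crude include guard: emit each file only once
  seen.add(name);
  return fileMap[name]
    .split("\n")
    .map((line) => {
      const m = INCLUDE_RE.exec(line);
      if (m && Object.prototype.hasOwnProperty.call(fileMap, m[1])) {
        return inlineIncludes(m[1], fileMap, seen);
      }
      return line;
    })
    .join("\n");
}

const fat = inlineIncludes("model.hpp", files);
console.log(fat);
```

Even with every header expanded this way, the link step in the first post still names src/cmdstan/main.o and three SUNDIALS .a archives. Those are compiled artifacts, so a single-file build would also need their sources.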

The Stan devs are here mostly to answer questions regarding use of the Stan modeling language, Stan development, modeling, etc.

This is more of a general C++ software dev question that could be answered by searching around elsewhere.

I agree, I had that feeling myself :).

However, the preliminary results are pretty encouraging.

I think I need to ask this on the em++ forum, or something similar.


idk anything about emcc, but their guide describes it as a drop-in replacement for gcc/clang. Is the guide wrong?

https://emscripten.org/docs/tools_reference/emcc.html#emccdoc

if you wanna hack it, pull this docker :

docker pull jhegedus42/welcome_the_lion_on_your_path

here is how to use the image

https://github.com/jhegedus42/wellcome_the_lion_on_your_path

it’s quite a mess…

Is this project going anywhere? There haven’t been any updates in the first repo for two months and now you’ve made a new repo that contains…what, a Docker tutorial?

Back when you made the first thread I was interested enough to try my hand at this. I made a bit of progress but eventually gave up for some dumb reason. Not really interested in continuing but I did finally get around to cleaning up my code and put it on GitHub. While you can fit models either on the server or in the browser I don’t think it’s possible to compile them in the browser.


Nice ! Interesting. Yeah, that’s a good question. Maybe it is not possible to compile them in the browser.

I put all my code etc… into that docker… so if somebody wants to try it … then it should work out of the box, on any machine… the new repo only talks about how to use the docker image.

Docker is nice because it is machine / environment independent. Whatever I do can be reproduced out of the box. It’s basically 2 lines on any machine with docker installed, and then you have the same environment I have, no need to install anything … so the github repo only contains the scripts I use to start/stop the docker image… it’s work in progress… not perfect, but at least if somebody wants to give it a try, he/she does not have to install 100s of dependencies… compilers and whatnot… he/she will have exactly the same setup/version of everything that I have… so there is no such thing as “ohhh but it works on my machine”.

Docker takes out the pain of setting up the environment.

I mean, just to set up my environment on osx took me a few hours.

Now with this docker image, on any machine , where there is docker, it takes 2 commands.

Hence the docker…

I was under the impression that this was obvious, so I did not write a novel about it in that repo.

Basically, whatever this new thread has, was made inside that docker image.

If anybody wants to reproduce it then there are two lines needed to start the image.

  1. pull it
  2. start it

I just checked it, this is how you start the docker image :

Then you have everything to reproduce which I posted in this thread.

In other words, to have a fully functional setup for compiling Stan to JS (hopefully), you need to type 5 lines of code:

$ git clone git@github.com:jhegedus42/wellcome_the_lion_on_your_path.git
$ cd wellcome_the_lion_on_your_path
$ cd docker
$ docker pull jhegedus42/welcome_the_lion_on_your_path:0010
$ ./simple_start.sh

After this, you have an environment in which you can reproduce what I posted in this thread, and you don’t have to spend 1000 hours of setting up the right flags and whatnot for the right compilers, the right versions, bla bla bla …

Just type 5 lines of code and you can hack away… on playing around on how to compile Stan to JS.

I got this far…

I might come back to it later, but if anybody wants, now it only takes 5 lines of code to start to play with it… I hope this helps a bit.

I think Stan in the browser has some nice potential. Just imagine a stan.js file that you can include into a .html page and use Stan from any JS code … I think that is very awesome idea.

Hi,

Could you explain how to produce a .js file that I can run with node js, and does a simple linear regression fit ?

I looked into your code but I could not figure that out.

Something like:

 node fit.js

would do a linear regression fit that was compiled from Stan code into JavaScript.

Could you please help me out there ?

I think I just want to figure out how to do that first.

Baby steps.

Cheers,

Jozsef

I added a CmdStan-like interface that doesn’t need the web frontend, see cmdstan.js script. Compiling everything into a single file would also be possible but for now it uses worker threads which are more flexible.
Emscripten didn’t like TBB so make sure you have CmdStan 2.20 and not 2.21.

I see, but could you please show me how to compile the simplest Stan program from scratch that runs on node.js ?

I am not interested in the web server, and all the fancy stuff. I am simply interested in understanding how one can compile Stan to js. Nothing more.

More specifically, run the bernoulli example on Node. Not in the browser, just on node, a single .js file.

How can I do that ?

Apparently you did manage to do that, right ?

I mean, the fitting runs in the browser and not on the server, right ?

Cheers

Jozsef

There’s a precompiled model similar to the bernoulli example in v1/models/e0033a28167e65a1/. I used data modified from the golf case study for testing.

{"N": 19,
"P": 1,
"n": [1443, 694, 455, 353, 272, 256, 240, 217, 200,
  237, 202, 192, 174, 167, 201, 195, 191, 147, 152],
"k": [1346, 577, 337, 208, 149, 136, 111, 69, 67,
  75, 52, 46, 54, 28, 27, 31, 33, 20, 24],
"x": [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]}
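Before feeding data like this to a compiled model, it can help to check that the shapes agree. A minimal sketch follows; the helper name checkData is made up, and the rules it enforces are assumptions not stated in the thread: that N is the number of observations, and that k[i] counts successes out of n[i] tries.

```javascript
// Sanity-check the data before handing it to a compiled model.
// Assumptions (not stated in the thread): N is the number of observations, so n, k
// and x should each have N entries, and k[i] successes cannot exceed n[i] tries.
const golf = {
  N: 19,
  P: 1,
  n: [1443, 694, 455, 353, 272, 256, 240, 217, 200,
      237, 202, 192, 174, 167, 201, 195, 191, 147, 152],
  k: [1346, 577, 337, 208, 149, 136, 111, 69, 67,
      75, 52, 46, 54, 28, 27, 31, 33, 20, 24],
  x: [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
};

// Returns a list of human-readable problems; an empty list means none were found.
function checkData(d) {
  const problems = [];
  for (const key of ["n", "k", "x"]) {
    if (!Array.isArray(d[key]) || d[key].length !== d.N) {
      problems.push(`${key} should be an array with ${d.N} entries`);
    }
  }
  // Only compare k against n once both arrays are known to have the right length.
  if (problems.length === 0) {
    d.k.forEach((ki, i) => {
      if (ki > d.n[i]) problems.push(`k[${i}] = ${ki} exceeds n[${i}] = ${d.n[i]}`);
    });
  }
  return problems;
}

console.log(checkData(golf));
```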

You can run it with

node --experimental-worker cmdstan.js e0033a28167e65a1 golf.json

and it will produce output.csv.
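To look at the result from Node, something like the following could work. It is a sketch that assumes output.csv follows the usual CmdStan CSV layout (comment lines starting with #, then a header row, then one row per draw); I have not checked that against this repo. The sample text, the column name theta, and the helper names are made up for illustration.

```javascript
// Parse CmdStan-style CSV text: lines starting with '#' are comments,
// the first remaining line is the header, and every later line is one draw.
function parseStanCsv(text) {
  const rows = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "" && !line.startsWith("#"));
  const header = rows[0].split(",");
  const draws = rows.slice(1).map((row) => row.split(",").map(Number));
  return { header, draws };
}

// Average of one named column over all draws.
function columnMean(parsed, name) {
  const j = parsed.header.indexOf(name);
  if (j < 0) throw new Error(`no column named ${name}`);
  const sum = parsed.draws.reduce((acc, row) => acc + row[j], 0);
  return sum / parsed.draws.length;
}

// Made-up sample in the assumed layout; the numbers are illustrative, not real output.
const sample = [
  "# made-up header comment",
  "lp__,theta",
  "# made-up comment between header and draws",
  "-7.0,0.25",
  "-6.5,0.50",
  "-7.5,0.75",
  "# made-up trailing comment",
].join("\n");

const parsed = parseStanCsv(sample);
console.log(parsed.header, columnMean(parsed, "theta"));
```

For a real file, the text would come from fs.readFileSync("output.csv", "utf8") instead of the inline sample.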

Compiling a new model requires installing Emscripten and CmdStan. Then you can do

export CMDSTAN_PATH=/path/to/cmdstan
node cmdstan.js model_to_compile.stan

It will store the compiled model in v1/models/... and tell you the name of the new model.

All the web server stuff is in server/routes.js. Everything else in server/ works offline.

server/compiler.js has code that runs Emscripten and stores the compiled model.js and model.wasm files in v1/models/[model-id]/. The compiled model.js doesn’t do anything if invoked directly; it is meant to be instantiated as a worker thread. The thread controller lives in server/fits.js.
However, you could change the Module.onRuntimeInitialized() function in server/worker.js to do something more than just set up the thread communication channel. In that case a newly compiled model.js should be able to run as a standalone script. I believe it only needs to have the model.wasm next to it.


WOW !

This looks really nice.

But how can I get from the CMDSTAN source code to a .js file ?

I mean, how did you create cmdstan.js in the first place ?

You wrote it yourself ?

Ok, so where is the source code from which model.js comes from ?

What do you mean by worker thread ?

I am bit lost here…

The only thing I would like to know how can I compile the CMDSTAN C++ code into .js files.

I looked in your github but have not seen any C++ code there… so I got really confused, where is Stan coming from if it is originally written in C++ ?

Here cmdstan.js just invokes the native cmdstan executable. There’s no point doing the Stan-to-C++ step in Javascript when you can’t do C++ -to-WASM step without the system em++.

Worker threads are a Node API https://nodejs.org/dist/latest-v12.x/docs/api/worker_threads.html
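For anyone unfamiliar with the API, here is a minimal round trip using Node's built-in worker_threads module. It is only a sketch: the worker body is a toy that sums 1..n, and the function name sumInWorker is made up. The idea that a compiled model would sit where the toy loop is reflects how I read the description above, not code taken from the repo. Older Node versions need the --experimental-worker flag seen in the run command earlier in the thread.

```javascript
// Minimal worker-thread round trip with Node's built-in worker_threads module.
const { Worker } = require("worker_threads");

// Source code that runs inside the worker thread (a toy computation).
const workerSource = `
  const { parentPort, workerData } = require("worker_threads");
  let sum = 0;
  for (let i = 1; i <= workerData.n; i++) sum += i;
  parentPort.postMessage(sum); // send the result back to the main thread
`;

// Start a worker, pass it input via workerData, and resolve with its first message.
function sumInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { n } });
    worker.once("message", resolve);
    worker.once("error", reject);
  });
}

sumInWorker(10).then((total) => console.log("worker replied:", total));
```

The main thread stays free while the worker computes, which is why a long-running fit is a natural candidate for a worker.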

model.js is built from worker.js, stanlib.js and worker.cpp, see the run_emcc function in compiler.js.

Hmmm… so what does the native cmdstan output then ?

This is where I am a bit lost ?

Can the cmdstan executable itself not be compiled to 100% pure JS ?

I mean, to get completely rid of anything native, and run everything in JS ?

I was thinking that is the point in compiling C++ to JS ?

For example here, is the WASM part always needed ? https://github.com/jhegedus42/stan2js/tree/master/emscripten_tutorial/hello_world

I cannot even compile a simple C++ hello world file to pure JS ?

Without any WASM magic ?

“you can’t do C++ -to-WASM step without the system em++”

Is it not possible to use some .js compiler that compiles the C++ code inside the browser ?

So you say you cannot compile C++ to JS in the browser ?

What if you compile em++ to JS, and then run that in the browser ?

I have just installed the latest em++ :

and also “compiled” the bernoulli example with cmdstan v.2.20.0…

what do I do now, at the command line, using em++, to get a .js file which just runs in node.js ?

I don’t want to do any magic… I don’t really understand your cmdstan.js … and all that extra magic …

It is dead simple what I want.

I want to generate a bernoulli.js using only em++ and the cmdstan source code, such that if I run node bernoulli.js it will behave exactly the same as the binary produced by stanc + a C++ compiler. Could you please explain how to do this extremely simple and straightforward task ? Without any magical, unnecessary .js and WASM and web workers and threads and whatnot all over the place ?

I want to take baby steps and understand what I am doing.

All this magic is too complicated for me. Could you please simplify your solution to this absolute minimalistic problem ?

I am sure this is just something very simple, OR not ???

I cannot simply replace the compiler in the last step with em++ , here:

--- Translating Stan model to C++ code ---
bin/stanc  --o=examples/bernoulli/bernoulli.hpp examples/bernoulli/bernoulli.stan
Model name=bernoulli_model
Input file=examples/bernoulli/bernoulli.stan
Output file=examples/bernoulli/bernoulli.hpp

--- Compiling, linking C++ code ---
clang++ -std=c++1y -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare      -O3 -I src -I stan/src -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.3 -I stan/lib/stan_math/lib/boost_1.69.0 -I stan/lib/stan_math/lib/sundials_4.1.0/include -I/usr/local/opt/gettext/include    -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION     -c -include-pch stan/src/stan/model/model_header.hpp.gch -x c++ -o examples/bernoulli/bernoulli.o examples/bernoulli/bernoulli.hpp
clang++ -std=c++1y -Wno-unknown-warning-option -Wno-tautological-compare -Wno-sign-compare      -O3 -I src -I stan/src -I stan/lib/stan_math/ -I stan/lib/stan_math/lib/eigen_3.3.3 -I stan/lib/stan_math/lib/boost_1.69.0 -I stan/lib/stan_math/lib/sundials_4.1.0/include -I/usr/local/opt/gettext/include    -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION    -L/usr/local/opt/gettext/lib                src/cmdstan/main.o stan/lib/stan_math/lib/sundials_4.1.0/lib/libsundials_nvecserial.a stan/lib/stan_math/lib/sundials_4.1.0/lib/libsundials_cvodes.a stan/lib/stan_math/lib/sundials_4.1.0/lib/libsundials_idas.a  examples/bernoulli/bernoulli.o -o examples/bernoulli/bernoulli
Jozsefs-MBP:cmdstan joco$ cat examples/bernoulli/bernoulli.hpp

All those libraries are already compiled to byte code, right ?

So there is no way that em++ will be able to create a pure .js file if I just try to compile this last bit, right ?
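One way to see what would have to change is to list the inputs on the link line that are already compiled. A minimal sketch follows, using only file names visible in the link command above; the helper name prebuiltInputs is made up. Native .o and .a files built by clang++ or g++ cannot be consumed by em++, so each one would have to be rebuilt from source with the Emscripten toolchain: bernoulli.o directly from bernoulli.hpp, the others from the CmdStan and SUNDIALS sources.

```javascript
// Link command abbreviated from the log above (include paths and -D flags omitted).
const linkCommand = [
  "clang++ -std=c++1y -O3",
  "src/cmdstan/main.o",
  "stan/lib/stan_math/lib/sundials_4.1.0/lib/libsundials_nvecserial.a",
  "stan/lib/stan_math/lib/sundials_4.1.0/lib/libsundials_cvodes.a",
  "stan/lib/stan_math/lib/sundials_4.1.0/lib/libsundials_idas.a",
  "examples/bernoulli/bernoulli.o",
  "-o examples/bernoulli/bernoulli",
].join(" ");

// Returns the object files (.o) and static archives (.a) that a command takes as input.
function prebuiltInputs(command) {
  const tokens = command.trim().split(/\s+/);
  const inputs = [];
  for (let i = 0; i < tokens.length; i++) {
    if (tokens[i] === "-o") {
      i++; // skip the output file name that follows -o
      continue;
    }
    if (!tokens[i].startsWith("-") && /\.(o|a)$/.test(tokens[i])) {
      inputs.push(tokens[i]);
    }
  }
  return inputs;
}

console.log(prebuiltInputs(linkCommand));
```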

For example, how can I create a single standalone stan.js file that will replace stanc binary here :

--- Translating Stan model to C++ code ---
bin/stanc  --o=examples/bernoulli/bernoulli.hpp examples/bernoulli/bernoulli.stan
Model name=bernoulli_model
Input file=examples/bernoulli/bernoulli.stan
Output file=examples/bernoulli/bernoulli.hpp

Say, I just write, node stanc.js -o=examples/ ....

How can I create that stanc.js ?

100% pure Javascript. No WASM. No magic. Nothing.

Simple C++ to Javascript transpiling.

I really don’t want to overcomplicate things, as it is in your github repo.

I am trying to learn the baby steps first.

Plain old em++ and just the simplest, purest, plain old stanc.js, and/or whatever will produce in the end the bernoulli.js which can be run with node bernoulli.js, and it will do exactly what the native Stan binary would do.

Could you please explain it to me how to do this very simple thing ?

Your magical cmdstan.js things and WASM and whatnot is just too magical for me… and whatever workers and such…

Is it possible to just have a plain old Javascript file that will reproduce the bernoulli example ?

That is really my major, first goal.

I was not able to figure it out yet how to do it.

My idea is to just have one single .js file, which can be included into any browser and it will run some sort of Stan calculation in the browser. As a first step. Then we can see how to feed in some extra data.

Cheers,

Jozsef

SORRY FOR TYPOS !!!