Compiling CmdStan to https://webassembly.org/ - how to make "one large" C++ file for a model, a C++ file that contains "everything needed", including Stan and Math routines

so the basic C++ build goes well on Mac, the bernoulli example ran …

now maybe, what if I just replace c++ with escc … or what was its name :)

some of these ?

hmmm

but maybe i do that after a bit of sleep…

update, it seems to be compiling … something … and not exploding right away :

whatever that ever means …

katsotaan… as they say in Finnish :) in Helsinki… it means “let's see” :)

ok, it blew up …

i'll now try to do a “make build” first … it cannot find stanc :)

i guess this looks good ?

ok I get it, stanc actually should be run already in the browser

so, i need to grab the bernoulli example from the C version of the stan code and then compile it with emcc as well


i need to look into this emrun thing … and maybe enable that compiler flag…
stanc can be compiled but I cannot “run” it … well…

I can, in the browser, it does run and produces a file :

:)

lol

the C version worked out fine :

this is crunching away :

fingers crossed :

ok, it says it has been built :)

now, the next chapter is to find out what that really means …

later, alligator

well, this does not work :

since this wants to run in browsers (yes, compile stan to c++ in the browser :)))

but this works :

so basically, the Stan to C++ code translation should be carried out using the Stan that runs on my laptop, not in my browser … (or if you want to recompile the model in the browser then you use the .js version of stanc but then it should be called as stan.js or something)

I need to figure out how to do this :

with em++

ok… maybe later, maybe tomorrow… but, so far, there is a clean plan forwards… let's see if it is a dead end… now, time to sleep, or something…

it seems that some funny bug is at work :

http://lists.llvm.org/pipermail/llvm-dev/2018-January/120285.html


i guess i give this a rest… and check back later…

if someone is interested, then please ping me and I put up the files as .zip to my gdrive… this is still too experimental and github is not my strength…

i think i try this inside the latest ubuntu docker … and not on my macbook … if it works… then i can easily share the love :)

I guess the data can be hard-coded, but that is only minimally useful for showing that Stan can sample from the implied posterior distribution. It would be much more useful if people could put their models on the web and other people could supply their own data, but no one has figured out how to circumvent the sandboxing measures that prevent doing so.

you mean client server architecture ?

i think i am pretty close to getting the bernoulli example running… in the browser

so, i would not be surprised if em++ also worked on js platforms … so, that means, if ppl give their models, then EVEN ppl who download the app can compile the model from C++ into JS using em++ running on JS, and then sample from that on their own client.

i ran into some compiler bug… i will look into that later, nevertheless, it more or less seems to

BOTTOM LINE :

I have a browser on my mobile phone, disconnected from the net. I have 1000 models and I want to try them out on 1000 different data sets. I have a mobile phone with a large battery. I am in the middle of the forest.

The proposed solution will make it possible for the person running the proposed code in the browser, in JavaScript (or “native” WebAssembly) on the mobile phone, to “train the models” and sample from all 1,000,000 combinations.

With zero models pre-compiled for him/her to begin with.

Sounds too good to be true ? Let’s find out !

Let’s also say that he/she starts to write down those 1000 models on a piece of paper when he/she arrives in the middle of the forest - at the center of the forest there is a big tree, which has big roots and big leaves. He/she sits down, leans back against it, and starts the calculations. :)

If somebody wants to give it a try I put the files here : https://github.com/jhegedus42/stan2js .

Once I figure out how to do it, it will be there…

Ok, so in that repo, the following eye candy works. This is a .cpp compiled to javascript.

This one :

So if this one ^ can be compiled to JS, I am hopeful I can compile Stan too… it’s a bit of an exercise for me to do that, but in the coming months I will be playing around with this “project”. I think it would be cool. I'll keep you guys posted. I think I need to read up a bit on the makefiles in Stan and how they work… then I can figure out how to compile it to JS… give me a few months :).

update :

there will be docker image for working on this project - a good one …

it will be an open,public docker image, pullable, free, MIT, reusable, and it will cure all the problems of the world

Upon further review, Chrome now has a File System API

i mean, the compiler can run in the browser too, right ? i would be surprised if that were not the case… so, ppl put their data into the app, the stan code compiles to js with the new data and generates samples …

i was under the impression that it is self evident that a c to js compiler can compile itself to js … if it is written in c … but this is just a theoretical assumption

In theory, something like that could work. But you need the new file system API to pass data in and get CSV / JSON out.

New file system api compared to what ?

I mean, there is an SPA that runs in the browser, the user uploads the data into the browser’s memory, then the SPA compiles the data + Stan model to a .js file, which then will be executed.

I guess an SPA is allowed to receive data from the filesystem, right ?

I am thinking about this because a few billion computers / mobile phones are running JS out of the box… this could give Stan quite a popularity boost. It would be a pretty nice way to advertise Stan.

For teaching, live examples, that can be run in the browser.

I mean, the real question is : will em++ compiler work in the browser too ?

I don’t see why it would not…

Cheers,

Jozsef

I doubt it; C/C++ compilers do not work in a vacuum. Setting up an environment that has all the required supporting infrastructure is going to be nigh-impossible.
I think the most promising path is first using Stan3 interpreter and then connecting it to WebAssembly JIT.
Yes, neither of those exists, but they’re at least the sort of things that could exist someday.

Hmmm… I think I see the point… this is an interesting problem.
Not trivial at all.

But… still, it does not seem to be so impossible :

this seems to be running in my browser…

or i might be wrong…

anyway… no high priority stuff but it would have been interesting to run Stan in the browser …

also, i am wondering, why is it so that the data itself has to be baked in into the generated executable ?

well, anyway… i am sure there are reasons for this … mostly software engineering reasons i believe … C++ is not the easiest language to implement automatic differentiation in … where the “stuff” lives in a “monad” and can be changed at runtime … ok …

It isn’t. E.g. my wasm-stan discussed in the other thread works the same way CloudStan does; data is passed through file upload API.

OK, so, CmdStan produces C++ code that does not depend on the data itself ?

Is that correct ?

So, no matter what the data is, the generated C++ file is always the same ?
Or is it baked into the generated C++ file?

Is it possible to create a C++ file from Stan code (a model), where the data is read from the file system by the binary which is compiled from the generated C++ code ?

I was under the impression that data is baked into the generated C++ files, and that’s the end of the story, so apparently that is not the case ?
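
To make the "baked in or not" question concrete, here is a minimal C++ sketch. It is not Stan's generated code, and the function name `bernoulli_log_lik` is made up for illustration. The only point is that the compiled function stays the same while the data vector is supplied at run time, which matches the reply above saying that data is passed in rather than compiled in.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper, not part of Stan: log-likelihood of 0/1
// observations y under a Bernoulli(theta) model.
// The data arrive as a run-time argument; nothing about y is compiled in.
double bernoulli_log_lik(const std::vector<int>& y, double theta) {
    assert(theta > 0.0 && theta < 1.0);  // keep both logs finite
    double lp = 0.0;
    for (int yi : y) {
        // log(theta) for a success, log(1 - theta) for a failure
        lp += (yi == 1) ? std::log(theta) : std::log1p(-theta);
    }
    return lp;
}
```

Calling it once with `{0, 1, 0, 0}` and then with a different vector reuses the same compiled code; only the argument changes.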

Hm…, so basically, by web you mean server+client ? Or client only ?

I mean, if the .js file generation takes place on the server, then the data generation can work on the client / browser. This way the server would not be overloaded.

So, the real question is, is it possible to supply data to an already generated .js (describing a model) to sample data from - without the need to contact the server and hard-code the data into the generated .js file.

I think this is your current solution on github, afaik.

So, yes, indeed. Pre compiled Stan models into .js, where ppl can put their own data and see what comes out (without contacting any server), stuff runs entirely in the web browser.

I don’t really know what kind of sandboxing issue you are talking about.

If I had a .js file with a model, could I add extra data to it ? Say, for example using Node.js ?

I mean, I don’t think it is difficult to compile a C++ file to .js using em++ that gets data pushed into it and prints it to the screen.
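
A minimal sketch of that kind of program, assuming the input is plain whitespace-separated 0/1 values. The names `read_ints`, `print_summary` and `summarize_text` are hypothetical, and nothing in it is Stan-specific.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Read whitespace-separated integers from any input stream
// until the stream ends or a token fails to parse.
std::vector<int> read_ints(std::istream& in) {
    std::vector<int> values;
    int v;
    while (in >> v) {
        values.push_back(v);
    }
    return values;
}

// Print the number of observations and how many of them equal 1.
void print_summary(const std::vector<int>& y, std::ostream& out) {
    int ones = 0;
    for (int yi : y) {
        ones += (yi == 1);
    }
    out << "N = " << y.size() << ", ones = " << ones << "\n";
}

// Convenience wrapper for data that arrives as a string,
// for example text handed over from a web page.
std::string summarize_text(const std::string& text) {
    std::istringstream in(text);
    std::ostringstream out;
    print_summary(read_ints(in), out);
    return out.str();
}
```

A `main` that calls `print_summary(read_ints(std::cin), std::cout)` should build with a regular compiler and, assuming Emscripten is installed, with something like `em++ echo.cpp -o echo.js` (the file names are placeholders).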

How is Stan different from this ?

Do you want to upload data into a javascript object ?

From the harddisk ?

Or what is the “sandbox” here ?

ok, it seems that Stan can be fully ported to the browser :

https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/emscripten-discuss/wJGJuJhXkdo/veJbozBJDwAJ

https://binji.github.io/wasm-clang/

I just don’t really see the WASM here :

everything seems to be .js …