Compiling CmdStan to https://webassembly.org/ - how to make "one large" C++ file for a model, a C++ file that contains "everything needed", including Stan and Math routines

Jozsef_Hegedus · August 1, 2019, 4:57am

I am thinking about compiling CmdStan (say the Bernoulli example) to https://webassembly.org/ .

The easiest way would be to generate one “large” C++ file that contains “everything”, which can then be passed to the https://webassembly.org/ compiler. The problem is … I don’t know how to do that.
A huuuuuge chunk of C++ code, one laaarge C++ file, which, when compiled with a cc compiler,
gcc, clang, etc… will produce, say, the bernoulli example. Then that one C++ file could be piped into
the https://webassembly.org/ compiler which produces js code (or something similar) which can be
run on https://webassembly.org/ supported browsers.

Then, Stan models can be run in pretty much any https://webassembly.org/ supported browser.

That would be pretty cool ! (In case it has not been done already :) ).

If that could work then I could throw in StanScala into the mix and use Scala.js to make some
fun demo of Stan, that could be run in the browser only (or on a mobile phone, for that matter).

For educational purposes / cool interactive demos, etc.

I am writing a “toy” web framework, full Scala stack. If Stan could be added to it then
there would be a “demo” of how to make an SPA (in Scala/Scala.js/ScalaStan,
fully type safe => awesome, easy, cool) where someone can do real-time MCMC fitting
on their mobile phone, for any model they can write in ScalaStan.

This would make it possible to use Stan in Cordova applications
(if https://webassembly.org/ is compatible with Cordova, or some other
SPA into phone-app converter framework), i.e. to make some mobile phone apps
where Stan is part of the app, just take the generated .js and put it into Cordova
and then publish it as an iPhone app, or Android app, or whatever platform Cordova supports.

Then you can use Stan to build some fun applications, educational ones, or, say, use the
magnetometer/accelerometer, etc … built into the phone to calculate some sort of sensor
fusion “something” (I saw a demo of such a thing a few months ago: a researcher at
Aalto was using MCMC + Maxwell equations to do some sort of online learning of the magnetic
field / navigation inside some building/office). I think he was using some custom-made MCMC
algorithm. Custom code, etc.

With this https://webassembly.org/ + Cordova approach, one could use Stan to process sensor
data without writing custom code for Android/iOS etc … just use Scala for everything. Full
stack Scala Android/iOS apps that process sensor data gathered by mobile phones (including
camera, gps, accelerometer, magnetometer, you name it).

Might be useful for the community, and it might be just fun for its own sake. A good learning tool.

Bottom line, if someone knows how to make ONE LARGE C++ file, say, for the bernoulli example,
from the CmdStan codebase, then please let me know. Right now the build seems to produce C++
code that uses precompiled dynamic Stan libraries, if I understand the makefile correctly.

But I am not sure. I might be playing around with this idea a bit, but if somebody happens
to know how to make that “one large monolithic” “God” C++ file, that when passed to the https://webassembly.org/ compiler will make the “binary” for a browser, then please let me know.

Cheers,

Jozsef

bgoodri · August 1, 2019, 6:30am

No one has figured out how to pass data to a Stan program compiled via Emscripten / WebAssembly. There is some virtual file system thing, but generally the sandboxing paradigm expects the data to be available at compile time or passed in as command line flags instead of read from an arbitrary file.
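For readers unfamiliar with the "virtual file system thing": as far as I know, Emscripten routes ordinary C++ file I/O through an in-memory file system, and files can be bundled at build time with the `--preload-file` flag of `emcc`. A minimal sketch of the reading side follows, assuming a made-up whitespace-separated text format (N followed by N zeros and ones) rather than CmdStan's real data format. `BernoulliData`, `read_bernoulli` and `read_bernoulli_from_string` are hypothetical names, not part of Stan.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the Bernoulli example's data.
struct BernoulliData {
  int N;
  std::vector<int> y;
};

// Parse "N y_1 ... y_N" from any stream (an std::ifstream works the same way).
// Returns false on malformed input and leaves `out` untouched in that case.
inline bool read_bernoulli(std::istream& in, BernoulliData& out) {
  int n = 0;
  if (!(in >> n) || n < 0) return false;
  std::vector<int> y;
  for (int i = 0; i < n; ++i) {
    int v = 0;
    if (!(in >> v) || (v != 0 && v != 1)) return false;
    y.push_back(v);
  }
  assert(static_cast<int>(y.size()) == n);
  out.N = n;
  out.y = y;
  return true;
}

// Convenience wrapper for data that already sits in memory as text.
inline bool read_bernoulli_from_string(const std::string& text,
                                       BernoulliData& out) {
  std::istringstream in(text);
  return read_bernoulli(in, out);
}
```

Because the parser only depends on `std::istream`, the same code can be fed from a preloaded file or from a string, which keeps the question of where the data comes from separate from the model code.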

syclik · August 1, 2019, 7:39am

Most of the Math library is header-only. The only features you wouldn’t be able to use from the language are ODEs using a stiff solver (integrate_ode_bdf()). You also wouldn’t be able to use GPU or MPI, but I don’t think that’s the bottleneck here.

If you really wanted to try, search for “unity builds” or “unity builds c++.” There are automated ways of doing this… we just haven’t spent the time doing it. If you make any progress, please post here… there may be other devs that could help once it gets going.
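To illustrate what a unity build is: one generated source file `#include`s every other source file, so the compiler sees a single translation unit. The sketch below only generates that file's text. The function name `make_unity_source` and any file names passed to it are hypothetical; the actual list of CmdStan sources would have to come from its makefiles.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Build the text of a single "unity" translation unit that #includes
// every listed source file, in the order given.
inline std::string make_unity_source(const std::vector<std::string>& sources) {
  std::ostringstream out;
  out << "// Generated unity build file: compile this one file only.\n";
  for (const std::string& path : sources) {
    assert(!path.empty());  // an empty path would emit a broken #include
    out << "#include \"" << path << "\"\n";
  }
  return out.str();
}
```

Writing the returned string to, say, `unity.cpp` and compiling only that file is the whole trick. One known caveat of the technique: sources that define identically named file-local symbols can clash once they share a translation unit.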

Jozsef_Hegedus · August 4, 2019, 5:41am

there are a few different ways of doing this…

i will look into this unity thing, i need no ode or gpu for a few “fun” demo calculations.

let me look into that, cheers !

Jozsef_Hegedus · August 4, 2019, 5:45am

The data is baked into the executable, right? after c++ compilation , so I see no problem here.

If the stuff compiles then the data compiles with it too :) . That’s the easy part, afaiu.
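A minimal sketch of what "data baked into the executable" could look like, assuming the values match the Bernoulli example data shipped with CmdStan (N = 10, two successes) and assuming a beta(1, 1) prior on theta. It does no MCMC; it only computes the closed-form posterior mean that a sampler's draws should approximate. `kN`, `kY` and `posterior_mean` are hypothetical names.

```cpp
#include <cassert>

// Data compiled into the binary (assumed values, see the note above).
static const int kN = 10;
static const int kY[kN] = {0, 1, 0, 0, 0, 0, 0, 0, 0, 1};

// Closed-form posterior mean of theta for y ~ bernoulli(theta) with a
// beta(alpha, beta) prior: (alpha + successes) / (alpha + beta + n).
inline double posterior_mean(const int* y, int n, double alpha, double beta) {
  assert(n >= 0);
  int successes = 0;
  for (int i = 0; i < n; ++i) successes += y[i];
  return (alpha + successes) / (alpha + beta + n);
}
```

Calling `posterior_mean(kY, kN, 1.0, 1.0)` gives (1 + 2) / (2 + 10) = 0.25. The downside raised earlier in the thread still applies: changing the data means recompiling.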

Cheers,

J.

but let’s see, if quake runs on my phone then this will too :) - i am optimistic

Jozsef_Hegedus · August 4, 2019, 8:12am

so the basic c++ build goes well, on Mac, bernoulli example ran …

now maybe, what if I just replace c++ with escc … or what was its name :)

some of these ?

hmmm

but maybe i do that after a bit of sleep…

Jozsef_Hegedus · August 4, 2019, 8:34am

update, it seems to be compiling … something … and not exploding right away :

whatever that ever means …

katsotaan (“let’s see”)… as they say in Finnish :) in Helsinki…

ok, it blew up …

i now try do a “make build” first … it cannot find stanc :)

i guess this loooks good ?

Jozsef_Hegedus · August 4, 2019, 9:22am

ok I get it, stanc actually should be run already in the browser

so, i need to grab the bernoulli example from the C version of the stan code and then compile it
with emcc as well

i need to look into this emrun thing … and maybe enable that compiler flag…
stanc can be compiled but I cannot “run” it … well…

I can, in the browser, it does run and produces a file :

:)

lol

the C version worked out fine :

this is crunching away :

fingers crossed :

ok, it says it has been built :)

now, the next chapter is to find out what that really means …

later, alligator

well, this does not work :

since this wants to run in browsers (yes, compile stan to c++ in the browser :)))

but this works :

so basically, the Stan to C++ code translation should be carried out using the Stan that runs on my laptop, not in my browser … (or if you want to recompile the model in the browser then you use the .js version of stanc but then it should be called as stan.js or something)

I need to figure out how to do this :

with em++

ok… maybe later, maybe tomorrow… but so far there is a clean plan forward… let’s see if it is a dead end… now, time to sleep, or something…

it seems that some funny bug is at work :

http://lists.llvm.org/pipermail/llvm-dev/2018-January/120285.html

i guess i give this a rest… and check back later…

if someone is interested, then please ping me and I put up the files as .zip to my gdrive… this is still too experimental and github is not my strength…

i think i try this inside the latest ubuntu docker … and not on my macbook … if it works… then i can easily share the love :)

bgoodri · August 4, 2019, 4:40pm

I guess the data can be hard-coded, but that is only minimally useful for showing that Stan can sample from the implied posterior distribution. It would be much more useful if people could put their models on the web and other people could supply their own data, but no one has figured out how to circumvent the sandboxing measures that prevent doing so.

Jozsef_Hegedus · August 4, 2019, 10:59pm

you mean client server architecture ?

i think i am pretty close to getting the bernoulli example running… in the browser

so, i would not be surprised if em++ also worked on js platforms … that means, if ppl supply their models, then EVEN ppl who download the app can compile the model from C++ into JS using em++ running on JS, and then sample from
that on their own client. i ran into some compiler bug… i will look into that later, nevertheless, it more or less seems to …

BOTTOM LINE :

I have a browser on my mobile phone, disconnected from the net. I have 1000 models and I want to try them out on 1000 different data sets. I have a mobile phone with a large battery. I am in the middle of the forest.

The proposed solution will make it possible for the person running the proposed code in the browser, on javascript (or “native” webassembly) on the mobile phone, to “train the models” and sample from all the 1,000,000 combinations.

Without a single model pre-compiled for him/her to begin with.

Sounds too good to be true ? Let’s find out !

Let’s also say that he starts to write down those 1000 models onto a piece of paper when he/she arrives into the middle of the forest - at the center of the forest there is a big tree, which has big roots and big leaves. He sits down, leans back, against it, and starts the calculations. :)

Jozsef_Hegedus · August 15, 2019, 11:21pm

If somebody wants to give it a try I put the files here : https://github.com/jhegedus42/stan2js .

Once I figure out how to do it, it will be there…

Jozsef_Hegedus · August 16, 2019, 9:34pm

Ok, so in that repo, the following eye candy works. This is a .cpp compiled to javascript.

This one :

https://github.com/jhegedus42/stan2js/blob/a223ce27f37ac66e30462569af1c80796929355b/emscripten_tutorial/sdl/sdl.cpp

// Copyright 2011 The Emscripten Authors.  All rights reserved.
// Emscripten is available under two separate licenses, the MIT license and the
// University of Illinois/NCSA Open Source License.  Both these licenses can be
// found in the LICENSE file.

#include <stdio.h>
#include <SDL/SDL.h>

#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#endif

extern "C" int main(int argc, char** argv) {
  printf("hello, world!\n");

  SDL_Init(SDL_INIT_VIDEO);
  SDL_Surface *screen = SDL_SetVideoMode(256, 256, 32, SDL_SWSURFACE);

#ifdef TEST_SDL_LOCK_OPTS
  EM_ASM("SDL.defaults.copyOnLock = false; SDL.defaults.discardOnLock = true; SDL.defaults.opaqueFrontBuffer = false;");

This file has been truncated.

So if this one ^ can be compiled to JS, I am hopeful I can compile Stan too… it’s a bit of an exercise for me to do that, but in the coming months I will be playing around with this “project”. I think it would be cool. I will keep you guys posted. I think I need to read the makefiles in Stan a bit and see how the build works… then I can figure out how to compile it to JS… give me a few months :).

Jozsef_Hegedus · October 1, 2019, 10:30am

update :

there will be docker image for working on this project - a good one …

Jozsef_Hegedus · October 1, 2019, 10:33am

it will be an open,public docker image, pullable, free, MIT, reusable, and it will cure all the problems of the world

bgoodri · November 3, 2019, 7:24pm

Upon further review, Chrome now has a File System API

Jozsef_Hegedus · November 3, 2019, 11:37pm

i mean, the compiler can run in the browser too, right ? i would be surprised if that were not the case… so, ppl put their data into the app, the stan code compiles to js with the new data and generates samples …

i was under the impression that it is self evident that a c to js compiler can compile itself to js … if it is written in c … but this is just a theoretical assumption

bgoodri · November 3, 2019, 11:49pm

In theory, something like that could work. But you need the new file system API to pass data in and get CSV / JSON out.

Jozsef_Hegedus · November 4, 2019, 10:44am

New file system api compared to what ?

I mean, there is an SPA that runs in the browser, the user uploads the data into the browser’s memory, then the SPA compiles the data + Stan model to a .js file, which then will be executed.

I guess an SPA is allowed to receive data from the filesystem, right ?
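An alternative to recompiling for each data set would be to hand the uploaded data to an already compiled module at run time. A minimal sketch under that assumption: a C-linkage function that JavaScript could call with a pointer into the module's memory. `EMSCRIPTEN_KEEPALIVE` is Emscripten's macro for keeping a function exported; `count_successes` and `STAN_DEMO_EXPORT` are hypothetical names, and nothing here is an existing Stan interface.

```cpp
#include <cassert>

#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#define STAN_DEMO_EXPORT EMSCRIPTEN_KEEPALIVE
#else
#define STAN_DEMO_EXPORT  // native build: no export attribute needed
#endif

extern "C" {

// Hypothetical entry point: the page copies the user's 0/1 observations
// into the module's memory and passes a pointer plus a length.
// Returns the number of successes, or -1 if any value is not 0 or 1.
STAN_DEMO_EXPORT int count_successes(const int* y, int n) {
  assert(n >= 0);
  int successes = 0;
  for (int i = 0; i < n; ++i) {
    if (y[i] != 0 && y[i] != 1) return -1;
    successes += y[i];
  }
  return successes;
}

}  // extern "C"
```

The `#ifdef` guard lets the same file build natively for testing. On the JavaScript side, Emscripten's `ccall`/`cwrap` helpers are, to my knowledge, the usual way to call such a function; a real Stan model would need a much richer interface than this.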

I am thinking about this because a few billion computers / mobile phones are running JS out of the box… this could give Stan quite a popularity boost. It would be a pretty nice way to advertise Stan.

For teaching, live examples, that can be run in the browser.

I mean, the real question is : will em++ compiler work in the browser too ?

I don’t see why it would not…

Cheers,

Jozsef

nhuurre · November 4, 2019, 11:41am

I doubt it; C/C++ compilers do not work in a vacuum. Setting up an environment that has all the required supporting infrastructure is going to be nigh-impossible.
I think the most promising path is first using Stan3 interpreter and then connecting it to WebAssembly JIT.
Yes, neither of those exists, but they’re at least the sort of things that could exist someday.

Jozsef_Hegedus · November 4, 2019, 12:30pm

Hmmm… I think I see the point… this is an interesting problem.
Not trivial at all.

But… still, it does not seem to be so impossible :

this seems to be running in my browser…

or i might be wrong…

anyway… no high priority stuff but it would have been interesting to run Stan in the browser …
