Cannot compile a model in PyStan, but can in RStan


#1

Hello. I am having trouble getting a Stan model to compile with PyStan 2.16.0. I have used the same model previously in RStan, so I’m confident that the model itself is okay. The following Python code produces the compilation error for me:

from pystan import StanModel

rasch_code = """
data {
  int<lower=1> I;               // # questions
  int<lower=1> J;               // # persons
  int<lower=1> N;               // # observations
  int<lower=1, upper=I> ii[N];  // question for n
  int<lower=1, upper=J> jj[N];  // person for n
  int<lower=0, upper=1> y[N];   // correctness for n
}
parameters {
  vector[I] beta;
  vector[J] theta;
  real<lower=1> sigma;
}
model {
  beta ~ normal(0, 3);
  theta ~ normal(0, sigma);
  sigma ~ exponential(.1);
  y ~ bernoulli_logit(theta[jj] - beta[ii]);
}
"""

rasch_model = StanModel(model_code=rasch_code)

The error is:

CompileError: command 'C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\BIN\\x86_amd64\\cl.exe' failed with exit status 2

For comparison, I am able to compile (and sample from) this example model without problem:

from pystan import StanModel

schools_code = """
data {
    int<lower=0> J; // number of schools
    real y[J]; // estimated treatment effects
    real<lower=0> sigma[J]; // s.e. of effect estimates
}
parameters {
    real mu;
    real<lower=0> tau;
    real eta[J];
}
transformed parameters {
    real theta[J];
    for (j in 1:J)
        theta[j] = mu + tau * eta[j];
}
model {
    eta ~ normal(0, 1);
    y ~ normal(theta, sigma);
}
"""

schools_model = StanModel(model_code=schools_code)

I’m at a loss regarding next steps for troubleshooting, so I would be grateful for any help. Thank you.


#2

Hi, this error is (probably) caused by a bug in MSVC.

See https://github.com/stan-dev/pystan/pull/351 and https://github.com/stan-dev/math/issues/434


#3

Does some version of Python require MSVC? I thought we’d decided not to support MSVC.

If Python version X.Y requires MSVC, would you mind flagging in the doc whatever won’t work in Python version X, compiler version Y? (If it were already there, presumably someone would’ve just pointed the user there.)


#4

On Windows, Python 3.5 onwards can only use the MSVC 14+ compiler. Currently there are no compatible open source compilers, because a refactor of the MSVC codebase created a copyright issue. Also, Python 3.5+ is the only supported version on Windows. I do not know if 2.7 or 3.4 will even work (@ariddell).

https://mingwpy.github.io/issues.html#the-vs-14-2015-runtime

The problem with MSVC 14+ is that its template deduction is buggy (for example, it cannot deduce the Eigen::Matrix types). I have not tested all the cases where this is a problem.
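To check which compiler a given Python was built with (and so which one distutils will ask for by default), something like the following might help. It is only a rough sketch: `msvc_major_version` is a name made up here, and it assumes the `MSC v.NNNN` string format that `platform.python_compiler()` reports on Windows builds.

```python
import platform
import re


def msvc_major_version(compiler_string):
    """Return the MSVC major version found in a compiler string, or None.

    Follows the arithmetic distutils applies to "MSC v.NNNN":
    for example "MSC v.1900" maps to 14 and "MSC v.1500" maps to 9.
    """
    match = re.search(r"MSC v\.(\d{2})\d{2}", compiler_string)
    if match is None:
        return None  # not an MSVC build (e.g. GCC or Clang)
    major = int(match.group(1)) - 6
    if major >= 13:
        major += 1  # Microsoft skipped version 13
    return major


if __name__ == "__main__":
    compiler = platform.python_compiler()
    print(compiler, "->", msvc_major_version(compiler))
```

A result of 14 means the interpreter expects MSVC 14, which is the case discussed here.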

What I see as a viable option is to use httpstan to call CmdStan instead of the Cython bindings. Then we could use the mingw compiler on Windows. That said, I don’t have any real solution to this problem on Windows. Currently I recommend using Docker + Linux, but even that is suboptimal.


#5

Thanks. No wonder @ariddell wants a standalone server for Stan!

What’s the damage in terms of functions that won’t work in PyStan?

Is there any hope MSVC will get patched?


#6

I have a VM with Windows 10, so I could set up some tests for different distributions and functions. But when I have tested these before, the errors have been bizarre. Almost anything with Eigen::Matrix will fail.

Example: the C++ multiply function works as multiply(A,x); (after manually editing the stanc-generated code), but I cannot assign the result to a variable (in Stan code) or pass it to a function as an argument.

It’s almost like the MSVC precompiler makes a single pass and uses the resulting type to do its magic. And if these types are e.g. Eigen::Matrix or any function, then it will fail.

MSVC probably could be patched, but I’m not even sure if earlier versions of MSVC compiled Stan code. Based on Google searches, type deduction has been a problem since long before 14, and they (the MSVC devs) either don’t see this as a problem / bug or don’t know how to solve it.


#7

If I’m understanding the pieces correctly:

PyStan isn’t compatible with Windows and there’s not likely to be a patch any time soon.

Some simple models will run, but nothing with matrix types.

The alternatives seem to be:

  1. tell Windows users to use RStan or CmdStan or get a new OS, or

  2. rebuild PyStan from the ground up to call Stan compilation and sampling in a separate process.

As far as putting Stan in a remote process goes, we could either try to set up some kind of server that does everything or we could build an interface that execs calls to CmdStan.
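For the second variant (an interface that execs CmdStan), the core could be as small as the sketch below. This is not how PyStan works today; the function names and file paths are made up for illustration, and it assumes a model executable that CmdStan has already built. The argument layout follows CmdStan's `sample ... data file=... output file=...` command line.

```python
import subprocess


def cmdstan_sample_command(executable, data_file, output_file, num_samples=1000):
    """Build the argument list for one CmdStan sampling run."""
    return [
        executable,
        "sample",
        f"num_samples={num_samples}",
        "data",
        f"file={data_file}",
        "output",
        f"file={output_file}",
    ]


def run_sampler(executable, data_file, output_file, num_samples=1000):
    """Run the sampler in a separate process; raises CalledProcessError on failure."""
    command = cmdstan_sample_command(executable, data_file, output_file, num_samples)
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical paths, printed rather than executed.
    print(cmdstan_sample_command("./rasch", "rasch.data.R", "output.csv"))
```

Since the model is compiled by CmdStan's own toolchain, the Python side never needs a compiler that matches the interpreter.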


#8

I’m confused. Things were working at some point with Python>=3.5 and
Windows MSVC 14, right? What changed exactly and when did it change? If
Eigen+MSVC>=14 is a problem, this affects many people, not just us.

I thought everything was supposed to be smoother with Python>=3.5 and
MSVC 14. (relevant blog post series by the authority on the subject is
here http://stevedower.id.au/blog/building-for-python-3-5/ )


#9

I think it’s not this bad. Let’s find out what the problem is.

The documentation is fairly clear about the fact that you must use Python>=3.5 on Windows. (see https://pystan.readthedocs.io/en/latest/windows.html ). We might add a note about being required to use MSVC>=14 at the top too.


#10

It looks like some other people have had problems with Eigen and certain (minor) versions of MSVC, see https://github.com/PointCloudLibrary/pcl/issues/1496


#11

I don’t know if MSVC has worked before.

I went and googled a little bit more to see what theano people did.

There is now an MSYS-based gcc on Windows (m2w64-toolchain). With it I am able to compile models on Python 3.5 + Windows.

# new conda environment
conda create -n stan python=3.6 numpy cython matplotlib libpython
activate stan
conda install -c msys2 m2w64-gcc=5.3.0
pip install pystan

Then I needed to create distutils.cfg:

https://docs.python.org/3.6/install/#location-and-names-of-config-files

[build]
compiler = mingw32

After this I needed to comment out the Windows extra compile args in pystan/model.py, line 286.

Now it uses the msys2 gcc and compiles. (Edit: this is not mingw32.)
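One quick sanity check, run from inside the activated conda environment, is to ask Python which gcc it finds on PATH. This is just a sketch; `locate_compilers` is a made-up helper around the standard `shutil.which`.

```python
import shutil


def locate_compilers(candidates=("gcc", "g++")):
    """Map each compiler name to its full path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in candidates}


if __name__ == "__main__":
    for name, path in locate_compilers().items():
        print(name, "->", path)
```

If the printed paths do not point into the conda environment, a different gcc is shadowing the m2w64 one.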


#12

I’m sure that Python>=3.5 and MSVC 14 have worked at some point for at
least simple models like 8 schools. PyStan has been testing on Windows
using this configuration for some time. You can see a history of
successful builds and (minimal) tests:
https://ci.appveyor.com/project/ariddell/pystan/history

I’m happy dropping support for Windows if we can’t find a way to make
PyStan work with updated versions of MSVC 14. Without someone who knows
their way around the Windows compiler(s), it’s challenging to fix these
problems or respond to bug reports.


#13

It works for the normal distribution and other simple ones. We could tell users to use conda + m2w64 (gcc) on Windows. This is what theano does.


#14

Sounds good to me. So they would install all the dependencies
(especially this m2w64 toolchain) using conda and then install the
Windows wheel of PyStan using pip?

(I’m looking at
http://www.deeplearning.net/software/theano/install_windows.html )


#15

Yes. We could also make conda w32/w64 packages so everything can be installed through conda.

I’m trying to figure out if we could somehow force/flag cython to use gcc so the user does not need to edit/create a distutils.cfg file. The same flag could turn off the extra arguments for MSVC.
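Roughly what I have in mind, as a sketch only: pick the extra compile args from the compiler type instead of from the platform. The function name is made up and the flag lists are illustrative placeholders, not the actual values in pystan/model.py; the compiler type names ("msvc", "mingw32", "unix") are the ones distutils uses.

```python
def extra_compile_args(compiler_type):
    """Pick extra compile args by distutils compiler type (illustrative values)."""
    if compiler_type == "msvc":
        # MSVC-only switch; gcc would not understand it.
        return ["/EHsc"]
    if compiler_type in ("mingw32", "unix"):
        # gcc-style flags; placeholder choices for this sketch.
        return ["-O2", "-ftemplate-depth-256"]
    raise ValueError("unsupported compiler type: %r" % (compiler_type,))


if __name__ == "__main__":
    for kind in ("msvc", "mingw32"):
        print(kind, extra_compile_args(kind))
```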


#16

That would be fantastic if we could get PyStan working fully on Windows. Sorry this is such a hassle—I blame it all on the poor C++ support for Windows. We’ve had huge headaches from day 1 getting things to run on Windows, even with g++ on developer machines.


#17

It is helpful to know that the problem is a limitation of the Windows implementation. I’ll keep an eye out for whether this is solved later. Thank you!


#18

Quick follow up, no need to respond. I figured out that I can run the model (on Windows) if I code it like this:

from pystan import StanModel

rasch_code = """
data {
  int<lower=1> I;               // # questions
  int<lower=1> J;               // # persons
  int<lower=1> N;               // # observations
  int<lower=1, upper=I> ii[N];  // question for n
  int<lower=1, upper=J> jj[N];  // person for n
  int<lower=0, upper=1> y[N];   // correctness for n
}
parameters {
  real beta[I];
  real theta[J];
  real sigma;
}
model {
  beta ~ normal(0, 1);
  theta ~ normal(0, 1);
  sigma ~ normal(0, 1);
  for(n in 1:N) 
    y[n] ~ bernoulli_logit(theta[jj[n]] - beta[ii[n]]);
}
"""

rasch_model = StanModel(model_code=rasch_code)
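
As far as I can tell, the loop gives the same likelihood as the vectorized statement in my first post, so only the generated C++ differs. (The version above does also change the priors and drops the bound on sigma.) Here is a small plain-Python sketch of that equivalence with made-up toy data; the function names are mine and none of this is PyStan code.

```python
import math


def bernoulli_logit_lpmf(y, alpha):
    """Log-probability of y in {0, 1} given success log-odds alpha."""
    return y * alpha - math.log1p(math.exp(alpha))


def loglik_loop(y, ii, jj, beta, theta):
    """Element-by-element version, like the for loop (1-based indices)."""
    total = 0.0
    for n in range(len(y)):
        total += bernoulli_logit_lpmf(y[n], theta[jj[n] - 1] - beta[ii[n] - 1])
    return total


def loglik_vectorized(y, ii, jj, beta, theta):
    """Multi-indexed version, like theta[jj] - beta[ii]."""
    alpha = [theta[j - 1] - beta[i - 1] for i, j in zip(ii, jj)]
    return sum(bernoulli_logit_lpmf(y_n, a_n) for y_n, a_n in zip(y, alpha))


if __name__ == "__main__":
    # Toy data: 2 questions, 3 persons, 4 observations.
    y, ii, jj = [1, 0, 1, 1], [1, 2, 1, 2], [1, 1, 2, 3]
    beta, theta = [0.5, -0.3], [0.1, 1.2, -0.7]
    print(loglik_loop(y, ii, jj, beta, theta))
    print(loglik_vectorized(y, ii, jj, beta, theta))
```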

#19

Hi, see this issue in PyStan to fix compiling issues on Windows. https://github.com/stan-dev/pystan/issues/364

With these instructions you can use matrix-vector products.

We will make these the official instructions when I (or someone else) have more (free) time.


#20

Thank you, this worked for me. I did not have to make changes to distutils.cfg, by the way. It already existed and had the needed lines.