Pre-compiled CmdStan model: 'cannot execute binary file' error

I have a cmdstanr programme that runs in RStudio on my Mac but fails on a computer cluster that I access via the macOS Terminal. To work around various problems I encountered when running it on the cluster, I compiled the Stan model in my local environment and then moved the resulting executable to the directory I use on the cluster. However, when I run the programme on the cluster, I get the following error:

> set_cmdstan_path("/storage/users/gmilne/test/.cmdstanr/cmdstan-2.25.0")
CmdStan path set to: /storage/users/gmilne/test/.cmdstanr/cmdstan-2.25.0

> file <- "stan_mod_simple.stan"

> compile_mod <- cmdstan_model(file)
Model executable is up to date!

> # fit the model to the data
> fit <- compile_mod$sample(
+   data = data_si,
+   seed = 123,
+   chains = 3,
+   parallel_chains = 3, 
+   iter_warmup = 5,
+   iter_sampling = 10,
+   refresh = 1
+   )
Running MCMC with 3 parallel chains...

Chain 1 ./stan_mod_simple: ./stan_mod_simple: cannot execute binary file
Chain 2 ./stan_mod_simple: ./stan_mod_simple: cannot execute binary file
Chain 3 ./stan_mod_simple: ./stan_mod_simple: cannot execute binary file
Warning: Chain 1 finished unexpectedly!

Warning: Chain 2 finished unexpectedly!

Warning: Chain 3 finished unexpectedly!

Warning: Use read_cmdstan_csv() to read the results of the failed chains.
Warning messages:
1: All chains finished unexpectedly!
 
2: No chains finished successfully. Unable to retrieve the fit. 

This is strange, because the "Model executable is up to date!" message suggests to me that the compiled model is there and ready to use. When I check the architecture of the executable with the file command in Terminal, I find that it is a 64-bit file (the same as my operating system):

file test/stan_mod_simple
Mach-O 64-bit x86_64 executable, flags:<NOUNDEFS|DYLDLINK|TWOLEVEL|WEAK_DEFINES|BINDS_TO_WEAK|PIE>

I have previously attempted various other workarounds, including using the stanc2 compiler (as suggested in "CmdStan on ARM error", post #6 by asceles) while compiling the model on the cluster instead of in my local environment. This results in the syntax error below, presumably because my Stan code uses newer syntax that the older compiler does not understand:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
Variable "mod_si" does not exist.
 error in '/var/folders/8w/byg0q9bx7r9bk20hswzlnh700000gn/T/RtmpnWDJlA/model-6b1d2dd99afc.stan' at line 205, column 10
  -------------------------------------------------
   203:   //run solver
   204:   y = ode_rk45_tol(
   205:     mod_si,  //model function
                 ^
   206:     init,    //vector initial values
  -------------------------------------------------


make: *** [/var/folders/8w/byg0q9bx7r9bk20hswzlnh700000gn/T/RtmpnWDJlA/model-6b1d2dd99afc.hpp] Error 253
Error: An error occured during compilation! See the message above for more information.

I would greatly appreciate any advice on what I might be doing wrong. Ideally I’d like to compile the Stan model in my local environment and upload it to the directory I use on the cluster, but I’m not sure whether this strategy is the correct one. I haven’t uploaded my model code here since it is quite long and it runs fine in my local RStudio environment. I’d be happy to provide any additional information. Thanks.

Background info:

  • Operating System: macOS Mojave 10.14.6
  • R Version 4.0.3
  • CmdStan Version: cmdstan-2.25

This is unlikely to work unless the computer cluster also runs macOS, and even then its macOS version has to be the same as or newer than the one you compiled on.
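One way to see the mismatch for yourself is to compare the executable's leading "magic" bytes with the platform it is being run on. The following is only a rough sketch: the helper names `binary_format`, `detect_file` and `matches_host` are made up for this example, and it recognises just a few common formats.

```python
import sys

# Leading magic bytes of a few common executable formats.
MAGIC = {
    b"\x7fELF": "ELF (Linux)",
    b"\xcf\xfa\xed\xfe": "Mach-O 64-bit (macOS)",
    b"\xce\xfa\xed\xfe": "Mach-O 32-bit (macOS)",
    b"MZ": "PE (Windows)",
}


def binary_format(header):
    """Name the executable format from the first bytes of a file."""
    for magic, name in MAGIC.items():
        if header.startswith(magic):
            return name
    return "unknown"


def detect_file(path):
    """Read the first four bytes of a file and classify them."""
    with open(path, "rb") as fh:
        return binary_format(fh.read(4))


def matches_host(fmt, host=sys.platform):
    """Rough check: does this format belong to the given platform?"""
    expected = {"linux": "ELF", "darwin": "Mach-O", "win32": "PE"}
    prefix = expected.get(host)
    return prefix is not None and fmt.startswith(prefix)
```

Calling `detect_file("stan_mod_simple")` on the cluster and passing the result to `matches_host` would be expected to return `False` if the cluster runs Linux, since the `file` output in the question reports a Mach-O executable.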

Thanks for your quick reply, that makes sense. Is there another way that I could precompile the model in my local environment and save this for the cluster without running into compatibility problems?

One thing that comes to mind is to create a Docker image that you run on your laptop and then use the same image on the cluster.

Thanks for the insight. I haven’t used Docker before, but from a quick read it seems to be a way of packaging up all the commands and files necessary to run a script, so that it can be used independently of the OS?

I find it strange that Stan doesn’t seem to have a simple way of saving compiled code to a file that is compatible with multiple operating systems.

Yeah, you install an OS in your container and all the scripts etc. you need. If you can run Docker on the cluster you can simply download your container and run everything.

Concerning your second statement, I wouldn’t agree with you. There are different operating systems, and creating a binary format that works on all of them is not possible. You would need to run the code on an intermediate layer (think Java) for that to work, and it would be slow (think Java) ;)

Thanks for taking the time to explain, really appreciate it!

Would it be so hard to compile once on the cluster?
I realize that on the cluster you want to run a bunch of processes and not have to compile every time. Does CmdStanR let you instantiate a CmdStanModel from the exe file?

In CmdStanPy, on a cluster, you could use the following Python script to run a bunch of processes:

# Use CmdStanPy to run one chain
# Required args:
# - cmdstan_path
# - model_exe
# - seed
# - chain_id
# - output_dir
# Optional arg:
# - data_file

import sys

from cmdstanpy import CmdStanModel, cmdstan_path, set_cmdstan_path

usage = """\
run_chain.py <cmdstan_path> <model_exe> <seed> <chain_id> <output_dir> (<data_path>)\
"""

def main():
    # sys.argv[0] is the script name, so five required args means len >= 6
    if len(sys.argv) < 6:
        print("missing arguments")
        print(usage)
        sys.exit(1)
    a_cmdstan_path = sys.argv[1]
    a_model_exe = sys.argv[2]
    a_seed = int(sys.argv[3])
    a_chain_id = int(sys.argv[4])
    a_output_dir = sys.argv[5]
    # the data file is optional
    a_data_file = None
    if len(sys.argv) > 6:
        a_data_file = sys.argv[6]

    # point CmdStanPy at the CmdStan installation and echo it back
    set_cmdstan_path(a_cmdstan_path)
    print(cmdstan_path())

    # instantiate the model from the existing executable, without recompiling
    mod = CmdStanModel(exe_file=a_model_exe)
    # sample() takes the data file path (or a dict) via its `data` argument
    mod.sample(
        chains=1,
        chain_ids=a_chain_id,
        seed=a_seed,
        output_dir=a_output_dir,
        data=a_data_file,
    )

if __name__ == "__main__":
    main()
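To run several chains with the script above, each chain needs its own invocation with a distinct chain ID. A minimal sketch of building those command lines is shown here; the helper name `chain_commands`, the default seed, and the assumption that the script is saved as `run_chain.py` are illustrative only. How the commands get launched (a scheduler job array, `subprocess`, and so on) depends on the cluster.

```python
import sys


def chain_commands(cmdstan_path, model_exe, output_dir, data_file=None,
                   n_chains=4, seed=123, script="run_chain.py"):
    """Build one run_chain.py command line per chain.

    Every chain shares the same seed and gets its own chain ID,
    in the argument order the script expects.
    """
    commands = []
    for chain_id in range(1, n_chains + 1):
        cmd = [sys.executable, script, cmdstan_path, model_exe,
               str(seed), str(chain_id), output_dir]
        if data_file is not None:
            cmd.append(data_file)
        commands.append(cmd)
    return commands
```

Each returned list can be passed as-is to `subprocess.Popen`, or joined into a line of a job submission script.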

No, because right now an exe file does not provide enough information about how the model was compiled (for example, were MPI, threads or OpenCL used?). That info is needed to know what we can or cannot use.

This will be possible after "Add ./model compile_info" (Issue #887, stan-dev/cmdstan on GitHub).

I would be happy to compile once on the cluster but this hasn’t proved possible, since I get the following error when trying to do so:

> mod <- cmdstan_model(file)
Compiling Stan program...
bash: bin/stanc: cannot execute binary file: Exec format error
make: *** [make/program:53: /tmp/RtmpV0Gjyc/model-423358aa4003.hpp] Error 126
Error: An error occured during compilation! See the message above for more information.
Execution halted

Are you running this on a cluster with ARM CPUs? I am guessing you installed with install_cmdstan()?

This error does suggest the cluster uses ARM CPUs. If that is the case, install CmdStan using:

install_cmdstan(release_url = "https://github.com/stan-dev/cmdstan/releases/download/v2.26.1/cmdstan-2.26.1-linux-arm64.tar.gz", cores = 4)
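If it is unclear whether the cluster nodes really are ARM machines, the reported machine type can be checked before choosing a release. The following is a small sketch using only the Python standard library; the helper name `is_arm64` is made up for this example, and the set of names it checks is an assumption covering the usual 64-bit ARM labels.

```python
import platform

# Names commonly reported for 64-bit ARM CPUs (Linux and macOS respectively).
ARM64_NAMES = {"aarch64", "arm64"}


def is_arm64(machine=None):
    """Return True if the given (or current) machine type is 64-bit ARM."""
    if machine is None:
        machine = platform.machine()
    return machine.lower() in ARM64_NAMES


if __name__ == "__main__":
    # Run this on a cluster node, not on the local machine.
    print(platform.system(), platform.machine(), "ARM64:", is_arm64())
```

If this reports a 64-bit ARM machine on Linux, the linux-arm64 tarball above is the matching build.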