Within-chain parallelization not working with cmdstanr on linux server

sdaza · November 9, 2021, 8:59pm

Hello,

I am running a model with cmdstanr, but threading doesn’t seem to work. On my server I just get:

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
1954950 sdaza     20   0   46024  15128   5764 R  98.1   0.0   4:10.33 model_threads

%CPU never goes over 100%. If I run the same model using brms with cmdstanr as the backend, threading works.

My code is:

library(cmdstanr)
# This is cmdstanr version 0.4.0
mm = cmdstan_model("src/model.stan", cpp_options = list(stan_threads = TRUE))
fit = mm$sample(data = dat, chains = 1, threads_per_chain = 10)

model.stan (1.2 KB)

Any ideas on what might be happening?

Thanks!
Sebastian

WardBrian · November 9, 2021, 9:02pm

Are you able to share your model (or at least the part that uses parallelization features)?

sdaza · November 9, 2021, 9:05pm

I added the model code to the question. So I have to add the parallelization to the Stan code itself? Thanks!

WardBrian · November 9, 2021, 9:10pm

Your model doesn’t look like it is using map_rect or reduce_sum, which are the core ways to use parallelism in Stan programs. In that case, I think you’re observing the correct behavior.

See Stan User’s Guide - Parallelization
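
For reference, here is a minimal sketch of what a reduce_sum model can look like. It is not the model from this thread: the regression, the data names (y, x, grainsize) and the function name partial_sum are made up for illustration. It uses the array[] syntax that the compiler messages later in this thread also print.

```stan
functions {
  // Log-likelihood contribution of one slice of the data.
  // reduce_sum passes the slice plus the 1-based start/end
  // positions of that slice within the full array.
  real partial_sum(array[] real y_slice, int start, int end,
                   vector x, real alpha, real beta, real sigma) {
    return normal_lpdf(to_vector(y_slice) | alpha + beta * x[start:end], sigma);
  }
}
data {
  int<lower=1> N;
  array[N] real y;          // sliced argument: must be an array
  vector[N] x;              // shared argument: passed whole to every slice
  int<lower=1> grainsize;   // suggested slice size; 1 lets the scheduler choose
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  alpha ~ normal(0, 5);
  beta ~ normal(0, 5);
  sigma ~ exponential(1);
  // Arguments: partial-sum function, sliced array, grainsize, then shared arguments.
  target += reduce_sum(partial_sum, y, grainsize, x, alpha, beta, sigma);
}
```

Only the work inside the reduce_sum call is spread across threads, so a model without such a call stays at roughly 100% CPU per chain regardless of threads_per_chain.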

That said, I’m not the expert on parallelism so I’ll ping @jonah to make sure this isn’t a CmdStanR issue

sdaza · November 9, 2021, 9:13pm

Got it, thanks. I am now using a version with parallelization, but I still get an error:

model.stan (2.3 KB)

> mm = cmdstan_model("src/model.stan", cpp_options = list(stan_threads = TRUE))
Compiling Stan program...
Syntax error in '/tmp/Rtmpmi0sj4/model-1ddb5441e5e494.stan', line 21, column 12 to column 23, parsing error:
   -------------------------------------------------
    19:              real sigma,
    20:              vector sigma_county,
    21:              corr_matrix[] Rho ) { 
                     ^
    22:          vector[size(mx)] mu;
    23:          for ( i in 1:size(mx) ) {
   -------------------------------------------------

An argument declaration (unsized type followed by identifier) is expected.
make: *** [make/program:50: /tmp/Rtmpmi0sj4/model-1ddb5441e5e494.hpp] Error 1
Error: An error occured during compilation! See the message above for more information.

WardBrian · November 9, 2021, 9:24pm

Function arguments can’t be constrained types like correlation matrices, so I believe that should just be matrix Rho (see the reference manual page).

jonah · November 9, 2021, 9:28pm

Doesn’t look like anything specific to CmdStanR. You’re right that within-chain parallelization requires using functions like the ones you mentioned.

I think the suggestion to use matrix Rho is correct: the parameter should be declared as corr_matrix in the parameters block, but a function should just take a matrix, which can be a corr_matrix or any other matrix. Unfortunately the error message only says “unsized type” and not “unconstrained type”, so it’s a bit misleading. @WardBrian should we open a stanc3 issue related to the error message?
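
To illustrate the point about constrained types, here is a small sketch. The function name scaled_cov and the data y, N, K are hypothetical, and the model is not the one attached to this thread. The constraint lives in the parameters block, while the function argument is a plain matrix.

```stan
functions {
  // Function arguments carry neither sizes nor constraints,
  // so this is declared as matrix, not corr_matrix.
  matrix scaled_cov(matrix Rho, vector sigma) {
    return quad_form_diag(Rho, sigma);
  }
}
data {
  int<lower=1> N;
  int<lower=2> K;
  array[N] vector[K] y;
}
parameters {
  corr_matrix[K] Rho;          // the constraint is declared here
  vector<lower=0>[K] sigma;
}
model {
  Rho ~ lkj_corr(2);
  sigma ~ exponential(1);
  // A corr_matrix can be passed wherever a matrix argument is expected.
  y ~ multi_normal(rep_vector(0, K), scaled_cov(Rho, sigma));
}
```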

WardBrian · November 9, 2021, 9:33pm

It certainly should be better.
I can just open a PR to change it to “An argument declaration (unsized and unconstrained type followed by identifier) is expected.”

jonah · November 9, 2021, 9:35pm

Thanks, that would be great!

jonah · November 9, 2021, 9:38pm

@sdaza I would also recommend seeing if you can get the simpler example of reduce_sum from this case study working:

https://mc-stan.org/users/documentation/case-studies/reduce_sum_tutorial.html

If that doesn’t work, then it will be easier for us to figure out why within-chain parallelization isn’t working for you than with the more complicated model you shared.

sdaza · November 10, 2021, 7:57am

Great, thanks so much @jonah and @WardBrian!

sdaza · November 10, 2021, 2:57pm

I ran a simple model with cmdstanr, and threads are working on my Linux server. So, my problem is with the stan code and the use of reduce_sum. I will try to solve it after reading the documentation.

Thanks!
Sebastian

sdaza · November 10, 2021, 4:13pm

I tried to run my model with parallelization, but without success. Clarification: the Stan code I am using was generated with the rethinking package.

When I change the functions block slightly so that corr_matrix[] becomes just matrix[], I still don’t know what to do with the rest of the code to avoid semantic errors (sorry, I am new at this):

functions{
    real reducer( 
            vector mx,
            int start , int end , 
            int N,
            vector mx_sd,
            int[] time_period,
            int[] period,
            int[] time,
            int[] county,
            vector b_time_period_county,
            vector b_period_county,
            vector b_time_county,
            vector a_county,
            real b_time_period,
            real b_period,
            real b_time,
            real a,
            real sigma,
            vector sigma_county, 
            // corr_matrix[] Rho) { 
            matrix[] Rho) { 
        vector[size(mx)] mu;
        for ( i in 1:size(mx) ) {
            mu[i] = a_county[county[start+i-1]] + b_time_county[county[start+i-1]] * time[start+i-1] + b_period_county[county[start+i-1]] * period[start+i-1] + b_time_period_county[county[start+i-1]] * time_period[start+i-1];
        }
        return normal_lpdf( mx | mu , sigma );
    } 
}

Semantic error in '/tmp/RtmphvQFFv/model-387cb4564bdf94.stan', line 66, column 14 to line 83, column 17:
   -------------------------------------------------
    64:      YY ~ multi_normal( MU , quad_form_diag(Rho , sigma_county) );
    65:      }
    66:      target += reduce_sum( reducer , mx , 1 , 
                       ^
    67:              N,
    68:              mx_sd,
   -------------------------------------------------

Ill-typed arguments supplied to function 'reduce_sum':
(<F1>, vector, int, int, vector, array[] int, array[] int, array[] int,
 array[] int, vector, vector, vector, vector, real, real, real, real, real,
 vector, matrix)
where F1 = (vector, int, int, int, vector, array[] int, array[] int,
            array[] int, array[] int, vector, vector, vector, vector, real,
            real, real, real, real, vector, array[] matrix) => real
Available signatures:
(<F2>, array[] real, int) => real
where F2 = (array[] real, data int, data int) => real
  The first argument must be
   (array[] real, data int, data int) => real
  but got
   (vector, int, int, int, vector, array[] int, array[] int, array[] int,
    array[] int, vector, vector, vector, vector, real, real, real, real,
    real, vector, array[] matrix) => real
  These are not compatible because:
    The types for the first argument are incompatible: one is
     vector
    but the other is
     array[] real
make: *** [make/program:50: /tmp/RtmphvQFFv/model-387cb4564bdf94.hpp] Error 1
Error: An error occurred during compilation! See the message above for more information.

I have been looking for examples using correlation matrices without success. Any suggestion or guidance will be welcome. Thank you again!

Here is the data: example.csv (773.9 KB)

jonah · November 10, 2021, 5:29pm

Ok great!

Are you going to be passing in a single matrix or an array of matrices? With this code the argument expects an array of matrices. For a single matrix you would just use matrix Rho and not matrix[] Rho.

There may be other issues with this reduce_sum code, but unfortunately I’m a bit swamped at the moment and don’t have time to debug the rest of it right now. Given that this is now a slightly different issue than in the original post, you could try starting a new topic here on the forum to get more eyes on this and hopefully more people can help debug.
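
One of those issues is spelled out in the compiler output above: the first argument of the function passed to reduce_sum has to be an array (the message asks for array[] real), whereas reducer takes a vector. Below is a rough sketch of one way to arrange that. It is deliberately cut down to two of the predictors and is not the rethinking-generated model; it assumes mx is a vector in the data block, which the thread does not show, and n_county and grainsize are made-up names. The posted reducer body never references Rho, so Rho is left out of the argument list here.

```stan
functions {
  // Sliced argument first (as an array), then start/end, then shared arguments.
  real reducer(array[] real mx_slice, int start, int end,
               array[] int county, array[] int time,
               vector a_county, vector b_time_county, real sigma) {
    int n = end - start + 1;
    vector[n] mu;
    for (i in 1:n) {
      int j = start + i - 1;   // position of this observation in the full data
      mu[i] = a_county[county[j]] + b_time_county[county[j]] * time[j];
    }
    return normal_lpdf(to_vector(mx_slice) | mu, sigma);
  }
}
data {
  int<lower=1> N;
  int<lower=1> n_county;
  vector[N] mx;                                   // assumed declaration
  array[N] int<lower=1, upper=n_county> county;
  array[N] int time;
  int<lower=1> grainsize;
}
parameters {
  vector[n_county] a_county;
  vector[n_county] b_time_county;
  real<lower=0> sigma;
}
model {
  a_county ~ normal(0, 1);
  b_time_county ~ normal(0, 1);
  sigma ~ exponential(1);
  // to_array_1d turns the vector into the array that reduce_sum can slice.
  target += reduce_sum(reducer, to_array_1d(mx), grainsize,
                       county, time, a_county, b_time_county, sigma);
}
```

The third argument in the call is the grainsize, and everything after it is passed unchanged to every slice, in the same order as the function's arguments after end.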

sdaza · November 10, 2021, 9:03pm

Thanks, no problem. I also tried matrix Rho, but it didn’t work. I will create a new question, thanks.
