Hi everyone,
I’ve recently been trying to use reduce_sum with STAN_THREADS on Linux, so far without much success.
My configuration is pretty standard: Ubuntu 20.04 Linux, 64-bit, gcc-9.3.
I have cmdstan-2.27.0 with STAN_THREADS=true in make/local (and cmdstan was recompiled after changing that setting).
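For reference, a minimal sketch of that setup step. The helper name `enable_stan_threads` and the checkout path are hypothetical, for illustration only; `make clean-all` and `make build` are the usual CmdStan rebuild targets.

```shell
# Hypothetical helper: make sure STAN_THREADS=true is set in make/local
# of a CmdStan checkout (the directory is passed as the first argument).
enable_stan_threads() {
    cmdstan_dir="$1"
    # Append the flag only if an identical line is not already present.
    grep -qx 'STAN_THREADS=true' "$cmdstan_dir/make/local" 2>/dev/null \
        || echo 'STAN_THREADS=true' >> "$cmdstan_dir/make/local"
}

# Usage (the path is an assumption); rebuild afterwards so the flag takes effect:
#   enable_stan_threads /path/to/cmdstan-2.27.0
#   cd /path/to/cmdstan-2.27.0 && make clean-all && make build
```

The model executable also has to be rebuilt after this change, since the flag is applied at compile time.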
I then have the following Stan code (a toy example just for testing reduce_sum; it evaluates a spline on a large vector of points):
functions{
#include spline.stan
  real partial_lpdf_sum(int[] x_pos_knots_slice,
                        int start, int end,
                        int nknots,
                        vector xknots,
                        vector yknots,
                        vector spl_coeffs,
                        int N,
                        vector x,
                        vector y,
                        vector ey
                        )
  {
    // the slice start:end is inclusive, so it holds end-start+1 points
    vector[end-start+1] ymod;
    ymod = spline_eval(nknots, xknots,
                       yknots, spl_coeffs, end-start+1,
                       x[start:end], x_pos_knots_slice);
    return normal_lpdf(y[start:end] | ymod, ey[start:end]);
  }
}
data{
  int N;
  int nknots;
  vector[N] x;
  vector[N] y;
  vector[N] ey;
  vector[nknots] xknots;
  int grainsize;
}
transformed data
{
  // determine which knots each point belongs to
  int x_pos_knots[N] = spline_findpos(nknots, xknots, N, x);
}
parameters
{
  // the parameters of our spline model are
  // the values at the knots
  vector[nknots] yknots;
}
transformed parameters
{
  vector[nknots] spl_coeffs = spline_getcoeffs(nknots, xknots, yknots);
  // these are the spline coefficients corresponding to the current model
}
model
{
  yknots ~ normal(0, 100);
  target += reduce_sum(partial_lpdf_sum, x_pos_knots, N,
                       nknots,
                       xknots,
                       yknots,
                       spl_coeffs,
                       N,
                       x,
                       y,
                       ey);
}
This code is then compiled with CmdStan using the make command (and I clearly see the ‘-DSTAN_THREADS’ option being passed to the compiler).
When I run the compiled program:
env STAN_NUM_THREADS=20 example_precompute id=1 random seed=434 data file=/tmp/fitting/uy9xoo6e.json output file=/tmp/fitting/example_precompute-202107211727-1-w9zlxy4w.csv method=sample algorithm=hmc adapt engaged=1
I see in the output:
num_threads = 20
But at the same time:
- top clearly shows 100% CPU for the corresponding process (and no more, indicating a lack of threading activity)
- I’ve checked /proc/$PID/task/ for additional threads and I don’t see any.
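The /proc check above can be sketched roughly as follows. The helper name `count_threads` is hypothetical, and this is Linux-only since it relies on /proc; every process has at least one entry under task/ (its main thread), so a result of 1 means no extra threads were started.

```shell
# Hypothetical helper: count the threads of a process by listing the
# per-thread directories under /proc/<pid>/task (Linux only).
count_threads() {
    pid="$1"
    ls "/proc/$pid/task" | wc -l
}

# Usage, assuming the sampler is running under the name example_precompute:
#   count_threads "$(pgrep -n example_precompute)"
```

In top, pressing H toggles the per-thread view, which is another way to see whether more than one thread is busy.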
So I’m now suspecting that the threads are not being used at all for some reason.
Does anyone have an idea what I’m doing wrong here?
Thanks,
Sergey