Hi. I am getting strange reduce_sum errors in a very simple model run via cmdstanpy.
Here is a pared-down model that reproduces the error:
functions {
  real reducesum_lpmf(int[] y, vector omega, int[] m) {
    int num_categories = size(y);
    real second_part = 1.0;
    print(num_categories, " ", size(m));
    return second_part;
  }
  real partial_sum(int[ , ] selected_slice,
                   int start,
                   int end,
                   data int[ , ] incount,
                   vector omega) {
    real ret_sum = 0;
    print("start = ", start, " end = ", end);
    for (n in start:end) {
      print(size(selected_slice[n]), " ", size(incount[n]));
      ret_sum += reducesum_lpmf(selected_slice[n] | omega, incount[n]);
    }
    return ret_sum;
  }
}
data {
  int<lower=1> num_obs;                            // number of observations
  int<lower=1> num_categories;                     // number of categories
  int<lower=0> selected[num_obs, num_categories];  // number selected data
  int<lower=0> incount[num_obs, num_categories];   // how many of each color
  int<lower=0, upper=1> run_estimation;
}
parameters {
  simplex[num_categories] omega;  // the inferred weights of the Multivariate Wallenius Non-Central Hypergeometric Distribution
}
model {
  int grainsize = 1;
  omega ~ dirichlet(rep_vector(1.0 / num_categories, num_categories));
  if (run_estimation == 1) {
    target += reduce_sum(partial_sum, selected, grainsize, incount, omega);
  }
}
(This is a cut-down version of another model where I initially saw the error.)
The issue appears when I compile the above model with:
from cmdstanpy import CmdStanModel
rs_model = CmdStanModel(stan_file="test_rs.stan", cpp_options={"STAN_THREADS": True})
# rs_model = CmdStanModel(stan_file="test_rs.stan")
and run it with:
rs_fit = rs_model.sample(chains=2,
                         iter_warmup=250,
                         iter_sampling=250,
                         data=stan_data,
                         show_progress='notebook',
                         output_dir='./data',
                         adapt_delta=0.9)
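In case it is relevant, stan_data is a plain dict matching the data block above. The real data has num_categories = 10 (as seen in the output below); the values here are just made-up placeholders to show the shapes, not my actual data:

import numpy as np

# Placeholder data only -- shapes match the Stan data block, values are invented.
num_obs = 15
num_categories = 10
rng = np.random.default_rng(0)
incount = rng.integers(1, 20, size=(num_obs, num_categories))   # how many of each color
selected = rng.integers(0, 5, size=(num_obs, num_categories))   # number selected

stan_data = {
    "num_obs": num_obs,
    "num_categories": num_categories,
    "selected": selected.tolist(),
    "incount": incount.tolist(),
    "run_estimation": 1,
}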
I get nonsensical values printed for the size of selected_slice[n].
Here is an example subset of the 2-stdout.txt output:
method = sample (Default)
  sample
    num_samples = 250
    num_warmup = 250
    save_warmup = 0 (Default)
    thin = 1 (Default)
    adapt
      engaged = 1 (Default)
      gamma = 0.050000000000000003 (Default)
      delta = 0.90000000000000002
      kappa = 0.75 (Default)
      t0 = 10 (Default)
      init_buffer = 75 (Default)
      term_buffer = 50 (Default)
      window = 25 (Default)
    algorithm = hmc (Default)
      hmc
        engine = nuts (Default)
          nuts
            max_depth = 10 (Default)
        metric = diag_e (Default)
        metric_file = (Default)
        stepsize = 1 (Default)
        stepsize_jitter = 0 (Default)
id = 2
data
  file = /var/folders/77/nm2qxy_j7z90dhx6bhhvw47h0000gn/T/tmpdkku523n/90okg_sp.json
init = 2 (Default)
random
  seed = 98851
output
  file = test_rs-202009150959-2.csv
  diagnostic_file = (Default)
  refresh = 100 (Default)
num_threads = 1
start = 1 end = 1
10 10
10 10
start = 2 end = 2
1828068222379318548 10
-602317548 10
start = 3 end = 3
16349936537123897489 10
-1552727919 10
start = 4 end = 4
0 10
0 10
start = 5 end = 5
18446744073709551536 10
-80 10
start = 6 end = 6
1120149264 10
1120149264 10
start = 7 end = 7
18446744073709542896 10
-8720 10
start = 8 end = 8
18446744072635809792 10
-1073741824 10
start = 9 end = 9
18446744073709551552 10
-64 10
start = 10 end = 10
18446744073172811776 10
-536739840 10
start = 11 end = 11
3518391462204210
-457466790 10
start = 12 end = 12
18446708889794561840 10
457099056 10
start = 13 end = 13
0 10
0 10
start = 14 end = 14
853600405611371915 10
-593994357 10
start = 15 end = 15
16601505965085912400 10
2038000976 10
It appears that the indexing into the reduce_sum slice is wrong: after the first slice, the printed size of selected_slice[n] is garbage, while size(incount[n]) is always the expected 10. For reference, the plain serial call I expected the reduce_sum line to be equivalent to is sketched below.
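A sketch of my understanding, not code from the actual model: I expected the reduce_sum line to compute the same value as calling the partial sum once over the whole array, i.e.

// serial equivalent I expected reduce_sum to compute
target += partial_sum(selected, 1, num_obs, incount, omega);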
Any ideas?
I am running macOS 10.15.6 with CmdStan 2.24.1 and cmdstanpy 0.9.63.