"Exception: assign array size" error only beyond certain number of data points

I am fitting an ODE system of two equations which describes two bacteria growing exponentially in a density-dependent (logistic) environment. The system is similar to the typical logistic growth equation

\frac{dN}{dt} = r\,N(t)\left(1 - \frac{N(t)}{K}\right)

where r is the intrinsic growth rate and K is the maximum density (carrying capacity) of the environment. For two competitors with an average growth rate r and deviations s_{1} and s_{2} from that growth rate, my system of equations looks like this:

\frac{d}{dt}N_{1}(t) = N_{1}(t)\,r(1+s_{1})\left(1 - \frac{N_{1}(t)+N_{2}(t)}{K}\right) \\ \frac{d}{dt}N_{2}(t) = N_{2}(t)\,r(1+s_{2})\left(1 - \frac{N_{1}(t)+N_{2}(t)}{K}\right)

I have data for many (~1000) competition experiments in an all-vs-all format for about 30 different individuals (i.e. I have N1 vs N2, N1 vs N3 … N29 vs N30 and everything in between), and this set of two equations is great for me because it allows me to estimate an average growth rate r across all experiments. The controlled laboratory conditions allow me to assume that K and N_{i}(0) are constant and identical for all experiments.

The ODE solver was quite slow, and at first I had a hard time getting it to fit a bunch of different competitions at once while keeping the parameters r, K and N(0) global and only estimating a different s_{i} for each experiment.

Finally I found a way to do this, and also to parallelize it using reduce_sum_static with a custom function (partial_sum below), but now I am running into a peculiar problem. If the number of competition experiments in the dataset is 16 or fewer, the model fits well, but beyond 16 the slices seem to become smaller than the grainsize, which makes them mismatch my user-defined function. Specifically, for datasets with more than 16 competitions I get the following error:

Exception: Exception: assign array size: assigning variable mu (16) and right hand side (8) must match in size (in ... line 22, column 4 to column 93) (in ... line 59, column 2 to column 97)

I was excited to get this working, but I definitely can't settle for running only 16 competitions at a time, and I am not sure I have even interpreted my problem correctly. Instead of manually declaring the size of the array mu (which holds the numerical solution of the ODEs inside the custom function), I tried letting mu take on the size of the slice, but this produces uneven slices that bleed from one experiment into another. That is obviously not good, and when I try it I also see this kind of error:

ode_rk45: times is not a valid sorted vector. The element at 17 is 0.72, but should be greater than or equal to the previous element, 15.7992

This indicates that the slice spans the end of one competition experiment (~15.8 being the final timepoint) and the beginning of another (~0.72 being the first timepoint).

I attached a reprex below, which I generated in R using the deSolve package for numerical integration (I didn't include those details here, but they are easy to share if needed).

simulated_dataset.csv (18.1 KB)
fit_logistic_multiple.stan (2.7 KB)

In general, since I am new to Stan, I would appreciate any pointers for making my probably very ugly code more succinct and efficient. Ideally, running this on all ~1000 experiments wouldn't take longer than overnight.

Thank you kindly

Stan code (also attached)

functions{
  //ODE system
  vector logistic(real t, //time
                         vector y, // state
                         real r, //average growth rate
                         real K, //carrying capacity / maximum total density of both y
                         real s1, // relative growth (dis)advantage of y1
                         real s2 // relative growth (dis)advantage of y2
                        ) {
      vector[2] dydt;
      dydt[1] = r*(1+s1)*y[1]*(1-(y[1]+y[2])/K);
      dydt[2] = r*(1+s2)*y[2]*(1-(y[1]+y[2])/K);
    return dydt;
  }
  // function for splitting the ODE system by experiment and computing its log likelihood
  real partial_sum(array[] vector y_slice,
                   int start, int end, int nrowz,
                   vector y0, real t0, array[] real ts, real r, real K,
                   array[] real s1, array[] real s2, array[] int IDX, real sigma) {
    real which_s1 = s1[IDX[end]]; // subset using the 'end' index of reduce_sum; IDX[start:end] should all be identical within a slice
    real which_s2 = s2[IDX[end]];
    array[nrowz] vector[2] mu = ode_rk45(logistic, y0, t0, ts[start:end], r, K, which_s1, which_s2); // numerical integration step
    real likely = 0; // initializing the log likelihood
    likely += normal_lpdf(y_slice[, 1] | mu[, 1], sigma); // likelihood for competitor y1
    likely += normal_lpdf(y_slice[, 2] | mu[, 2], sigma); // likelihood for competitor y2
    return likely;
  }
}
data{
  int<lower=1> N; //rows in data
  int<lower=1> ID; //number of experiments
  array[N] int<lower=1,upper=ID> IDX; //experiment ID
  real t0; //initial timepoint (0)
  array[N] vector[2] yobs; // observed densities of both competitors
  array[N] real ts; // timepoints
}
parameters{
  real<lower=0> inoculum; //y0 is the same for both competitors (experimental constraint)
  real<lower=0> r; // average growth rate of all competition experiments
  real<lower=0> K; // carrying capacity / max density
  array[ID] real s1; //array of growth (dis)advantages for all the different y1 tested
  array[ID] real s2;
  real<lower=0> sigma; //experimental noise hyperparameter
}
model{
  //priors
  sigma ~ exponential(1); 
  r ~ normal(1,1); //mostly 0.5-1.5
  K ~ normal(1,1); //mostly 0.5-1.5
  s1 ~ normal(0,0.1);  // expecting small differences from mean growth rate
  s2 ~ normal(0,0.1);  
  inoculum ~ normal(0.03,0.03); 
  
  //initialize y0 parameter constrained to be same for both competitors
  vector[2] y0 = rep_vector(inoculum,2);
  //manual calculation of grain size (all experiments have the same number of timepoints thankfully)
  int grainsize = N%/%ID;
  //likelihood via reduce_sum_static
  target += reduce_sum_static(partial_sum, yobs, grainsize,
                              grainsize, // passed a second time as the number of rows per experiment
                              // (is that my problem: does grainsize need to be fixed rather than an upper limit?)
                              y0, t0, ts, r, K, s1, s2, IDX, sigma);
}


R code showing that the model fits with 16 competitions but fails with 17

library(tidyverse);library(cmdstanr)
options(mc.cores=8) ## have to adjust if different for you
#loading model file
fit_logistic_multiple = cmdstan_model('fit_logistic_multiple.stan',cpp_options = list(stan_threads = TRUE))
# loading simulated dataset
simul_dat = read.csv('simulated_dataset.csv',row.names = 1)
# creating shortened version
simul_dat_short = simul_dat %>% filter(ID <17)

# data in list format for cmdstanr
dat_short = list(
  N=nrow(simul_dat_short),
  ID = length(unique(simul_dat_short$ID)),
  IDX= simul_dat_short$ID,
  t0=0,
  yobs = as.matrix(simul_dat_short[,c("A","B")]), # A and B are y1 and y2
  ts = simul_dat_short$time
)
# initial values make it faster, but it converges without them
init_fun_logis = function() list(inoculum = 0.03, r=1,K=0.9,sigma=0.01)

# fit for short dataset
# note if you don't have 8 cores you will need to change the threading and chain numbers accordingly
fit_short = fit_logistic_multiple$sample(
  data = dat_short,threads_per_chain = 4,chains=2,iter_warmup = 500,iter_sampling = 500,
  init =init_fun_logis)



# Using this dataset fails!
dat_full = list(
  N=nrow(simul_dat),
  ID = length(unique(simul_dat$ID)),
  IDX= simul_dat$ID,
  t0=0,
  yobs = as.matrix(simul_dat[,c("A","B")]),
  ts = simul_dat$time
)


fit_full = fit_logistic_multiple$sample(
  data = dat_full,threads_per_chain = 4,chains=2,iter_warmup = 500,iter_sampling = 500,
  init = init_fun_logis
)



Update – it seems that as long as the number of groups is a power of 2 the code works, so 32 competition experiments run fine. This seems like an interesting quirk of reduce_sum, but maybe it makes sense to someone who knows the function ‘under the hood’.

I have about 1200 groups of data, which falls between the powers of 2 1024 and 2048, so unless I want to say goodbye to ~180 experiments, I have a feeling my options are either to build a custom reduce_sum (I don't know if that's even possible) or not to parallelize at all. I'd appreciate anyone's thoughts.
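For reference, the unparallelized version would just loop over experiments directly in the model block. Here is a minimal sketch of what I mean, replacing the grainsize calculation and the reduce_sum_static call; it assumes rows are ordered by experiment and then by time, and that every experiment has the same number of rows (n_per is just a placeholder name):

  // unparallelized baseline (sketch): loop over experiments in the model block
  int n_per = N %/% ID; // rows per experiment
  for (id in 1:ID) {
    int first = (id - 1) * n_per + 1; // first row of this experiment
    int last = id * n_per;            // last row of this experiment
    array[n_per] vector[2] mu
      = ode_rk45(logistic, y0, t0, ts[first:last], r, K, s1[id], s2[id]);
    target += normal_lpdf(yobs[first:last, 1] | mu[, 1], sigma);
    target += normal_lpdf(yobs[first:last, 2] | mu[, 2], sigma);
  }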

Based on this line, I think the function you've defined is not really a good candidate for reduce_sum (at least as you have it defined here). From the documentation:

The exact partitioning into the partial sums is not under the control of the user.

In general, the function processed by reduce_sum needs to be able to accept arbitrary partitions. In particular, you should be able to replace the call to reduce_sum with a direct call to the partial_sum function where start is 1 and end is the size of the entire input, and it should still be correct (albeit slower). This does not seem to be true for your function.
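One way to satisfy that requirement is to slice over experiment indices rather than over rows of data, so that whatever start:end partition reduce_sum chooses, each element of the slice is a whole experiment. Here is a rough sketch under the same assumptions as above (rows ordered by experiment, equal rows per experiment); the names partial_sum_by_experiment, id_slice, id_seq and n_per are placeholders, not from the attached model:

  // sketch: a partial sum that is valid for any partition of the experiment indices
  real partial_sum_by_experiment(array[] int id_slice, int start, int end,
                                 int n_per, vector y0, real t0, array[] real ts,
                                 array[] vector yobs, real r, real K,
                                 array[] real s1, array[] real s2, real sigma) {
    real lp = 0;
    for (i in 1:num_elements(id_slice)) {
      int id = id_slice[i];             // experiment index carried by the sliced argument
      int first = (id - 1) * n_per + 1; // first row of this experiment
      int last = id * n_per;            // last row of this experiment
      array[n_per] vector[2] mu
        = ode_rk45(logistic, y0, t0, ts[first:last], r, K, s1[id], s2[id]);
      lp += normal_lpdf(yobs[first:last, 1] | mu[, 1], sigma);
      lp += normal_lpdf(yobs[first:last, 2] | mu[, 2], sigma);
    }
    return lp;
  }

In the model block the sliced argument would then be the sequence of experiment IDs rather than yobs, for example:

  array[ID] int id_seq;
  for (j in 1:ID) id_seq[j] = j;
  target += reduce_sum(partial_sum_by_experiment, id_seq, 1,
                       N %/% ID, y0, t0, ts, yobs, r, K, s1, s2, sigma);

With this arrangement the partial sum is correct for any partition, including start = 1 and end = ID, and the grainsize is in units of whole experiments, so it no longer has to divide the data evenly.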