# Vectorized use of neg_binomial_2_lpmf

I’m having some difficulty creating a media mix model that uses a negative binomial distribution to distribute the effect of marketing spend on a given day across subsequent days. The following code, with a double loop in the function `adstock_nbinom`, works as expected and produces good estimates, but it is very slow: for each day's spend it applies `neg_binomial_2_lpmf` to every following day up to `adstock_steps` days ahead in the time series.

``````stan
functions {
  vector adstock_nbinom(vector X, int n_timesteps, array[] int timesteps,
                        real mean_shift, real concentration) {
    vector[n_timesteps] adstocked_X = rep_vector(0, n_timesteps);
    int adstock_steps = 12; // This should capture >=98% of spend effect for mean_shift<=5
    for (t_spend in 1 : n_timesteps) {
      for (t_effect in t_spend : t_spend + adstock_steps) {
        if (t_effect > n_timesteps) break;
        adstocked_X[t_effect] += X[t_spend]
          * exp(neg_binomial_2_lpmf(t_effect - t_spend | mean_shift, concentration));
      }
    }
    return adstocked_X;
  }

  vector saturation_hill(vector X, int n_timesteps, real beta_hill, real kappa_hill) {
    vector[n_timesteps] saturation;
    for (t in 1 : n_timesteps) {
      saturation[t] = beta_hill * kappa_hill / (X[t] + kappa_hill);
    }
    return saturation;
  }
}

data {
  int n_timesteps;
  int n_channels;
  int n_covariates;
  matrix[n_timesteps, n_channels] X;
  matrix[n_timesteps, n_covariates] covariates;
  vector[n_timesteps] y;
  vector[n_channels] betas_hill_prior;
}

transformed data {
  array[n_timesteps] int timesteps = rep_array(0, n_timesteps);

  for (t in 2 : n_timesteps) {
    timesteps[t] = timesteps[t - 1] + 1;
  }
}

parameters {
  real<lower=0> intercept;
  real trend;
  real<lower=0> sigma;
  vector[n_covariates] betas_cov;
  vector<lower=0>[n_channels] mean_shift;
  vector<lower=0>[n_channels] concentration;
  vector<lower=0>[n_channels] betas_hill;
  vector<lower=0>[n_channels] kappas_hill;
}

transformed parameters {
  vector[n_timesteps] mu;
  vector[n_timesteps] comp_trend = rep_vector(0, n_timesteps);
  vector[n_timesteps] comp_intercept;
  vector[n_timesteps] comp_covariates = rep_vector(0, n_timesteps);
  vector[n_timesteps] comp_satstocked_X = rep_vector(0, n_timesteps);
  matrix[n_timesteps, n_channels] adstocked_X;
  matrix[n_timesteps, n_channels] saturation;

  // Intercept component
  comp_intercept = rep_vector(intercept, n_timesteps);

  // Trend and covariates components
  for (t in 1 : n_timesteps) {
    if (t == 1) {
      comp_trend[t] = trend;
    } else {
      comp_trend[t] = comp_trend[t - 1] + trend;
    }
    for (cov in 1 : n_covariates) {
      comp_covariates[t] += betas_cov[cov] * covariates[t, cov];
    }
  }

  for (c in 1 : n_channels) {
    // Adstock & saturation do not commute (order matters).
    // Apply the adstock transformation first when the media X in each time
    // period is relatively small compared to the cumulative X across multiple periods.
    adstocked_X[, c] = adstock_nbinom(X[, c], n_timesteps, timesteps,
                                      mean_shift[c], concentration[c]);
    saturation[, c] = saturation_hill(adstocked_X[, c], n_timesteps, betas_hill[c], kappas_hill[c]);
    comp_satstocked_X += saturation[, c] .* adstocked_X[, c];
  }

  // Combine all components
  mu = comp_intercept + comp_trend + comp_covariates + comp_satstocked_X;
}

model {
  // Priors
  intercept ~ normal(0, 1);
  trend ~ normal(0, 0.001); // Small spread since this accumulates over timesteps
  betas_cov ~ normal(0, 1); // Covariates contribution
  mean_shift ~ normal(1.5, 1); // Delay of adstocked spend
  concentration ~ normal(10, 5); // Concentration of adstocked spend
  betas_hill ~ normal(betas_hill_prior, 1); // Base-level efficiency of channel
  kappas_hill ~ normal(0, 1); // Half-saturation point (where channel efficiency starts to decline)
  sigma ~ exponential(1);

  // Likelihood
  y ~ normal(mu, sigma);
}

generated quantities {
  // Posterior predictive observations
  vector[n_timesteps] yhat;
  for (t in 1 : n_timesteps) {
    yhat[t] = normal_rng(mu[t], sigma);
  }
}
``````
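As a sanity check outside Stan, the double-loop transform can be transcribed into Python with scipy. Here `nbinom(n=phi, p=phi/(phi+mu))` is my assumed mapping for Stan's `neg_binomial_2(mu, phi)`, and the function name mirrors the Stan one:

```python
import numpy as np
from scipy.stats import nbinom

def adstock_nbinom(X, mean_shift, concentration, adstock_steps=12):
    """Transcription of the Stan double loop: each day's spend is spread
    over the next adstock_steps days, weighted by the NB PMF of the lag."""
    n_timesteps = len(X)
    # Assumed mapping: Stan neg_binomial_2(mu, phi) ~ scipy nbinom(n=phi, p=phi/(phi+mu))
    p = concentration / (concentration + mean_shift)
    adstocked = np.zeros(n_timesteps)
    for t_spend in range(n_timesteps):
        # Lags 0..adstock_steps, truncated at the end of the series
        for t_effect in range(t_spend, min(t_spend + adstock_steps + 1, n_timesteps)):
            adstocked[t_effect] += X[t_spend] * nbinom.pmf(t_effect - t_spend, concentration, p)
    return adstocked
```

Running this on the synthetic spend series and comparing against the Stan `transformed parameters` output is a quick way to confirm the transform behaves as intended.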

I’ve tried to take advantage of the vectorized nature of `neg_binomial_2_lpmf` by replacing the `adstock_nbinom` with the following function, but it displays some strange behavior, which tells me I’m probably doing something wrong. Specifically, the estimated parameters are quite different from the parameters used to generate the synthetic data I’m testing. Additionally, if I set `adstock_steps` > 3, the model fails to converge. `adstock_steps` controls the length of the integer array input to the `n` parameter for `neg_binomial_2_lpmf`.

``````stan
  vector adstock_nbinom(vector X, int n_timesteps, array[] int timesteps, real mean_shift) { //, real concentration) {
    vector[n_timesteps] adstocked_X = rep_vector(0, n_timesteps);
    int adstock_steps = 4; // why does this behave strangely at 4 but not 3?
    for (t_spend in 1 : n_timesteps) {
      if (t_spend <= n_timesteps - (adstock_steps - 1)) {
        adstocked_X[t_spend : t_spend + adstock_steps - 1] += X[t_spend]
          * exp(neg_binomial_2_lpmf(timesteps[ : adstock_steps] | mean_shift, 10));
      } else {
        adstocked_X[t_spend : n_timesteps] += X[t_spend]
          * exp(neg_binomial_2_lpmf(timesteps[ : n_timesteps - t_spend + 1] | mean_shift, 10));
      }
    }
    return adstocked_X;
  }
``````
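To pin down the symptom, here is a numpy transcription of what this vectorized call computes for a single unit of spend, using scipy's `nbinom` as a stand-in (I'm assuming Stan's `neg_binomial_2(mu, phi)` corresponds to `nbinom(n=phi, p=phi/(phi+mu))`):

```python
import numpy as np
from scipy.stats import nbinom

mu, phi = 1.5, 10.0   # mean_shift and the hard-coded concentration
p = phi / (phi + mu)  # assumed scipy parameterization of Stan's neg_binomial_2
lags = np.arange(4)   # timesteps[:adstock_steps] with adstock_steps = 4

# The vectorized lpmf call collapses the array into ONE real, so the
# same weight lands on every day of the slice:
weight = np.exp(nbinom.logpmf(lags, phi, p).sum())
vectorized_slice = 1.0 * weight * np.ones(4)

# The double-loop version assigns a distinct PMF weight to each lag:
loop_slice = 1.0 * nbinom.pmf(lags, phi, p)

print(vectorized_slice)  # four identical (and very small) values
print(loop_slice)        # four different per-lag weights
```

The two slices clearly disagree, which matches the parameter estimates drifting away from the values used to generate the synthetic data.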

I’ve attached the synthetic data used to train this model. It’s JSON, saved with a .txt extension since JSON is not a supported file type on this website.

Any help would be greatly appreciated. I’m fairly new to probabilistic programming so advice on the general approach here would also be helpful.

After more digging, it looks like `neg_binomial_2_lpmf` returns a single real regardless of the size of the `ints y` argument: when given an array, it returns the sum of the pointwise log probabilities rather than a vector of them. That makes sense for accumulating a log likelihood, but it means the `exp(...)` in my vectorized attempt was producing the product of the PMFs, one scalar smeared across the whole slice, instead of one weight per lag. The Stan docs are unfortunately not very explicit about this! The same applies to `poisson_lpmf`, which I tested as a substitute.
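Since the weights depend only on the lag, they can be computed once and reused for every spend day. A numpy sketch of that idea follows (the function name and the scipy `nbinom(n=phi, p=phi/(phi+mu))` mapping for `neg_binomial_2(mu, phi)` are my assumptions):

```python
import numpy as np
from scipy.stats import nbinom

def adstock_nbinom_fast(X, mean_shift, concentration, adstock_steps=12):
    """Equivalent to the nested loop: build the per-lag PMF weight vector
    once, then spread each day's spend with a single slice update."""
    n = len(X)
    p = concentration / (concentration + mean_shift)  # assumed scipy parameterization
    weights = nbinom.pmf(np.arange(adstock_steps + 1), concentration, p)
    adstocked = np.zeros(n)
    for t in range(n):
        k = min(adstock_steps + 1, n - t)  # truncate at the end of the series
        adstocked[t:t + k] += X[t] * weights[:k]
    return adstocked
```

The same idea carries over to Stan: exponentiate `neg_binomial_2_lpmf` once per lag into a `vector[adstock_steps + 1]` of weights, then reuse that vector inside a single loop over spend days.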

Anyway, here is a much faster version of the working function, which avoids the nested loop:

``````stan
functions {
  vector adstock_nbinom(vector X, int n_timesteps, real mean_shift, real concentration) {
    int adstock_steps = 12; // This should capture >=98% of spend effect for mean_shift<=5

    // Create the negative binomial PMF to represent the adstocked effect of spend