I am trying to fit an hierarchical regression, however some of the groups have no observations for certain IVs; so it translates to a ragged data set. I have followed the segmentation tutorial to apply segment to my X matrix, however have been having trouble applying the same to my parameters. I need to apply the segment/index logic to the following dot product below. (For context, certain local media tactics are not active in all of the dmas, they are only active in 7-10 dmas out of total 208)

```
...{ dot_product(cum_effects_hill[nn], beta[dma[nn]]) +
dot_product(X_ctrl[nn], gamma[dma[nn]])}...
```

```
model {
transformed parameters {
// a vector of the mean response
real mu[N];
// the cumulative media effect after adstock
real cum_effect;
// the cumulative media effect after adstock, and then Hill transformation
row_vector[num_media] cum_effects_hill[N];
row_vector[max_lag] lag_weights;
vector[num_ctrl] gamma[J];
vector[num_media] beta[J];
vector[J] theta;
for (nn in 1:N) {
for (media in 1 : num_media) {
for (lag in 1 : max_lag) {
lag_weights[lag] = pow(retain_rate[media], (lag - 1 - delay[media]) ^ 2);
}
cum_effect = Adstock(X_media[nn, media], lag_weights);
cum_effects_hill[nn, media] = Hill(cum_effect, ec[media], slope[media]);
}
mu[nn] = theta[dma[nn]] +
dot_product(cum_effects_hill[nn], beta[dma[nn]]) +
dot_product(X_ctrl[nn], gamma[dma[nn]]);
}
}
}
```

I am sorry but I don’t really understand the question - is the code you show BEFORE you tried to move to ragged data structures or AFTER or something in between? I also don’t know of any “segmentation tutorial” for Stan (and quick Googling didn’t help) - maybe providing a link would help us understand the context better. Finally showing the `data`

declarations might also be helpful to grasp what is happening.

Hope we can figure out a good way forward quickly :-)

Thank you for replying.

I was referring to the following

I have found a way around for a basic hierarchical regression, I just convert X_media and beta matrix to vectors, then continue with the dot product based on dma and variable/coefficient index. What I still have to figure out is how to apply the loop shown above to a ragged dataset

```
data {
// the total number of observations/rows
int<lower=1> K;
// rows with media data sizes
int s[K];
// rows with control data sizes
int p[K];
// the total length of media_vector; NAs removed
int<lower=1> V;
// the total length of control_vector; NAs removed
int<lower=1> C;
//the number of DMAs
int<lower=1> J;
//vector of media_dma indeces, for non_segment indexing
int<lower=1,upper=J> dma[K];
//vector of media_dma indeces
int<lower=1,upper=J> dmamedia[V];
//vector of control_dma indeces
int<lower=1,upper=J> dmacontrol[C];
// the vector of sales
real Y[K];
// the number of media channels
int<lower=1> num_media;
// the beta index
int<lower=1,upper=num_media> kk[V];
// the vector of media
row_vector[V] X_media;
// the number of other control variables
int<lower=1> num_ctrl;
// the controls index
int<lower=1,upper=num_media> jj[C];
// a vector of control variables
real<lower=0> X_ctrl[C];
}
parameters {
// residual variance
real<lower=0> noise_var;
// the population level intercept
real tau;
// the population coefficients for media variables
vector<lower=0>[num_media] beta_medias;
// popultaion level coffecients for other control variables
vector[num_ctrl] gamma_ctrl;
// group level coffecients for intercept
vector[J] theta;
// the group coefficients for media variables
vector[num_media] beta[J];
// the group coffecients for control
vector[num_ctrl] gamma[J];
// the standard deviation of intercept
real<lower=0>sigma_tau;
// the standard deviation of betas
real<lower=0>sigma_beta;
// sd of controll
real<lower=0>sigma_gamma;
}
transformed parameters {
row_vector[V] beta_vector;
row_vector[C] gamma_vector;
for(i in 1:V){
beta_vector[i] = beta[dmamedia[i],kk[i]];
}
for(i in 1:C){
gamma_vector[i] = gamma[dmacontol[i],jj[i]];
}
}
model{
int pos;
int pos_c;
vector[K] mu;
tau ~ normal(0,1);
sigma_tau ~ cauchy(0,2.5);
sigma_beta ~ cauchy(0,2.5);
sigma_gamma ~ cauchy(0,2.5);
for (j in 1:J){
theta[j] ~ normal(tau,sigma_tau);
}
beta_medias ~ normal(0,1);
for (j in 1:J){
beta[j] ~ normal(beta_medias,sigma_beta);
}
gamma_ctrl ~ normal(0,1);
for (j in 1:J) {
gamma[j] ~ normal(gamma_ctrl,sigma_gamma);
}
noise_var ~ normal(0, 0.05 * 0.01);
pos = 1;
pos_c = 1;
for(k in 1:K){
Y ~ normal(theta[dma[k]] + dot_product(segment(X_media, pos, s[k] ),segment(beta_vector, pos, s[k])),sqrt(noise_var) + dot_product(segment(X_ctrl, pos_c, p[k] ),segment(gamma_vector, pos_c, p[k])),sqrt(noise_var)); //compute the linear predictor using relevant group-level regression coefficients
pos = pos + s[k];
pos_c = pos_c + p[k];
}
}
```

Sorry for taking long to get back to you. Do I understand correctly that what you need is to have a `beta`

(and `gamma`

) coefficient for each value of `dmamedia`

. I.e. `dmamedia`

act as predictors. For any single `Y`

there is a single `dma`

and each `dma`

corresponds to a set of `dmamedia`

that are active?

If that is so, one way would be to just preprocess the data in R (or wherever you call your model from) to have a direct `matrix[V,K] dmamedia_expanded`

with 0-1 entries when the media is present / absent and then use it directly? (`real value = dot_product(dmamedia_expanded[,k], beta[dma[k]];`

(you could also build this in Stan, but that is IMHO more annoying).

Does that make sense? Or I am misunderstanding you?