Imagine we have two series y1 and y2, for example revenue and profit (cost is unknown), which are related. We want to model/predict both using some covariates X.

First I model them independently and it turns out the residuals from both models are correlated.

I then model (y1, y2) ~ multi_normal(mu, Sigma), where mu is a vector and Sigma is a 2x2 covariance matrix. I find that the off-diagonal element is not 0. How can I use this “better” model to make smarter inferences or predictions?

If I just predict using mu, I miss the additional information in Sigma.

I guess what I am asking is how can I predict (y1, y2) better given that I know they are related?

Sounds like a structural equation model should be your go-to:

```
data {
  int<lower=1> num_obs;
  matrix[num_obs, 2] obs; // priors assume this has been scaled
}
parameters {
  vector[2] means;
  vector<lower=0>[2] sds;
  vector<lower=0, upper=1>[2] weights; // loading of each series on the common factor
  vector[num_obs] common;              // latent factor shared by both series
  matrix[num_obs, 2] unique;           // series-specific latent components
}
model {
  // Priors
  means ~ std_normal();
  sds ~ weibull(2, 1); // mode near 0.7
  common ~ std_normal();
  to_vector(unique) ~ std_normal();
  // absence of a prior on weights implies uniform(0,1)
  // Likelihoods
  for (i in 1:2) {
    obs[, i] ~ normal(
      means[i] + common * weights[i] + unique[, i] * (1 - weights[i]),
      sds[i]
    );
  }
}
```

Technically that model is slightly over-identified in that both weights don’t really need to be positive-constrained; one could be permitted to range from `-1` to `+1`, but if you already know they’re correlated decently, there’s not much harm to the lazy way I coded the above. Maybe double-check there isn’t much mass near zero for either though.

Thanks Mike, interesting idea. I read it as a kind of “mixture” model where we have a latent factor `common` that both series rely on. This might be tricky to use out of sample.

Also my mu is a function of X.

I will read a bit more about SEM.

Model the influence of your covariates on the common latent factor, that’ll then let you do out-of-sample prediction.

To extend the question a bit, maybe what I need is a conditional prediction/expectation

E(y1|pred(y2), X, Sigma)
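For reference, this is the standard conditional-normal result for the bivariate case, assuming (y1, y2) given X really is multivariate normal with mean mu(X) and covariance Sigma. The subscripted symbols are notation introduced here, not taken from the posts above. Note that it only adds information when y2 is actually observed: plugging in the model's own point prediction mu_2(X) for y2 makes the correction term vanish and returns mu_1(X).

```latex
\begin{aligned}
\mathrm{E}[y_1 \mid y_2, X] &= \mu_1(X) + \frac{\Sigma_{12}}{\Sigma_{22}}\,\bigl(y_2 - \mu_2(X)\bigr), \\
\mathrm{Var}[y_1 \mid y_2, X] &= \Sigma_{11} - \frac{\Sigma_{12}^{2}}{\Sigma_{22}}.
\end{aligned}
```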

I think this might be my answer
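A minimal numerical sketch of that conditional prediction, assuming the bivariate normal model holds and that y2 has been observed for the case being predicted. The function name `conditional_normal` and all numbers are made up for illustration; in practice `mu` would come from the fitted regression on X and `Sigma` from the fitted covariance (for example, evaluated per posterior draw).

```python
import numpy as np

def conditional_normal(mu, Sigma, y2):
    """Mean and variance of y1 given an observed y2 under a bivariate normal.

    mu    : length-2 array, (mu1, mu2) for this observation
    Sigma : 2x2 covariance matrix of (y1, y2)
    y2    : observed value of the second series
    """
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    # Regression slope of y1 on y2 implied by the covariance matrix
    slope = Sigma[0, 1] / Sigma[1, 1]
    # Shift the marginal mean by how far y2 is from its own mean
    cond_mean = mu[0] + slope * (y2 - mu[1])
    # Conditioning on y2 can only shrink the variance of y1
    cond_var = Sigma[0, 0] - Sigma[0, 1] ** 2 / Sigma[1, 1]
    return cond_mean, cond_var

# Hypothetical values, for illustration only
mu = np.array([10.0, 3.0])
Sigma = np.array([[4.0, 1.5],
                  [1.5, 1.0]])

cond_mean, cond_var = conditional_normal(mu, Sigma, y2=4.0)
print(cond_mean, cond_var)  # 11.5 1.75
```

With a zero off-diagonal element the slope is 0 and the prediction falls back to mu1, which is why the independent models lose nothing on point predictions unless one of the two series is known at prediction time.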