Predict two series which are not independent

Dirk_Nachbar · March 29, 2021, 1:06pm

Imagine we have two series y1 and y2, for example revenue and profit (cost is unknown), which are related. We want to model/predict both using some covariates X.

First I model them independently and it turns out the residuals from both models are correlated.

I then model (y1, y2) ~ multi_normal(mu, Sigma), where mu is a vector and Sigma is a 2x2 covariance matrix. I find that the off-diagonal element is not 0. How can I use this “better” model to make smarter inferences or predictions?

If I just predict using mu, I miss the additional information in Sigma.

I guess what I am asking is how can I predict (y1, y2) better given that I know they are related?

mike-lawrence · March 29, 2021, 5:12pm

Sounds like a structural equation model should be your go-to:

data{
	int<lower=1> num_obs ;
	matrix[num_obs,2] obs ; //priors assume this has been scaled
}
parameters{
	vector[2] means ;
	vector<lower=0>[2] sds ;
	vector<lower=0,upper=1>[2] weights ;
	vector[num_obs] common ;
	matrix[num_obs,2] unique ;
}
model{
	// Priors
	means ~ std_normal() ;
	sds ~ weibull(2,1) ; //peaked around .8
	common ~ std_normal() ;
	to_vector(unique) ~ std_normal() ;
	// absence of a prior on weights implies uniform(0,1) 
	// Likelihoods
	for(i in 1:2){
		// each outcome mixes the shared latent factor with its own unique factor
		obs[,i] ~ normal(
			means[i] + common*weights[i] + unique[,i]*(1-weights[i])
			, sds[i]
		);
	}
}

Technically that model is slightly over-identified in that both weights don’t really need to be positive-constrained; one could be permitted to range -1 to +1, but if you already know they’re correlated decently, there’s not much harm to the lazy way I coded the above. Maybe double-check there isn’t much mass near zero for either though.

Dirk_Nachbar · March 30, 2021, 9:51am

Thanks Mike, interesting idea, I read it as kind of a “mixture” model where we have a latent factor common that both series rely on. This might be tricky to use out of sample.

Also my mu is a function of X.

I will read a bit more about SEM.

mike-lawrence · March 30, 2021, 11:39am

Model the influence of your covariates on the common latent factor, that’ll then let you do out-of-sample prediction.
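A minimal sketch of what that could look like, assuming the covariates enter linearly through the latent factor. The names num_cov, X, beta, common_raw, num_new, X_new and obs_new are hypothetical and not from the model above; treat this as an illustration under those assumptions rather than a tested implementation.

```stan
data{
	int<lower=1> num_obs ;
	int<lower=1> num_cov ; // hypothetical: number of covariates
	matrix[num_obs,num_cov] X ; // hypothetical: covariates, assumed scaled
	matrix[num_obs,2] obs ; // priors assume this has been scaled
	int<lower=1> num_new ; // hypothetical: number of out-of-sample rows
	matrix[num_new,num_cov] X_new ; // hypothetical: covariates for those rows
}
parameters{
	vector[2] means ;
	vector<lower=0>[2] sds ;
	vector<lower=0,upper=1>[2] weights ;
	vector[num_cov] beta ; // assumed linear effect of X on the common factor
	vector[num_obs] common_raw ; // part of the common factor not explained by X
	matrix[num_obs,2] unique ;
}
transformed parameters{
	// common factor = covariate-driven part + standard-normal residual
	vector[num_obs] common = X*beta + common_raw ;
}
model{
	// Priors
	means ~ std_normal() ;
	sds ~ weibull(2,1) ;
	beta ~ std_normal() ;
	common_raw ~ std_normal() ;
	to_vector(unique) ~ std_normal() ;
	// Likelihoods
	for(i in 1:2){
		obs[,i] ~ normal(
			means[i] + common*weights[i] + unique[,i]*(1-weights[i])
			, sds[i]
		);
	}
}
generated quantities{
	// joint out-of-sample draws: both outcomes share common_new,
	// so each draw of (y1, y2) keeps their dependence
	matrix[num_new,2] obs_new ;
	for(n in 1:num_new){
		real common_new = X_new[n]*beta + normal_rng(0,1) ;
		for(i in 1:2){
			obs_new[n,i] = normal_rng(
				means[i] + common_new*weights[i] + normal_rng(0,1)*(1-weights[i])
				, sds[i]
			) ;
		}
	}
}
```

Because each row of obs_new is drawn with a single common_new, summaries that involve both outcomes (for example their sum or ratio) would reflect the correlation instead of treating the two series as independent.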

Dirk_Nachbar · March 31, 2021, 12:44pm

To extend the question a bit, maybe what I need is a conditional prediction/expectation

E(y1|pred(y2), X, Sigma)
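For reference, under the multi_normal model from the first post this conditional expectation has a standard closed form. The symbols below are notation introduced here for illustration: mu_1(X) and mu_2(X) are the two components of mu, and sigma_11, sigma_12, sigma_22 are the entries of Sigma.

```latex
\begin{aligned}
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
  &\sim \mathcal{N}\!\left(
    \begin{pmatrix} \mu_1(X) \\ \mu_2(X) \end{pmatrix},
    \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix}
  \right) \\[4pt]
\mathrm{E}(y_1 \mid y_2, X, \Sigma)
  &= \mu_1(X) + \frac{\sigma_{12}}{\sigma_{22}}\,\bigl(y_2 - \mu_2(X)\bigr) \\[4pt]
\mathrm{Var}(y_1 \mid y_2, X, \Sigma)
  &= \sigma_{11} - \frac{\sigma_{12}^2}{\sigma_{22}}
\end{aligned}
```

Note that plugging in the model's own prediction of y2, that is y2 = mu_2(X), makes the correction term vanish and returns mu_1(X). The off-diagonal element only sharpens the prediction of y1 when y2 is actually observed (or known more precisely than the model predicts it); otherwise its use is in getting the joint uncertainty of (y1, y2) right.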

Dirk_Nachbar · March 31, 2021, 2:01pm

I think this might be my answer
