Fitting a mixture model in which a predictive variable is itself the result of a mixture model

This is partly a Stan question and partly a general modeling question.

I’m working with data on time-of-use wholesale electricity prices as well as hedge prices. For now I’m looking at month-average prices. “Time-of-use” prices, also called “spot prices”, are what you pay if you buy as you go. An alternative is to buy a “hedge”: you can buy a set number of kilowatt-hours a day in advance, or a month in advance, or a year in advance, or whatever, at a price that is set by a trading market.

Month to month the mean spot price goes up and down. There’s a seasonal pattern but also a lot of bumpiness. There can be an expensive month or an expensive summer, or a cheap spring, or whatever. Of course there are also long-term trends and short-term trends and so on.

Let me take Texas as an example, and just consider hedges bought six months ahead. If you wanted to buy electricity six months in advance, for electricity to be consumed in summer 2016, you ended up paying way more than if you had bought on the spot market (by about a factor of two). Same for 2017. And in 2018 you paid waaaay more: the price for a six-month-ahead hedge was $175 per MWh but the mean spot price for the month ended up being only about $40. So, OK, why would anyone ever buy electricity six months ahead, if it’s always so much more expensive? Well, in summer 2019 there was a huge spike in electricity prices, which went up to $200, far higher than had ever been seen before. The six-month-ahead price for summer 2019 was the same as for summer 2018, about $175, so if you had bought the hedge you came out ahead. Basically: “the market” realized that a hot summer with less-than-normal wind in Texas could lead to a big jump in electricity prices, and that was priced into the hedge price every summer…and then one summer it actually happened. You could say the market saw it coming.

There are also events that “the market” doesn’t see coming. Electricity prices in winter in Texas were always quite low, about $20 per MWh, and “the market” didn’t think that was likely to change…and then there was a huge ice storm in February 2021 and the electricity price went all the way to the regulatory limit of $9000 per MWh for a short time, and was high for two weeks. The monthly average ended up being almost $2000, something like 90x normal. (My client ran through their entire year’s energy budget in two weeks, in spite of taking energy-saving steps.)

I am working on a statistical model of prices. I’ve considered various long-tailed models and have settled, for now, on a mixture model. Let me start by describing it without any exogenous variables. It would just be a time series model: in a ‘normal’ month (a ‘component 1’ month) the expected price is predicted with a time series model that has monthly effects, a trend, and some ‘noise’; the monthly effects can change from year to year but have an expected mean of zero; etc. Very standard. But in a ‘component 2’ month the price has some big additional component added on, where that component is drawn from a wide distribution with a high mean value. ‘Component 2’ months are rare, with any month having something like a 5% chance of being a component 2 month (or maybe 1%, or maybe 8%). If you only have a few years of data, you might not have any component 2 months.
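As a quick sanity check on that last point, here is a minimal Python sketch of the chance of seeing no component 2 months at all in a short history. It assumes, purely for illustration, that months are independent and share one constant probability theta; the function name and its arguments are made up for this example.

```python
def prob_no_component2_months(theta: float, n_months: int) -> float:
    """Chance that none of n_months is a component 2 month.

    Assumes (for illustration only) independent months that all share
    the same component 2 probability theta.
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    if n_months < 0:
        raise ValueError("n_months must be non-negative")
    return (1.0 - theta) ** n_months


if __name__ == "__main__":
    # Candidate per-month probabilities mentioned in the text: 1%, 5%, 8%.
    for theta in (0.01, 0.05, 0.08):
        for years in (3, 5, 10):
            p = prob_no_component2_months(theta, 12 * years)
            print(f"theta={theta:.2f}, {years:2d} years: "
                  f"P(no component 2 month) = {p:.3f}")
```

With theta = 0.05 and three years of monthly data this comes out to roughly 0.16, so a short history with no crazy months at all would not be surprising.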

I know how to fit the model above in Stan, or at least I think I do, though not because I have a deep understanding of how to do it: I simply took the code at the bottom of the mixture model page in the Stan manual and added the terms that I need. For example, if I just want to include a month effect I have the line
lps[k] += normal_lpdf(y[n] | mu[k] + month_effect[n], sigma[k]);
where the original model didn’t have the month_effect term; and of course I set priors on the month effects.
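For readers who want to see what that line contributes numerically, here is a plain-Python sketch of the marginalized mixture log-likelihood, following the pattern of starting from log(theta), adding each component's log density, and combining with a log-sum-exp. It is not the Stan program itself: the helper names are made up, and month_effect is assumed to be already looked up per observation.

```python
import math


def normal_lpdf(y, mu, sigma):
    """Log density of Normal(mu, sigma) at y; sigma is a standard deviation."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-scale values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def mixture_log_lik(y, month_effect, theta, mu, sigma):
    """Marginal log-likelihood of a K-component normal mixture.

    y[n]            observed monthly mean price
    month_effect[n] additive month effect for observation n (assumed given)
    theta[k]        mixing proportions, summing to 1
    mu[k], sigma[k] component locations and scales
    """
    total = 0.0
    for n in range(len(y)):
        # Start each component at log(theta[k]); a zero weight maps to -inf.
        lps = [math.log(t) if t > 0.0 else -math.inf for t in theta]
        for k in range(len(theta)):
            # Counterpart of the Stan line: lps[k] += normal_lpdf(...).
            lps[k] += normal_lpdf(y[n], mu[k] + month_effect[n], sigma[k])
        # Summing over components removes the discrete "which component" label.
        total += log_sum_exp(lps)
    return total
```

Working on the log scale matters here because a spike month can sit dozens of standard deviations from the component 1 mean, where the raw density underflows to zero.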

So far so good, but: I want to include the six-month-ahead hedge price as a predictive variable for the spot price. After all, this is what ‘the market’ has concluded is a fair price for buying electricity in advance. Perhaps it can be thought of as a prediction of the arithmetic mean of the spot price distribution in six months (plus a premium of unknown size). It includes the knowledge that in some months the distribution is much wider than in other months, i.e. that the month might turn out to be a ‘component 2’ month.

So I don’t want to say something like

lps[k] += normal_lpdf(y[n] | mu[k] + month_effect[n] + alpha*hedge_price[n], sigma[k]); 

or at least I don’t think I do, because hedge_price[n] is neither a forecast of the ‘component 1’ price nor a forecast of the ‘component 2’ price; instead it is something closer to the weighted arithmetic mean of the two.

Hmm. OK, on the one hand I’m realizing that I can probably make more progress myself before coming here for help. On the other hand I’m realizing the depth of my confusion about what it means to “sum out the responsibility parameter.”

The hedge price for a given month reflects both the market’s belief in what the price will be if the month is a normal (component 1) month, and also the probability that the month will be a crazy (component 2) month. The probability changes from month to month. When a month has a really high hedge price, that’s almost certainly due to the market thinking that the probability is relatively high that it will be a crazy month.

Let’s imagine that there are only two kinds of months: (a) months with a low six-month-ahead hedge price, which also turn out, six months later, to have a low spot price; and (b) months with a high six-month-ahead hedge price, which may turn out to have a low mean spot price or a high mean spot price.

I want to fit a model to historical data that has those characteristics, and then use it to make a forecast for a month six months from now, for which I have the six-month-ahead hedge price. If the hedge price is low then I can use the relationship between spot price and hedge price from just the months described in (a), and if the hedge price is high then…hmm, I’m not sure.

OK, I don’t have a question here, I guess I’m not ready for this forum. Posting this anyway just to establish the thread. I will be back with more. I guarantee that whatever I come up with as a model, I will need help with coding it in Stan. It is confusing to me to not be able to use a latent variable for whether a past or future month is a component 1 month.
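On the latent-variable point: Stan does not sample discrete parameters, which is why the component indicator gets summed out, but the probability that a given past month was a component 2 month can still be recovered afterwards from the same per-component terms. The Python sketch below shows that calculation for one month at fixed parameter values; the function names are hypothetical. In Stan the analogous step would usually live in generated quantities, for example by applying softmax to the lps vector, though that placement is a suggestion rather than something taken from the model above.

```python
import math


def normal_lpdf(y, mu, sigma):
    """Log density of Normal(mu, sigma) at y; sigma is a standard deviation."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def component_probabilities(y_n, month_effect_n, theta, mu, sigma):
    """Posterior probability of each component for one observed month.

    Conditional on the parameter values passed in; every theta[k] must be > 0.
    """
    lps = [
        math.log(t) + normal_lpdf(y_n, m + month_effect_n, s)
        for t, m, s in zip(theta, mu, sigma)
    ]
    top = max(lps)
    # Subtract the largest term before exponentiating to avoid underflow.
    weights = [math.exp(lp - top) for lp in lps]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Made-up parameter values, chosen only to illustrate the calculation.
    theta, mu, sigma = [0.95, 0.05], [25.0, 200.0], [5.0, 80.0]
    for price in (26.0, 60.0, 400.0):
        probs = component_probabilities(price, 0.0, theta, mu, sigma)
        print(f"price {price:6.1f}: P(component 2) = {probs[1]:.4f}")
```

For a future month there is no observed price yet, so the corresponding probability is just the mixing proportion itself.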

Take this with a grain of salt, as I don’t have much experience with mixture models, but I am wondering if you want hedge_price to predict the mixing proportion theta in addition to the component 1 spot price. I’ve never programmed a mixture model in Stan, but I am thinking something like this in brms:

mix <- mixture(gaussian, gaussian)
fit1 <- brm(bf(y ~ 1, mu1 ~ month_effect + hedge_price, mu2 ~ month_effect, theta2 ~ hedge_price), 
data=dat, family = mix, prior = prior)

Code taken and modified from the brms manual, but again, not much experience with this myself (see mixture() in the manual).
Just thought I would throw out that idea after reading your post.

Yes, I do indeed want hedge_price to help predict the mixing proportion! In fact, with the (non-Bayesian) model I am currently using, I make a price forecast that doesn’t use the hedge price (and that I interpret as the expected price for a non-crazy month), and then assume that the hedge price is a weighted average of that forecast and some large value that represents the expected price for a crazy month. Thus, the more the hedge price exceeds the forecast price, the higher the probability of a crazy month.
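The weighted-average assumption can be inverted directly: if hedge = (1 - p) * forecast + p * crazy, then p = (hedge - forecast) / (crazy - forecast). Here is a minimal Python sketch of that arithmetic. The function name, the optional premium argument, and the clipping to [0, 1] are assumptions added for illustration, not a description of the actual model in use.

```python
def implied_crazy_probability(hedge_price, normal_forecast, crazy_price,
                              premium=0.0):
    """Probability of a crazy month implied by a weighted-average hedge price.

    Solves hedge_price - premium = (1 - p) * normal_forecast + p * crazy_price
    for p, then clips the result to [0, 1].
    """
    if crazy_price <= normal_forecast:
        raise ValueError("crazy_price must exceed normal_forecast")
    p = (hedge_price - premium - normal_forecast) / (crazy_price - normal_forecast)
    # A hedge below the forecast, or above the crazy price, falls outside
    # what this simple two-point picture can represent, so clip.
    return min(1.0, max(0.0, p))


if __name__ == "__main__":
    # Made-up numbers: forecast 40, crazy-month price 400, several hedges.
    for hedge in (35.0, 58.0, 76.0, 175.0):
        p = implied_crazy_probability(hedge, 40.0, 400.0)
        print(f"hedge {hedge:6.1f} -> implied probability {p:.3f}")
```

The result is sensitive to the assumed crazy-month price and to any premium, both of which are unknown, which is part of the motivation for fitting them rather than fixing them.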

So, yes, conceptually very similar to what you are suggesting here. But I think this is leaving information on the table (both your suggested approach, and my current approach): yes, the hedge price informs the mixing parameter, but it also should inform the expected price in a non-crazy month. Basically there are two reasons the hedge price can be high: (1) the whole statistical distribution of possible prices shifts upwards, and (2) the market thinks there is a high probability of a crazy month. These are not mutually exclusive. For instance, right now hedges for next year are expensive, because (1) natural gas prices have gone way up due to the war in Ukraine constraining supplies, and (2) the market thinks there’s a higher-than-average chance that things will get a lot worse. So both phenomena are in play with next year’s hedge prices.

Anyway thank you very much for your thoughts and for the sample code. I do think it’s the right general idea.

Hmm, so I was thinking the brms code I suggested was doing just that.

Say mu1 is the mean for the price in a usual month and mu2 for a crazy month.
So, mu1 ~ month_effect + hedge_price models the usual month using time and the hedge price; that is, the hedge price informs the expected price in a normal month. mu2 ~ month_effect models the price in a crazy month using time only. Based on what you have said, though, this would likely be better as mu2 ~ hedge_price. The mixing proportion is also predicted from the hedge price, via theta2 ~ hedge_price. So I think in this model hedge_price informs both the expected price in a normal month and the mixing proportion.
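To make the two roles of hedge_price concrete, here is a plain-Python sketch of the linear predictors in such a model. It assumes a logistic link for the component 2 proportion and linear effects throughout; all coefficient names and values are hypothetical, and this is not a transcription of what brms generates.

```python
import math


def inv_logit(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def mixture_summary(hedge_price, month_effect, coef):
    """Component means, mixing proportion and overall mean for one month.

    coef is a dict of hypothetical regression coefficients:
      b1_0, b1_hedge : intercept and hedge slope for the usual-month mean
      b2_0, b2_hedge : intercept and hedge slope for the crazy-month mean
      t_0,  t_hedge  : intercept and hedge slope for logit(theta2)
    """
    mu1 = coef["b1_0"] + month_effect + coef["b1_hedge"] * hedge_price
    mu2 = coef["b2_0"] + coef["b2_hedge"] * hedge_price
    theta2 = inv_logit(coef["t_0"] + coef["t_hedge"] * hedge_price)
    # Overall expected price: weighted arithmetic mean of the two components.
    mean = (1.0 - theta2) * mu1 + theta2 * mu2
    return {"mu1": mu1, "mu2": mu2, "theta2": theta2, "mean": mean}


if __name__ == "__main__":
    # Made-up coefficients, for illustration only.
    coef = {"b1_0": 10.0, "b1_hedge": 0.3, "b2_0": 150.0, "b2_hedge": 1.0,
            "t_0": -5.0, "t_hedge": 0.02}
    for hedge in (30.0, 175.0):
        print(hedge, mixture_summary(hedge, 0.0, coef))
```

A higher hedge price then moves the usual-month mean and the probability of a crazy month at the same time, which is the "both phenomena in play" situation described earlier in the thread.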
So, working off of what I suggested, I might think of something more specifically like:

mix <- mixture(gaussian, gaussian)
fit1 <- brm(bf(y ~ 1, mu1 ~ s(month_name, bs="cc") + s(month) + s(hedge_price), mu2 ~ s(hedge_price), theta2 ~ s(hedge_price) + s(month_name, bs="cc")),
data=dat, family = mix, prior = prior)

Where month_name is the calendar month (Jan, Feb, etc.; as far as I know it needs to be coded as a number from 1 to 12, since s() expects a numeric covariate) and month is just a running time index, 1:N for N months. If I understand the brms syntax and mixture models correctly, then this would use a cyclic cubic spline for the seasonal effect and penalized thin plate splines for time and for hedge price to predict the mean price in a usual month. The mean price in a crazy month would be predicted by a penalized thin plate spline for the hedge price. The mixing proportion would be predicted by the cyclic cubic spline for the seasonal effect and a spline for hedge price. Thus, the hedge price would inform the price in a usual (non-crazy) month, the price in a crazy month, and the mixing proportion. The splines would make sense to me if the effect of the hedge price on both the usual price and the mixing proportion isn’t linear as the hedge price increases. Using multiple predictors for the “month effect” would also make sense to me, as you could see seasonal trends and long-term trends separately (I assume you have likely already done this and month_effect was just a placeholder for all that jazz).
Assuming this would converge lol…that’s another issue.

I guess you could do better with some non-linear multivariate mixture model if you knew more of a generative relationship between the hedge price and the normal price.