Evidently an evergreen category: how to impose constraints on transformed parameters

pnprice · April 18, 2022, 6:49pm

All,
I’m working with data on electric load – electricity consumption – by month, in a huge refrigerated warehouse. (I will be using this model for data from many warehouses). I want to fit a model to historical data and use it to make forecasts for the future. Someday I may have useful explanatory variables too but for now it’s just a time series model. I started by just using a SARIMA model but was unhappy with it various reasons, so I decided to use Stan and write a simple model that captures some real-world phenomena.

I currently have a Stan state space model that I’m fairly happy with. The reason I’m not entirely happy is that I want to apply constraints on both the lower and upper predicted value. These could be hard constraints – the load can’t possibly go below zero (in warehouses that don’t have solar generation), and can’t go above whatever load they would attain if they ran all their equipment on full blast all day and night. (Actually I’d rather have constraints that are firm but not sharp like that: there’s a very broad range where the value could be reasonable, so I don’t want to always pull things towards some central value, but once you start to approach one of the edges I’d like to start pushing back. Perhaps there’s some sort of flat-topped, curvy-edged plateau function I could apply, I dunno. But for now hard constraints would be fine).

The key features of the model are: (1) Some months have higher or lower load than other months, year after year. In some month they use “blast freezers” hour after hour because that’s when the strawberry harvest happens; then there are a couple months with lower consumption, and then the squid arrive in the area and suddenly they’re freezing squid hour after hour. (2) There are also multi-month excursions upwards or downwards: something happens in the economy so people are consuming more (or fewer) frozen meals, so the warehouse does more (or less) business, has the loading doors open more or less, etc. (3) Everything is constantly changing with time. In general terms – which months are the peaks and valleys – this year will probably look a lot like last year, but the details are different every year.

Before calling the function I standardize the data (subtract the mean, divide by the standard deviation).

As you can see in the code, the model has a ‘month effect’ and an effect that is local in time, and the prediction is the sum of these: ypred[t] = m[t] + v[t]. I assume m[t] is drawn from a distribution whose mean is the weighted mean of the month effects from one year ago and two years ago, with the weighting determined empirically (the ‘alpha’ and ‘beta’ parameters).

As I mentioned above, in this model I’m working with standardized data, so a hard constraint at 0 would turn into y_min = -(mean(y))/std(y), where y is the vector of historical data.

Here’s some R code to generate test data and run the model, and below is the Stan model itself (which should go into a file called ‘ts_model_scaled.stan’. I’d be happy to take advice on improving my model in any way at all, but the reason I’m here is to learn how I can apply constraints to ypred. Thanks very much for any help you can offer!

n_test = 36
y_test = 1000 + 500*cos((1:n_test)*pi/6) + rnorm(n = n_test, mean = 0, sd = 200)
mn = mean(y_test)
std = sd(y_test)
y_scaled = (y_test - mn)/std

#y_scaled_fake = rep(y_scaled[1:12], 5)
#y_scaled = y_scaled_fake + rnorm(n=length(y_scaled_fake), mean = 0, sd = 0.2) + (1:length(y_scaled_fake)) * 0.01

seasonality = 12
nT = length(y_scaled)
nP = nT + 2*seasonality # our choice: how far into the future do we want to forecast?
change_scale_local = sd(diff(y_scaled,1)) + 0.1
change_scale_seasonal = sd(diff(y_scaled,12)) + 0.1

input_list = list(s = seasonality, change_scale_local = change_scale_local,
                  change_scale_seasonal = change_scale_seasonal, 
                  nT = nT, nP = nP, 
                  component_max = 2,
                   y = y_scaled )

fit <- stan(file = 'ts_model_scaled.stan', data = input_list)


summary(fit)

vals = extract(fit)

ypred_mean = apply(vals$ypred, 2, mean)
ypred_min = apply(vals$ypred,2, min)

plot(vals$ypred[1,]*std + mn, type ='l', xlab = 'month', ylab = 'kWh', ylim = c(0,4500))
for (j in 2:200) {
  lines(vals$ypred[j,]*std + mn)
}
lines(ypred_mean*std + mn, col = 'red', lwd = 2)
lines(y_scaled*std + mn, col = 'blue', lwd = 4)

Stan code follows:

// State state model: 
// ypred =  m[t-s] + v[t]
// (In my application: month effect + 'local' time effect)
//
data{
  int s; // seasonality 
	int<lower=0> nT; // number of load data points
	int<lower=0> nP; // number of points to predict & forecast, total.
	real<lower = 0> change_scale_seasonal; // expected scale of seasonal changes
	real<lower = 0> change_scale_local; // expected scale of local changes
	real component_max;
	real y[nT]; // load data
}

parameters {
  vector<lower = -component_max, upper = component_max> [nP] m; // month effect
  vector<lower = -component_max, upper = component_max> [nP] v; // local effect
  real<lower = -1, upper = 1> alpha_local; // autoregressive multiplier for local changes.
  real<lower = -1, upper = 1> beta_local; // autoregressive multiplier for local changes.
  real<lower = -1, upper = 1> alpha_seas; // autoregressive multiplier for local changes.
  real<lower = -1, upper = 1> beta_seas; // autoregressive multiplier for local changes.
  real<lower = 0> sigma_level; // std of changes in local terms
  real<lower = 0> sigma_seas; // standard deviation of changes in seasonal terms
  real<lower = 0> sigma; // standard deviation of errors
}

transformed parameters{
  vector [nP] ypred;
  ypred = m + v;
}

model {
  for (t in 3:nP)
    v[t] ~ normal(alpha_local*(v[t-1] + beta_local*v[t-2]), sigma_level );
    
  for (t in (2*s+1):nP)
    m[t] ~ normal(alpha_seas*(m[t-s] + beta_seas*m[t-2*s]), sigma_seas );
  
  for (t in 1:nT)  
    y[t] ~ normal(ypred[t], sigma);
 
  alpha_local ~ normal(0, 0.5);
  alpha_seas ~  normal(0, 0.5);
  beta_local ~ normal(0,0.5);
  beta_seas ~ normal(0,0.5);
  sigma_level ~ normal(0,change_scale_local);
  sigma_seas ~ normal(0,change_scale_seasonal);
  sigma ~ normal(0, change_scale_local/12);
}

andrjohns · April 19, 2022, 3:31am

For constraining transformed parameters, you have to ‘work backwards’ and place the constraints on the parameters being transformed. This can be tricky if you have multiple parameters and complex transformations, but luckily you have a simple linear combination (m + v). For this, you just need to express the constraint for one parameter in terms of the other.

So the constraint you are after is:

LB < (m+v) < UB

Subtracting m from both sides gives the constraint:

(LB - m) < v < (UB-m)

So in your Stan code, you would replace the current constraints for v (as they will be implicitly fulfilled by m) with:

  vector<lower = (LB-m), upper = (UB-m)> [nP] v; // local effect

Where LB and UB are the desired bounds for ypred

pnprice · April 19, 2022, 5:04am

Ah, it somehow had escaped me that the limits in the parameter block can be dynamic! Thanks, this is perfect.

jsocolar · April 19, 2022, 2:51pm

FWIW, the other option here is to go with something more like the GLM approach, which constrains the expectation of the response using an inverse-link function. You could use, for example, a logistic inverse link between your desired bounds. This fundamentally reshapes the model, because it assumes that the autoregressive process is linear on the link scale rather than the identity scale. Up to you to decide whether this is a feature or a bug. One advantage is that you won’t run into weirdness if the terms alpha_local, beta_local, alpha_seas, beta_seas, etc ever come into strong conflict with the constraint on ypred.

Note also that imposing bounds on ypred does not impose bounds on the response y. For that, you might want to consider distributions other than normal. As is, y will be soft-constrained near your bounds depending on the estimated value of sigma.

pnprice · April 20, 2022, 4:47pm

I like the idea of firm constraints rather than hard constraints, at least at the upper end (lower end can still be a hard constraint: if there’s no generation on site then there’s no way to attain anything lower than zero). The approach you suggest does seem to provide a way to do that, at the price of a bit of modeling pain. This gets into the (very broad) range of territory that represents topics I find interesting but that I can’t really justify charging the client for. I should do more of that sort of thing in my spare time, but somehow I don’t. Anyway thanks for the suggestion, I’ll keep it in mind and will explore it if I need it.

pnprice · April 20, 2022, 5:04pm

This is somewhat embarrassing, something I’m sure should be immediately obvious to me. The line

vector<lower = (lower_bound - month_effect), upper = (upper_bound - month_effect)> [nP] mu;
generates an error when I try to compile in Stan (V. 2.21.0):

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
Expression denoting real required; found type=vector.
 error in 'model1864c2f80751b_ts_state_space' at line 13, column 16
  -------------------------------------------------
    11: parameters {
    12:   vector[nP] month_effect; 
    13:   vector<lower = (lower_bound - month_effect), upper = (upper_bound - month_effect)> [nP] mu;
                       ^
    14:   vector[nP] v;
  -------------------------------------------------

I figured the problem is that lower_bound is a real whereas month_effect is a vector, so I tried creating a vector of lower_bound, but that didn’t work either.

You’re going to point out something obvious and I’m going to feel like a fool. That’s fine, I’m ready, bring it on.

This is not a blocker for me – it occurred to me, duh, I can simply discard realizations that I don’t like, post facto, that’ll be fine for my purposes – but I’d still like to know the answer. Thanks much!

andrjohns · April 20, 2022, 5:19pm

Ah the version of Stan in the current rstan doesn’t have support for vector bounds. You can either install the preview of the next rstan version using:

remove.packages(c("StanHeaders", "rstan"))
install.packages("StanHeaders", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))
install.packages("rstan", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))

Or you can use the cmdstanr package instead: Getting started with CmdStanR • cmdstanr

pnprice · April 21, 2022, 5:15am

Aha! Well, I don’t feel as foolish as I expected to! A little foolish, yes, but not terribly so. Thanks!

Topic		Replies	Views
Unit norm constraints and ANN Modeling	1	751	November 9, 2020
How to do constrained optimization of the joint posterior Modeling	7	651	December 5, 2020
Constraints on a transformation of multiple parameters Modeling	21	1340	November 3, 2020
Constrained latent non-centered time series Modeling	6	1134	May 14, 2018
Imposing constraint in transformed parameters block Modeling	4	3025	May 1, 2018

Evidently an evergreen category: how to impose constraints on transformed parameters

Related topics