Unit norm constraints and ANN

Hi guys,

I have successfully been able to run an ANN in Stan. Some background: I want to be able to include additive time series parameters in an ANN, because in the end I want to do some structural time series modelling combined with an ANN. Both have their own strengths (SS = time series, ANN = cross-sectional fit) and I want to combine them for that reason. The model posted below still has time dummies, but in the end I will replace those with (probably) a random walk component, along the lines of the sketch below.
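To make that concrete, here is a minimal sketch of how the time dummies could later be swapped for a random walk on mu (the innovation scale sigMu is a name I just made up; it is not in the model below):

parameters {
  vector[Nt]    mu;     // time series parameter, now a latent state
  real<lower=0> sigMu;  // hypothetical random-walk innovation scale
}
model {
  mu[1]    ~ normal(0, 5);                   // initial state
  mu[2:Nt] ~ normal(mu[1:(Nt - 1)], sigMu);  // random walk: mu_t = mu_{t-1} + eps_t
  sigMu    ~ normal(0, 1);
}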

data {
  int<lower=0>          N;       // number of observations
  int<lower=0>          Nt;      // number of time periods
  int<lower=0>          Np;      // number of explanatory variables
  int<lower=0>          Nh;      // number of knots (only 1 layer)

  vector[N]             yVar;    // explained variable
  int<lower=1,upper=Nt> sell[N]; // period sold, sell = 1, ..., Nt
  matrix[N,Np]          x;       // explanatory variables
}
parameters {
  vector[Nt]    mu;              // time series parameter
  ordered[Nh]   beta;            // bias, CONSTRAINT #1

  matrix[Nh,Np] omega;           // weights in layer
  vector[Nh]    lambda;          // estimates of measurement eq.
  real<lower=0> sigEps;          // residual SD of measurement eq.
}
transformed parameters {
  vector[N]    yHat;
  matrix[N,Nh] h;

  for (i in 1:N) {
    for (k in 1:Nh) {
      h[i,k] = inv_logit(dot_product(x[i], omega[k]) + beta[k]);
    }
    yHat[i] = mu[sell[i]] + dot_product(h[i], lambda);
  }
}
model {
//  to_vector(omega)  ~ normal(0,5);  // can I do this simultaneously with constraint #2?
  lambda            ~ normal(0,5);
  beta              ~ normal(0,5);
  mu                ~ normal(0,5);
  sigEps            ~ normal(0,1);

  for (k in 1:(Nh-1)) {
//  sum(pow(omega[k],2))      ~ normal(1, 0.001 * Np); // Euclidean norm, CONSTRAINT #2 (doesn't work...)
//  sum(omega[k] .* omega[k]) ~ normal(1, 0.001 * Np); // Euclidean norm, CONSTRAINT #2 (doesn't work...)
    sum(omega[k])             ~ normal(0, 0.001 * Np); // zero sum, CONSTRAINT #2 (works!)
  }
  yVar              ~ normal(yHat, sigEps);
}
generated quantities {
  // Training data stuff in here [LATER]
}

I know there has already been some work done on ANNs in Stan, but I think my code is a bit more straightforward if you are only interested in one layer. Anyway, I had two short questions.

  1. First of all, I would like to know whether I can combine a zero-sum constraint with a “normal” prior in the model block. For example, I would like to have both omega ~ normal(0,5); and sum(omega) ~ normal(0, 0.001 * Np); in the model block (see the code as well). However, I don’t think I can do both. Can I?
  2. Secondly, I actually do not want a zero-sum constraint but a Euclidean norm constraint (I believe this is also called an L2 norm?). Naively, this constraint would translate to sum(pow(omega, 2)) ~ normal(1, 0.001 * Np). In words, the sum of the squared parameters should equal 1. Is there a way to get this constraint in Stan without too much hassle? I gave two examples (commented out in the code) that didn’t work.

Note that all but one of the rows of omega actually has the constraint; the final row of omega can / will be left unconstrained! This makes it slightly more complex, I guess… Any suggestions on how to speed up the code are obviously welcome as well!

Many thanks as always, you are the best!
Alex

Sorry for not getting to you earlier. Did you manage to make progress in the meantime?

There is a good discussion of sum-to-zero constraints at Test: Soft vs Hard sum-to-zero constrain + choosing the right prior for soft constrain (later in the thread there is also a clever solution using the QR decomposition; a minimal sketch of it follows).
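On your first question: yes, a soft constraint and a marginal prior can coexist. Both sampling statements simply add their terms to the log density, so to_vector(omega) ~ normal(0,5); together with sum(omega[k]) ~ normal(0, 0.001 * Np); is perfectly legal; whether the resulting combined prior is the one you actually want is a separate question. For a hard sum-to-zero constraint, here is a minimal sketch of the QR trick from that thread, written for a single weight vector (omega_raw and omega_k are my names, not yours):

transformed data {
  // orthonormal basis for the sum-to-zero subspace of R^Np
  matrix[Np, Np] A = diag_matrix(rep_vector(1, Np));
  matrix[Np, Np - 1] A_qr;
  for (i in 1:(Np - 1)) A[Np, i] = -1;
  A[Np, Np] = 0;
  A_qr = qr_Q(A)[, 1:(Np - 1)];
}
parameters {
  vector[Np - 1] omega_raw;               // unconstrained coordinates
}
transformed parameters {
  vector[Np] omega_k = A_qr * omega_raw;  // sums to zero by construction
}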

I am also unsure that using just one layer will prevent all the nasty computational issues that arise with neural networks.
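As for the Euclidean (L2) norm in your second question: rather than a soft penalty, you can use Stan's unit_vector type, which enforces a norm of exactly 1. A minimal sketch for your layout, with the final row left unconstrained (omega_unit and omega_free are made-up names):

parameters {
  unit_vector[Np] omega_unit[Nh - 1];  // each has Euclidean norm exactly 1
  row_vector[Np]  omega_free;          // final row, left unconstrained
}
transformed parameters {
  matrix[Nh, Np] omega;
  for (k in 1:(Nh - 1))
    omega[k] = omega_unit[k]';         // transpose vector into matrix row
  omega[Nh] = omega_free;
}

One caveat: a unit_vector already implies a uniform distribution on the sphere, so you would not also put an independent normal(0,5) prior on those rows on top of it.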

Hope that helps at least a little