For a two-layer network with tanh, I used this Stan function:
/**
* Returns linear predictor for restricted Boltzmann machine (RBM).
* Assumes a single hidden layer with tanh activation.
*
* @param x Predictors (N x M)
* @param alpha First-layer weights (M x J)
* @param beta Second-layer weights (J x (K - 1))
* @return Linear predictor for output layer of RBM.
*/
matrix rbm(matrix x, matrix alpha, matrix beta) {
return tanh(x * alpha) * beta;
}
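As a rough sketch of how this might plug into a full program for K-way classification (the data block layout, the standard normal priors, and pinning the first category's linear predictor to zero are just placeholder choices, not part of the original model):

functions {
  // two-layer predictor from above
  matrix rbm(matrix x, matrix alpha, matrix beta) {
    return tanh(x * alpha) * beta;
  }
}
data {
  int<lower=1> N;                      // observations
  int<lower=1> M;                      // predictors
  int<lower=1> J;                      // hidden units
  int<lower=2> K;                      // outcome categories
  matrix[N, M] x;                      // predictor matrix
  array[N] int<lower=1, upper=K> y;    // observed categories
}
parameters {
  matrix[M, J] alpha;                  // first-layer weights
  matrix[J, K - 1] beta;               // second-layer weights
}
model {
  matrix[N, K - 1] eta = rbm(x, alpha, beta);
  to_vector(alpha) ~ normal(0, 1);
  to_vector(beta) ~ normal(0, 1);
  for (n in 1:N)
    y[n] ~ categorical_logit(append_row(0, eta[n]'));
}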
Networks with more hidden layers look just the same.
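For example, a three-layer version (the name and extra weight matrix here are hypothetical) just chains another multiply-and-tanh:

/**
 * Same idea with two hidden layers: tanh at each hidden layer,
 * linear output layer.
 */
matrix nn3(matrix x, matrix alpha, matrix beta, matrix gamma) {
  return tanh(tanh(x * alpha) * beta) * gamma;
}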
This is still a mess of inefficient autodiff compared to building the back-prop algorithm statically. So if we really wanted to do these efficiently, we'd write custom derivatives for functions like rbm() directly in C++. Going through autodiff requires a lot of extra space and is also slower; a direct C++ implementation would require very little memory and be at least 4 times faster.
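To make that concrete: with H = tanh(x * alpha) and eta = H * beta, a hand-written reverse pass only needs to store H and propagate the standard backprop adjoints (this is just the textbook algebra, not code from the Stan math library):

$$
\bar{\beta} = H^\top \bar{\eta}, \qquad
\bar{H} = \bar{\eta}\,\beta^\top, \qquad
\bar{\alpha} = x^\top \left[\, \bar{H} \odot (1 - H \odot H) \,\right],
$$

where a bar marks the adjoint (gradient of the log density with respect to that quantity) and \odot is elementwise multiplication. Autodiff instead records every intermediate node of the matrix product, the tanh, and the final multiply, which is where the extra memory and time go.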
But for the reasons @betanalpha mentions, and because this still isn't going to scale in parallel, we haven't been very focused on neural nets (a.k.a. deep belief nets).