RStan vectorized modelling efficiency and component setup

Hi all,

My model currently contains ~1000 variables:

y[n] ~ normal(beta1 * x1[n] + beta2 * x2[n] + …, sigma)

with some terms defined as

beta * x^parameter
beta * (1 / (1 + exp(-0.05 * (x - parameter1) / parameter2)) - 1 / (1 + exp(-0.05 * (0 - parameter1) / parameter2)))
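
To make this concrete, here is a minimal sketch of how I picture two such terms in vectorized form (x1, x2, p1, loc2, scale2, and mu are placeholder names; rewriting x1^p1 as exp(p1 * log(x1)) assumes x1 > 0, since as far as I can tell pow does not take a whole vector in this rstan version, while 1 / (1 + exp(-z)) is just inv_logit(z)):

data {
  int<lower=1> N;
  vector<lower=0>[N] x1;   // positive, so x1^p1 can be written exp(p1 * log(x1))
  vector[N] x2;
  vector[N] y;
}
parameters {
  real<lower=0.5, upper=1.5> beta1;
  real<lower=0.2, upper=2.5> beta2;
  real p1;
  real loc2;
  real<lower=0> scale2;
  real<lower=0> sigma;
}
model {
  vector[N] mu = beta1 * exp(p1 * log(x1))                        // beta * x^parameter term
                 + beta2 * (inv_logit(0.05 * (x2 - loc2) / scale2)
                            - inv_logit(0.05 * (0 - loc2) / scale2));  // shifted-sigmoid term
  y ~ normal(mu, sigma);
}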

The manual says that the vectorized form is faster for RStan, which prompted me to convert my program and data. Would it certainly be faster in a case like this, with several terms and parameters? I tried running it unvectorized and the model took too long to set up.
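
For reference, this is the difference as I understand it from the manual, with mu the assembled mean vector:

for (n in 1:N)
  y[n] ~ normal(mu[n], sigma);   // unvectorized: N separate density evaluations

versus the single call

y ~ normal(mu, sigma);   // vectorized: one evaluation over the whole vector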

Another concern is constraining the beta vector element-wise. Since Stan can only define one set of bounds for a whole vector, is this a valid workaround: keep each beta as a scalar in the parameters block and, in the model block, define a vector that groups the scalar betas for the predictors (got the idea from here)?
i.e.

parameters {
  real<lower=0.5, upper=1.5> beta1;
  real<lower=0.2, upper=2.5> beta2;
  …
}
model {
  vector[K] beta_vector = [beta1, beta2, …]';  // local declarations go before the sampling statements
  beta1 ~ normal(mean, sd);
  beta2 ~ normal(mean, sd);
  y ~ normal(…);
}
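
A variant of the same idea, sketched here with just the two betas above (X, y, and the prior values are hypothetical stand-ins), assembles the vector in transformed parameters instead, so it is also saved in the output:

data {
  int<lower=1> N;
  matrix[N, 2] X;
  vector[N] y;
}
parameters {
  real<lower=0.5, upper=1.5> beta1;
  real<lower=0.2, upper=2.5> beta2;
  real<lower=0> sigma;
}
transformed parameters {
  vector[2] beta_vector = [beta1, beta2]';   // row_vector literal, transposed to a vector
}
model {
  beta1 ~ normal(1, 0.25);   // hypothetical prior values
  beta2 ~ normal(1, 0.5);
  y ~ normal(X * beta_vector, sigma);
}

(As far as I know, newer Stan releases accept vector-valued lower/upper bounds directly on a vector declaration, but with rstan 2.17.3 the scalar-by-scalar approach is needed.)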

Using rstan 2.17.3 on Windows 7 with 8 GB RAM. I would be glad for any help/clarification. Thanks!

Yes, that is legal. The word "vectorization" means different things in different contexts for Stan, but the one you want to concentrate on is calling the likelihood as few times as possible, or conversely calling it with the largest inputs possible, ideally just

target += normal_lpdf(y | X * beta, sigma);

or something like that.
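
For concreteness, a minimal complete program around that call might look like this (N, K, X, and y stand in for whatever your data are; priors omitted):

data {
  int<lower=1> N;
  int<lower=1> K;
  matrix[N, K] X;
  vector[N] y;
}
parameters {
  vector[K] beta;
  real<lower=0> sigma;
}
model {
  // one density call over all N observations
  target += normal_lpdf(y | X * beta, sigma);
}

The sampling-statement form y ~ normal(X * beta, sigma); is equivalent up to a constant.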

Thank you, Ben. Will try it out.