One-hot encoding or indexed coefficients

rfc · November 19, 2021, 7:22am

Re: the second representation. The explicit if / else branches in transformed parameters collapse if we redefine beta_levels and pad it with one extra artificial level coefficient that is constrained to always equal zero.

Pinning a parameter to a constant value does not seem to be allowed by Stan, but adding an extra layer of indirection gives something that compiles:

data {
    int n;
    int<lower=2> num_levels;
    array[n] int<lower=1,upper=num_levels> level;

    matrix[n, 3] X;  // one row per observation, so X * beta has length n

    vector[n] actual_value;
}
parameters {
    vector[3] beta;
    vector[num_levels-1] beta_levels;
    real intercept;
    real<lower=0> sigma;
}
transformed parameters {
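    // Pad the free coefficients with a fixed zero for the last (reference) level.
    vector[num_levels] beta_levels_;
    beta_levels_[1:num_levels-1] = beta_levels;
    beta_levels_[num_levels] = 0;

    // Multi-indexing by level picks each observation's coefficient, so no branching is needed.
    vector[n] expected_value = intercept + (X * beta) + beta_levels_[level];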
}
model {
    beta ~ normal(0, 1);
    beta_levels ~ normal(0, 1);
    sigma ~ student_t(3, 0, 1);
    actual_value ~ normal(expected_value, sigma);
}

I don’t have intuition about relative performance. If you’ve got the time, implement them all and look at what is happening through a CPU profiler!
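
If an external CPU profiler is more than you want to set up, Stan's built-in profile blocks (Stan 2.26 and later) might be a lighter first step. This is a different tool from the CPU profiler suggested above, and the sketch below is only a rough illustration under some assumptions: the labels "linear_predictor", "priors" and "likelihood" are arbitrary names I made up, X is assumed to have one row per observation, the linear predictor is moved into the model block so it can be timed there, and append_row is used as a shorthand for the zero padding.

```stan
data {
    int<lower=0> n;
    int<lower=2> num_levels;
    array[n] int<lower=1,upper=num_levels> level;
    matrix[n, 3] X;          // assumption: one row per observation
    vector[n] actual_value;
}
parameters {
    vector[3] beta;
    vector[num_levels-1] beta_levels;
    real intercept;
    real<lower=0> sigma;
}
transformed parameters {
    // Same padding as above: the last level's coefficient is fixed at zero.
    vector[num_levels] beta_levels_ = append_row(beta_levels, 0);
}
model {
    // Declared outside the profile block, because variables declared
    // inside a profile block are local to it.
    vector[n] expected_value;

    profile("linear_predictor") {
        expected_value = intercept + (X * beta) + beta_levels_[level];
    }
    profile("priors") {
        beta ~ normal(0, 1);
        beta_levels ~ normal(0, 1);
        sigma ~ student_t(3, 0, 1);
    }
    profile("likelihood") {
        actual_value ~ normal(expected_value, sigma);
    }
}
```

Wrapping the same "linear_predictor" label around each candidate representation should give per-label timings that can be compared across runs (CmdStan writes them to a CSV file). These only cover the labelled statements, so they complement a full CPU profile and do not replace it.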
