Improving Runtime & Reduce Sum for Hierarchical model with large number of groups

I had an additional follow up question for the sufficient statistics. I spent sometime going through the code for the sufficient statistics for multiple regression and at the moment it seems the framework is set up to handle multiple categorical or discrete variables (I believe the example you gave was time of day and predicting height). Is there a way to accommodate if there is an additional continuous variable? Or are we then unable to split the data into these unique groups to get sufficient statistics?
Here is was a simple simulation I was hoping to test with the Stan code you provided from your talk .

x1 = rnorm(100,0,1)
x2 = sample(c(1,2,3),100, replace = TRUE)
x3 = sample(c(1,2,3,4), 100, replace = TRUE)
y = 2*x1 + .5*x2 + 1*x3 + rnorm(100,0,.1)

functions{
    real normal_sufficient_lpdf(
        matrix obs
        , vector mean
        , vector variance
    ){
        return(
            normal_lpdf(
                obs[,1]
                | mean
                , sqrt(variance) ./ obs[,3]
            )
            + gamma_lpdf(
                obs[,2]
                | obs[,4]
                , obs[,4] ./ variance
            )
        ) ;
    }

}
data {
    // n_x_cols: number of predictor-variables in x (as columns)
    int<lower=1> n_x_cols ;
    // n_x_rows: number of combinations (rows) of the predictor variables in x
    int<lower=1> n_x_rows ;
    // x: predictor matrix
    matrix[n_x_rows,n_x_cols] x ;
    // obs_mean: mean of observations for each row in x
    vector[n_x_rows] obs_mean ;
    // obs_var: variance of observations for each row in x
    vector<lower=0>[n_x_rows] obs_var ;
    // obs_n: count of observations for each row in x
    vector<lower=2>[n_x_rows] obs_n ;
}
transformed data{
    matrix[n_x_rows,4] obs ;
    obs[,1] = obs_mean ;
    obs[,2] = obs_var ;
    obs[,3] = sqrt(obs_n) ;
    obs[,4] = (obs_n-1.0)/2.0 ;
    matrix[n_x_cols,n_x_rows] x_transpose ; // so we can use `columns_dot_product()`, which is faster than `rows_dot_product()`
}
parameters {
    // mu_intercept: value for the mean at the intercept
    real mu_intercept ;
    // mu_tod_beta: effect of each predictor on the mean
    vector[n_x_cols] mu_beta ;
    // eta_intercept: value for the log-variance at the intercept
    real eta_intercept ;
    // eta_tod_beta: effect of each predictor on the log-variance
    vector[n_x_cols] eta_beta ;
}
model {
    // priors (adjust when using real data!!)
    mu_intercept ~ std_normal() ;
    mu_beta ~ std_normal() ;
    eta_intercept ~ std_normal() ;
    eta_beta ~ std_normal() ;
    // likelihood
    obs ~ normal_sufficient(
        (
            mu_intercept
            + transpose(
                columns_dot_product(
                    rep_matrix(mu_beta,n_x_rows)
                    , x_transpose
                )
            )
        )
        , exp(
            eta_intercept
            + transpose(
                columns_dot_product(
                    rep_matrix(eta_beta,n_x_rows)
                    , x_transpose
                )
            )
        )
    ) ;
}