Mixed model with polynomial term - quadratic works, cubic won't converge

Hi all,
I am working on specifying a linear mixed effects model where the slopes \beta_i have a polynomial relationship to the intercepts \alpha_i. The goal is to estimate a linear mixed model and simultaneously model the relationship between the alphas and betas with a low-order polynomial.

I have been able to get this to converge with a quadratic polynomial term linking betas and alphas, but not with a cubic term - would appreciate any advice on how to implement the cubic in a way that might improve stability/convergence.

This is my first post so please forgive my ignorance and let me know if I can provide any further useful context - within reason as I am limited in discussing the data source/specific application due to IP constraints.

Below is the quadratic model specification and Stan code.

The time term is included in the top layer as \beta_i \cdot (t - \bar{t}_i) so that the alphas are interpretable as the mean of y_{ij} for each individual.

y_{ij} = \alpha_i + \beta_i \cdot (t - \bar{t}_i) + \epsilon_{ij}

\alpha_i \sim \text{Normal}(\mu_{\alpha}, \sigma_{\alpha}^2),
\beta_i \sim \text{Normal}(\gamma_0 + \gamma_1 \cdot \alpha_i + \gamma_2 \cdot \alpha_i^2, \sigma_{\beta}^2),
\epsilon_{ij} \sim \text{Normal}(0, \sigma_y^2), \\

Stan code:

data {
  int<lower=0> N_obs;       // number of total observations
  int<lower=0> N_ind;       // number of individuals
  int individuals[N_obs];   // individual index for each observation
  vector[N_obs] time_in_study;   // vector of times
  vector[N_ind] mean_time;  // mean time point for each individual
  vector[N_obs] AB;        // vector of outcomes

parameters {
  real mu_alpha;                         // Mean of individual-specific intercepts
  real<lower=0> sigma_alpha;             // SD of individual-specific intercepts
  vector[N_ind] alpha;                       // Individual-specific intercepts
  real gamma0;                           // Mean of beta when alpha is mean centered
  real gamma1;                           // Slope of beta with alpha
  real gamma2;                           // Slope of beta with alpha squared
  real<lower=0> sigma_beta;              // SD of individual-specific slopes
  real<lower=0> sigma_y;                 // Residual SD
  vector[N_ind] beta;                        // Individual-specific slopes

model {
  int ind;
  alpha ~ normal(mu_alpha, sigma_alpha);  // Prior for alphas
  beta ~ normal(gamma0 + gamma1 * alpha + gamma2 * square(alpha), sigma_beta);  // Prior for betas - quadratic polynomial
  for (i in 1:N_obs) {
    ind = individuals[i];
    AB[i] ~ normal(alpha[ind] + beta[ind] * (time_in_study[i] - mean_time[ind]), sigma_y);  // Likelihood
	// Priors for the hyperparameters
  mu_alpha ~ normal(43, 40);
  sigma_alpha ~ normal(41, 40) T[0,];
  gamma0 ~ normal(0.93, 3);
  gamma1 ~ normal(0.091, 1);
  gamma2 ~ normal(-0.00052,0.3);
  sigma_beta ~ normal(2, 5) T[0,];
  sigma_y ~ normal(5, 10) T[0,];    


However I have been unable to get a cubic term to work - i.e. adding gamma3 in the params block, and the corresponding line for the betas:

\beta_i \sim \text{Normal}(\gamma_0 + \gamma_1 \cdot \alpha_i + \gamma_2 \cdot \alpha_i^2 + \gamma_3 \cdot \alpha_i^3, \sigma_{\beta}^2),

With the line in the model block:

beta ~ normal(gamma0 + gamma1 * alpha + gamma2 * square(alpha) + gamma3 * (alpha .* alpha .* alpha), sigma_beta);  // Prior for betas - quadratic polynomial

Thanks for your time and support!

Hi again,

Just a quick update - I had some success by reparametrising the cubic polynomial in this ‘root’ form with some constraints on the parameters:

alpha_cubic > 0

theta_1 < theta_2 < theta_3
(L < R < M)

Where the cubic polynomial is implemented as

In parameters block:

ordered[3] theta;  // Ordered theta parameters

In model block:

  for (i in 1:N_ind) {
    // Modified cubic term
    real cubic_term = alpha_cubic * (alpha[i] - theta[1]) * (theta[2] - alpha[i]) * (theta[3] - alpha[i]);
    target += normal_lpdf(beta[i] | cubic_term, sigma_beta);

The model now seems like it is converging reasonably well, as long as I give the theta params reasonable priors and initial values to keep them from getting tripped up by the ordered constraint