BTW, here’s some context on the issue that I hope will provide some motivation for supporting arrays with heterogeneous element shapes. The two Stan functions I’ve written that really need this are a function to create a block-diagonal matrix from its blocks, and a function to append vectors, both of which take an arbitrary number of arguments. These are used in creating state-space models for time-series analysis.
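To make concrete what a variadic block-diagonal builder has to do, here’s a minimal Python sketch (not the actual Stan code, and using plain lists of lists rather than Stan matrices). The point is that the blocks naturally have different shapes, which is exactly what a homogeneous-shape array argument can’t express.

```python
# Sketch: assemble a block-diagonal matrix from 2-D blocks of differing
# shapes. In Stan, an array literal {A, B} requires A and B to share one
# shape, which is the restriction discussed here.

def block_diag(blocks):
    """Assemble a block-diagonal matrix (list of rows) from 2-D blocks."""
    nrows = sum(len(b) for b in blocks)
    ncols = sum(len(b[0]) for b in blocks)
    out = [[0.0] * ncols for _ in range(nrows)]
    r = c = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, v in enumerate(row):
                out[r + i][c + j] = v
        r += len(b)       # next block starts one block further down...
        c += len(b[0])    # ...and one block further right
    return out
```

For example, combining a 1×1 block with a 2×2 block yields a 3×3 matrix, matching the mixed 1×1 and 2×2 blocks in the generated Stan code below.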
The bigger picture is that I’m in the middle of creating a concise language for defining Bayesian state-space models, which compiles to Stan. Rather than directly specify the matrices that define the SSM, you define an SSM as the composition of primitive SSMs, and the compiler then works out the needed matrix expressions. In compiling to Stan, my compiler does the following:

Type, shape, and bounds inference for variables. The user supplies type, shape, and bounds for fixed model inputs (things that would go in the data block), but not for variables defined in the body of the model; the compiler figures these out.

Automatic placement of variables in the appropriate Stan block. For each “let” variable (e.g., x = some_expression) or “draw” variable (e.g., x ~ some_distribution), the compiler figures out which block of the Stan program it belongs in.
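As a rough illustration of what such a placement rule might look like (this is my guess at the shape of the logic, not the compiler’s actual code):

```python
# Hypothetical sketch of a block-placement rule. A "draw" variable becomes
# a parameter (with its sampling statement in the model block); a "let"
# variable goes in transformed data when its definition depends only on
# data, and in the model block otherwise.

def stan_block(kind, depends_on_params):
    if kind == "draw":
        return "parameters"
    if kind == "let":
        return "model" if depends_on_params else "transformed data"
    raise ValueError(f"unknown variable kind: {kind}")
```

In the generated program below you can see both cases: rho_ lands in parameters, while sigma_q_ (a let variable depending on a parameter) is defined in the model block.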

Non-centered parameterization. For distributions with a scale parameter, or both location and scale parameters, the compiler automatically reparameterizes so that sampling is done on distributions with unit scale and zero location.
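The transformation itself is the standard one: draw a unit-scale, zero-location “raw” variable and shift/scale it. A quick Python simulation showing the two parameterizations give the same distribution:

```python
import random

# Non-centered reparameterization: instead of drawing
#   x ~ Normal(mu, sigma)
# directly, draw raw ~ Normal(0, 1) and set x = mu + sigma * raw.
# The distribution of x is unchanged, but the sampler sees a
# unit-scale, zero-location variable.

def draw_centered(mu, sigma, rng):
    return rng.gauss(mu, sigma)

def draw_noncentered(mu, sigma, rng):
    raw = rng.gauss(0.0, 1.0)
    return mu + sigma * raw
```

This is exactly the pattern visible in the generated Stan code below, e.g. `real sigma_q_ = raw_0_ * sigma_q_scale_;` with `raw_0_ ~ cauchy(0.0, 1.0)` (a pure scale family, so the location term is absent).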

Substantial support for quasiperiodicity.
For example, the following program defines an SSM that is the sum of a random-walk component, a quasiperiodic component with period 7, and white noise:
```
def main(sigma_q_scale, sigma_h_scale: real{0.0,},
         mu_a: real, sigma_a: real{0.0,},
         rho_mean: real{0.0,1.0}, sigma_p_scale: real{0.0,})
=
  sigma_q ~ half_cauchy(sigma_q_scale);
  sigma_h ~ half_normal(sigma_h_scale);
  rho ~ exponential_mt(rho_mean, 1.0);
  sigma_p ~ half_normal(sigma_p_scale);
  accum(wn(sigma_q), mu_a, sigma_a) + wn(sigma_h)
    + qp(7.0, 0.7, 6, rho, sigma_p)
```
(`exponential_mt` is a mean-parameterized, truncated exponential distribution. `qp` defines a quasiperiodic component, with `rho` controlling the degree of departure from strict periodicity. `wn` defines a white-noise component. `accum` essentially takes a time-series model and integrates it.)
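Since the generated code below calls a helper `exponential_mt_rate`, here’s my understanding of what it has to compute, sketched in Python (the actual Stan definition isn’t shown here, so treat the details as an assumption): for an exponential distribution truncated to (0, 1), the mean is m(λ) = 1/λ − 1/(e^λ − 1), which tends to 1/2 as λ → 0, so a target mean below 1/2 can be inverted for λ numerically.

```python
import math

# Mean of an Exponential(rate=lam) truncated to (0, 1).
def truncated_exp_mean(lam):
    return 1.0 / lam - 1.0 / math.expm1(lam)

# Invert mean -> rate by bisection. m(lam) is strictly decreasing in lam
# (from 1/2 as lam -> 0 down toward 0), so bisection suffices for a
# target mean in (0, 0.5).
def exponential_mt_rate(mean, lo=1e-8, hi=500.0):
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if truncated_exp_mean(mid) > mean:
            lo = mid   # mean too large -> need a larger rate
        else:
            hi = mid
    return 0.5 * (lo + hi)
```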
The current iteration of the compiler transforms the above into the following Stan program:
```stan
functions {
  // definition for block_diag
  // definition for exponential_mt_rate
  // definition for scale_matrices
  // definition for vec_append
}
data {
  int<lower = 1> N;
  vector[N] y;
  real<lower = 0.0> sigma_q_scale_;
  real<lower = 0.0> sigma_h_scale_;
  real mu_a_;
  real<lower = 0.0> sigma_a_;
  real<lower = 0.0, upper = 1.0> rho_mean_;
  real<lower = 0.0> sigma_p_scale_;
}
transformed data {
  matrix[1, N] yy = to_matrix(y');
  vector[7] Z_0_ = to_vector({1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0});
  vector[7] a1_0_ = to_vector({mu_a_, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0});
}
parameters {
  real<lower = 0.0> raw_0_;
  real<lower = 0.0> raw_1_;
  real<lower = 0.0, upper = 1.0> rho_;
  real<lower = 0.0> raw_2_;
}
transformed parameters {
}
model {
  real sigma_q_ = raw_0_ * sigma_q_scale_;
  real sigma_h_ = raw_1_ * sigma_h_scale_;
  real sigma_p_ = raw_2_ * sigma_p_scale_;
  vector[6] csigma2s_0_ =
    square(sigma_p_) *
    to_vector({0.6182923634858543, 0.6182923634858543,
               0.27566250834204353, 0.27566250834204353,
               0.10604512817210211, 0.10604512817210211});
  real rhoSqr_0_ = square(rho_);
  real H_0_ = square(sigma_h_);
  matrix[1 + 3 * 2, 1 + 3 * 2] T_0_ =
    block_diag({
      rep_matrix(1.0, 1, 1),
      block_diag(
        scale_matrices(
          sqrt(1.0 - rhoSqr_0_),
          {to_matrix({0.6234898018587336, 0.7818314824680297,
                      -0.7818314824680297, 0.6234898018587336}, 2, 2),
           to_matrix({-0.22252093395631434, 0.9749279121818236,
                      -0.9749279121818236, -0.22252093395631434}, 2, 2),
           to_matrix({-0.900968867902419, 0.43388373911755823,
                      -0.43388373911755823, -0.900968867902419}, 2, 2)
          }
        )
      )
    });
  matrix[1 + 6, 1 + 6] Q_0_ =
    diag_matrix(
      vec_append({
        to_vector({square(sigma_q_)}),
        rhoSqr_0_ * csigma2s_0_
      })
    );
  matrix[1 + 6, 1 + 6] P1_0_ =
    diag_matrix(vec_append({to_vector({square(sigma_a_)}), csigma2s_0_}));
  raw_0_ ~ cauchy(0.0, 1.0) T[0.0, ];
  raw_1_ ~ normal(0.0, 1.0) T[0.0, ];
  rho_ ~ exponential(exponential_mt_rate(rho_mean_)) T[, 1.0];
  raw_2_ ~ normal(0.0, 1.0) T[0.0, ];
  yy ~ gaussian_dlm_obs(to_matrix(Z_0_'), T_0_, rep_matrix(H_0_, 1, 1), Q_0_, a1_0_, P1_0_);
}
```
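Incidentally, the 2×2 blocks in T_0_ are rotation matrices at the seasonal harmonic frequencies 2πk/7 for k = 1, 2, 3, laid out column-major as {cos, sin, −sin, cos} (with `scale_matrices` damping them by sqrt(1 − ρ²)). A quick Python check of the literal constants:

```python
import math

# The 2x2 blocks in T_0_ should be rotations by 2*pi*k/7, k = 1, 2, 3;
# verify the literal cos/sin constants appearing in the generated code.
harmonics = [
    (1, 0.6234898018587336, 0.7818314824680297),
    (2, -0.22252093395631434, 0.9749279121818236),
    (3, -0.900968867902419, 0.43388373911755823),
]
for k, c, s in harmonics:
    theta = 2.0 * math.pi * k / 7.0
    assert math.isclose(math.cos(theta), c, rel_tol=1e-12)
    assert math.isclose(math.sin(theta), s, rel_tol=1e-12)
```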
As you can see, there are still some optimizations I need to add (constant folding in shape expressions, and lifting any subexpression having no parameter dependence into the transformed data block), but it’s already automating a lot of the work that would otherwise go into creating such a model.
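For the constant-folding piece, here’s a toy sketch of the kind of pass I mean (the real compiler’s AST will of course differ): shape expressions such as the `1 + 3 * 2` in `matrix[1 + 3 * 2, 1 + 3 * 2]` would fold down to the literal 7.

```python
import operator

# Toy constant folder over a tiny expression representation:
# an expression is an int literal, a variable name (string), or a
# tuple (op, left, right).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(expr):
    if not isinstance(expr, tuple):
        return expr  # literal or variable: nothing to fold
    op, left, right = expr[0], fold(expr[1]), fold(expr[2])
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)  # both operands known: evaluate now
    return (op, left, right)
```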
I’m doing this work for my employer, Adobe, but I have gotten permission to make it an open-source project. Expect an initial release by the end of the year.