My Stan model looks like the following (saved in a file called unpooled.stan):
data{
  // size of data
  int<lower=0> N; // number of training instances
  int<lower=0> K; // number of input variables
  int<lower=0> J; // number of landcover classes (group-level; J)
  // data
  matrix[N, K] X; // input matrix (X)
  vector[N] y; // target vector (Y)
  int<lower=0, upper=J> landcover_idx[N]; // indicator variable for each landcover class (u)
}
parameters{
  vector[J] alpha; // intercepts (a_j[i])
  matrix[J, K] beta; // coefficients (b_j[i])
  real<lower=0> sigma;
}
model{
  // priors
  alpha ~ normal(0, 1);
  // beta ~ normal(0, 0.5);
  // likelihood
  vector[N] y_hat;
  for (n in 1:N){
    // this is the line the compiler rejects (row_vector * row_vector)
    y_hat[n] = alpha[landcover_idx[n]] + (beta[landcover_idx[n]] * X[n]);
  }
  y ~ normal(y_hat, sigma);
}
The Python code for creating the data and sampling from the model using cmdstanpy:
import pandas as pd
import numpy as np
from cmdstanpy import cmdstan_path, CmdStanModel, CmdStanMCMC


def make_dummy_data():
    variables = [f"SM_lag_{i}" for i in range(6)]
    variables += [f"PCP_lag_{i}" for i in range(6)]
    variables += [f"VCI_lag_{i}" for i in range(6)]
    N_samples = 100
    X_train = pd.DataFrame({
        var: np.random.random(N_samples)
        for var in variables
    })
    y_train = np.random.random(len(X_train))
    # one group index per row; Stan indexes from 1, so the groups are 1..J
    lc_idx = np.repeat([1, 2, 3], 34)[:len(X_train)]
    K = len(X_train.columns)
    # the keys must match the names declared in the Stan data block
    data = dict(
        N=X_train.shape[0],
        K=K,
        X=X_train.values,
        y=y_train,
        landcover_idx=lc_idx,
        J=len(np.unique(lc_idx)),
    )
    return data


if __name__ == "__main__":
    data = make_dummy_data()
    # build model
    stan_file = "unpooled.stan"
    stan_model = CmdStanModel(stan_file=stan_file)
    stan_model.compile()
    # fit model
    model_fit: CmdStanMCMC = stan_model.sample(
        data=data,
        chains=4,
        parallel_chains=4,
        seed=1111,
        show_progress=True,
    )
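Because the Stan data block fixes the names and shapes it expects (N, K, J, X, y, landcover_idx), a quick check of the dictionary before sampling can catch mismatches early. The following is only a minimal sketch: `check_stan_data` is a hypothetical helper, not part of cmdstanpy, and it assumes the group indices are meant to run from 1 to J because Stan indexes from 1.

```python
import numpy as np


def check_stan_data(data):
    """Hypothetical helper: list mismatches between a data dict and the
    shapes declared in the Stan data block (N, K, J, X, y, landcover_idx)."""
    N, K, J = data["N"], data["K"], data["J"]
    X = np.asarray(data["X"])
    y = np.asarray(data["y"])
    idx = np.asarray(data["landcover_idx"])

    problems = []
    if X.shape != (N, K):
        problems.append(f"X has shape {X.shape}, expected {(N, K)}")
    if y.shape != (N,):
        problems.append(f"y has shape {y.shape}, expected {(N,)}")
    if idx.shape != (N,):
        problems.append(f"landcover_idx has shape {idx.shape}, expected {(N,)}")
    elif idx.size and (idx.min() < 1 or idx.max() > J):
        # assumption: groups are labelled 1..J (Stan is 1-indexed)
        problems.append("landcover_idx must lie in 1..J")
    return problems


# Small illustrative data set (not the data from the post)
example = dict(
    N=4,
    K=2,
    J=2,
    X=np.zeros((4, 2)),
    y=np.zeros(4),
    landcover_idx=np.array([1, 1, 2, 2]),
)
print(check_stan_data(example))
```

An empty list means the dictionary is consistent with the declared sizes; any entries describe what would need fixing before calling `sample`.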
I get the following error:
INFO:cmdstanpy:compiling stan file /Users/tommylees/github/LEARNING/Hierarchical-Bayesian-ARDL/unpooled.stan to exe file /Users/tommylees/github/LEARNING/Hierarchical-Bayesian-ARDL/unpooled
ERROR:cmdstanpy:Stan program failed to compile:
WARNING:cmdstanpy:
--- Translating Stan model to C++ code ---
bin/stanc --o=/Users/tommylees/github/LEARNING/Hierarchical-Bayesian-ARDL/unpooled.hpp /Users/tommylees/github/LEARNING/Hierarchical-Bayesian-ARDL/unpooled.stan
Semantic error in '/Users/tommylees/github/LEARNING/Hierarchical-Bayesian-ARDL/unpooled.stan', line 31, column 45 to column 74:
-------------------------------------------------
29: vector[N] y_hat;
30: for (n in 1:N){
31: y_hat[n] = alpha[landcover_idx[n]] + (beta[landcover_idx[n]] * X[n]);
^
32: }
33:
-------------------------------------------------
Ill-typed arguments supplied to infix operator *. Available signatures:
(int, int) => int
(real, real) => real
(row_vector, vector) => real
(real, vector) => vector
(vector, real) => vector
(matrix, vector) => vector
(complex, real) => complex
(complex, complex) => complex
(real, row_vector) => row_vector
(row_vector, real) => row_vector
(row_vector, matrix) => row_vector
(real, matrix) => matrix
(vector, row_vector) => matrix
(matrix, real) => matrix
(matrix, matrix) => matrix
Instead supplied arguments of incompatible type: row_vector, row_vector.
make: *** [/Users/tommylees/github/LEARNING/Hierarchical-Bayesian-ARDL/unpooled.hpp] Error 1
Command ['make', '/Users/tommylees/github/LEARNING/Hierarchical-Bayesian-ARDL/unpooled']
error during processing No such file or directory
I am guessing the key line is this:
Ill-typed arguments supplied to infix operator *. Available signatures:
This refers to the multiplication of two row vectors: beta[landcover_idx[n]], which is a single row of matrix[J, K] beta, and X[n], which is a single row of matrix[N, K] X. The list of available signatures has no (row_vector, row_vector) entry.
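To make the intended computation concrete, here is a minimal NumPy sketch of the group-specific mean: for each observation, the intercept of its group plus the dot product of that group's coefficient row with the observation's input row. The sizes and the `idx` array are illustrative assumptions, not the data from the post. NumPy treats both operands as plain 1-D arrays, so the row/column distinction that Stan enforces does not arise here. In Stan the signature list above does include (row_vector, vector) => real, so the same dot product presumably needs one operand as a column vector or a dedicated dot-product function.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, J = 6, 3, 2  # small illustrative sizes

X = rng.random((N, K))       # plays the role of matrix[N, K] X
beta = rng.random((J, K))    # plays the role of matrix[J, K] beta
alpha = rng.random(J)        # plays the role of vector[J] alpha
idx = np.array([0, 0, 1, 1, 0, 1])  # 0-based group index (Stan would use 1-based)

# Loop form, mirroring the Stan loop: one dot product per observation
y_hat_loop = np.empty(N)
for n in range(N):
    y_hat_loop[n] = alpha[idx[n]] + beta[idx[n]] @ X[n]

# Vectorised form: pick each observation's coefficient row, multiply
# elementwise with its input row, and sum across the K columns
y_hat_vec = alpha[idx] + np.sum(beta[idx] * X, axis=1)

assert np.allclose(y_hat_loop, y_hat_vec)
```

Both forms give one scalar mean per observation, which is the quantity the Stan line is trying to assign to y_hat[n].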
How can I correctly specify/code the mean for each group? Is it possible to give the same prior to each Beta/Alpha separately? Or would I be better just fitting the model using different data samples?
The reason I am fitting an unpooled model with one .stan file is that I want to progress from this unpooled estimate to a hierarchical model.