
Hi there! :wave:

I’m new to Stan and could use a bit of feedback on this model. Is there anything big I’m overlooking? Does anything stand out to you?

```
data {
  int<lower=1> J;                   // number of apps
  int<lower=3> L;                   // number of years
  int<lower=1> N;                   // number of observations (rows)
  int<lower=1> K;                   // number of features (columns)
  int<lower=0> y[N];                // dependent variable, indexed by row
  matrix[N, K] X;                   // trend covariates
  matrix[N, K] X2;                  // action covariates
  matrix[N, L] X3;                  // year covariates
  int IDs[N];                       // app index for each observation
}
parameters {
  real mu_alpha;
  real<lower=0> tau_alpha;          // hyperparameter for alpha variance
  // I split out the intercept from the other parameters in the design matrix
  vector[J] alpha;                  // intercepts, indexed by the J apps
  vector[K] beta;                   // feature parameters, K in length
  vector[L] year;                   // year parameters, L in length
  real mu_year;
  real<lower=0> tau_year;
  vector<lower=0>[K] action_slope;  // action parameters, K in length
  vector<lower=0>[J] theta;         // per-app dispersion
  real<lower=0> mu_theta;
  real<lower=0> tau_theta;
  real mu_action_slope;
  real<lower=0> tau_action_slope;
  real mu_beta;
  real<lower=0> tau_beta;
}
model {
  mu_year ~ normal(0.3, 0.5);
  tau_year ~ normal(0.8, 0.2);
  year ~ normal(mu_year, tau_year);
  mu_theta ~ normal(1, 5);
  tau_theta ~ normal(3.6, 0.25);
  theta ~ normal(mu_theta, tau_theta);
  mu_alpha ~ normal(1, 2);
  tau_alpha ~ normal(2.3, 0.5);
  alpha ~ normal(mu_alpha, tau_alpha);
  mu_beta ~ normal(0.07, 0.03);
  tau_beta ~ normal(0.07, 0.04);
  // vectorized; equivalent to looping over k but faster
  beta ~ normal(mu_beta, tau_beta);
  mu_action_slope ~ normal(0, 0.1);
  tau_action_slope ~ normal(0.12, 0.8);
  action_slope ~ normal(mu_action_slope, tau_action_slope);
  y ~ neg_binomial_2_log(alpha[IDs] + X * beta + X2 * action_slope + X3 * year, theta[IDs]);
}
generated quantities {
  real predicted_tasks[N];
  real marginal_tasks[N];
  matrix[N, K] X2_extra_actions;
  for (n in 1:N) {
    for (k in 1:K) {
      if (X2[n, k] > 0) {
        X2_extra_actions[n, k] = X2[n, k] + 1;
      } else {
        X2_extra_actions[n, k] = 0;
      }
    }
    predicted_tasks[n] = exp(alpha[IDs[n]] + X[n] * beta + X2[n] * action_slope + X3[n] * year);
    marginal_tasks[n] = exp(alpha[IDs[n]] + X[n] * beta + X2_extra_actions[n] * action_slope + X3[n] * year);
  }
}
```
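In case it's useful for sanity-checking the generated quantities: because the model is log-linear, adding 1 to a single action covariate multiplies the predicted mean by `exp(action_slope[k])`. A quick Python check with made-up numbers (the `eta` and `slope` values are hypothetical, not from the model above):

```python
import numpy as np

# Toy check: for a log link, incrementing one action covariate by 1
# multiplies the predicted mean by exp(slope).
eta = 0.7        # hypothetical linear predictor for one observation
slope = 0.12     # hypothetical action_slope[k]

predicted = np.exp(eta)               # like predicted_tasks[n]
marginal = np.exp(eta + slope * 1.0)  # like marginal_tasks[n]

ratio = marginal / predicted          # equals exp(slope)
```

So the `marginal_tasks / predicted_tasks` ratio has a clean multiplicative interpretation per extra action.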

A few things I’ve learned so far:

- Shorten `iter` to 1200 and set `chains = 1` when prototyping
- Divergences can be caused by nonsensical priors
- Divergences can also occur when you need a hyperparameter but have instead hard-coded a fixed value
- Priors with hyperparameters can be applied to a vector element-wise in a for loop (the vectorized sampling statement is equivalent and usually faster)
- Use `use_cache = FALSE` together with `pars` to avoid really slow waits when printing the fit via `summary`
- If there are divergences, look for the parameters with low `n_eff` and `Rhat` > 1, then focus on their priors; also consider whether a hyperparameter is needed
- `bayesplot::ppc_ecdf_overlay` is also a good way to find gross model errors that likely lead to divergences
- One of the most important things to do is center and scale your data. Basically, do what you can, centering- and scaling-wise, so that the posterior is as close to normal(0, 1) as possible. You want the posterior to be easy to explore, and you don’t want to be estimating any extreme parameters like beta(1, 1e6), for example.
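On that last point, here's a minimal sketch of what centering and scaling a design matrix looks like before passing it to Stan (Python/NumPy, with made-up covariate values; the real `X` would come from your data):

```python
import numpy as np

# Hypothetical covariate matrix (rows = observations, columns = features)
X = np.array([[120.0, 3.0],
              [150.0, 5.0],
              [ 90.0, 2.0],
              [200.0, 8.0]])

# Center each column at 0 and scale to unit standard deviation, so the
# regression coefficients live on roughly a normal(0, 1) scale.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
```

(Remember to save the column means and standard deviations so you can transform new data and back-transform coefficients later.)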

If I’ve got any of the above wrong, corrections would be appreciated. Thank you.