RuntimeError: Initialization failed

Greetings, everybody. I'm trying to model the probability of a goal from a shot using a Bernoulli logit model. I have an implementation in PyStan in a Jupyter Notebook, but sampling never runs. The code throws an initialization error.

goals_model = """
data {
    int<lower=0> N; // number of observations (8238 with only players with >5) (8451 with all shots)
    int players; // number of players (331 with only players with >5) (426 with all players with shots)
    int glk; // number of goalkeepers 38
    int zones; // number of field zones 8
    int time; // number of time frames 7
    int res; // types of results (winning, losing, tying)
    int loc; // home/away status (localia)
    int<lower=1,upper=players> player_id[N];
    int<lower=1,upper=glk> glk_id[N];
    int<lower=1,upper=zones> cat_zone[N];
    int<lower=1,upper=time> time_frame[N];
    int<lower=1,upper=res> cat_res[N];
    int<lower=1,upper=loc> localia[N];
    int goal[N]; // dependent variable
}
parameters {

    real alpha; // intercept
    vector[players] beta_player; // coefficient associated with each player
    vector[glk] beta_glk; // coefficient associated with each goalkeeper
    vector[zones] beta_zones; // coefficient associated with each zone
    vector[time] beta_time; // coefficient associated with each time frame
    vector[res] beta_res; // coefficient associated with each result
    vector[loc] beta_loc; // coefficient associated with each type of localia
    real epsilon; // uncertainty / unexplained variance
}
model {
    // priors
    alpha ~ normal(0,1);
    beta_player ~ beta(3.44,7.34);
    beta_glk ~ normal(-0.05,0.38);
    beta_zones ~ normal(-0.17,0.58);
    beta_time ~ normal(-0.23,0.37);
    beta_res ~ normal(-0.5,0.5);
    beta_loc ~ normal(-0.78,0.6);
    goal ~ bernoulli_logit(alpha + beta_player[player_id] + beta_glk[glk_id] +
        beta_zones[cat_zone] + beta_time[time_frame] + beta_res[cat_res] + beta_loc[localia]);
}
"""

When I run it with these settings, initialization does not start.

goal_fit = goal_reg.sampling(data=datos,
                          iter=2000, chains=4,
                          warmup=500, n_jobs=-1,

I hope you can help me.

You need to set bounds on some of your parameters. The first that jumps out to me is beta_player – your beta distribution prior restricts this parameter to be between 0 and 1. In your parameters block, you likely should have vector<lower=0,upper=1>[players] beta_player;.
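To illustrate why the declared bounds matter, here is a minimal Python sketch of the scaled inverse-logit mapping that Stan uses for a parameter declared with <lower=0,upper=1>. The function name constrain is hypothetical and only for illustration; it is not part of the PyStan API.

```python
import math

def constrain(x, lower=0.0, upper=1.0):
    """Map an unconstrained real x into (lower, upper) with a scaled inverse logit."""
    return lower + (upper - lower) / (1.0 + math.exp(-x))

# Illustrative starting values on the unconstrained scale.
unconstrained_inits = [-2.0, -0.5, 0.0, 0.5, 2.0]
constrained = [constrain(x) for x in unconstrained_inits]

# Every mapped value lies strictly inside (0, 1), the support of the beta prior.
all_inside = all(0.0 < v < 1.0 for v in constrained)
print(constrained, all_inside)
```

With the bounds declared, whatever value the sampler starts from on the unconstrained scale, beta_player itself is always a valid argument for the beta prior.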

By default, Stan initializes each parameter at a random value drawn uniformly from (-2, 2) on the unconstrained scale. Because beta_player has no declared bounds, many of those initial values fall outside (0, 1), the support of the beta distribution, so the log density cannot be evaluated there and initialization fails.
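A rough Python sketch of the failure, assuming unbounded initial values drawn uniformly from (-2, 2) and the 331 players mentioned in the data block comments. The helper beta_log_density is hypothetical and is not how Stan evaluates the prior internally; it only shows that one out-of-support value is enough to make the total log density -inf.

```python
import math
import random

def beta_log_density(x, a=3.44, b=7.34):
    """Unnormalized log density of Beta(a, b); -inf outside the support (0, 1)."""
    if not 0.0 < x < 1.0:
        return float("-inf")
    return (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x)

random.seed(0)   # arbitrary seed, only to make the sketch reproducible
n_players = 331  # player count taken from the comment in the data block

# One draw per player, uniform on (-2, 2), mimicking an unbounded initialization.
inits = [random.uniform(-2.0, 2.0) for _ in range(n_players)]
log_prior = sum(beta_log_density(x) for x in inits)

# Chance that all draws land inside (0, 1): each one has probability 1/4.
p_all_valid = 0.25 ** n_players

print(log_prior)     # -inf whenever at least one draw falls outside (0, 1)
print(p_all_valid)
```

Since the chance of a fully valid starting point is vanishingly small, retrying with different random initial values does not help; the parameter has to be bounded.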

Also, you define a parameter epsilon but never use it. Did you intend to include it in your model?


Thanks, I will try again. Do you know whether the “n_jobs” or “seed” settings affect this?

Those parameters should not affect this error.