Map_rect implementation compiling but getting infinities in parameters


#1

Hi all. So after much much hard work I finally got a map_rect implementation to complie successfully. Yay!

Alas, Although it complies I’m getting the errors below and I cannot figure out why:

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: multi_normal_cholesky_lpdf: Location parameter[1] is nan, but must be finite!  (in 'model79a33b6c20b_hier_mvn_map_rect' at line 124)

Chain 1: 
Chain 1: Initialization between (-2, 2) failed after 100 attempts. 
Chain 1:  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."

My model is as follows:


functions{
    vector mu_map ( vector phi , vector theta , real[] x_r , int[] x_i ){
        int NvarsX = x_i[1];
        int NvarsY = x_i[2];
        int N = x_i[3];
        int N_ind = x_i[4];
        int indiv[N] = x_i[ 5: 4 + N];

        matrix[N, NvarsX] x = to_matrix( x_r[N+1: num_elements(x_r)], N, NvarsX);
        vector[N] y = to_vector(x_r[1:N]);
        real sub_mu[N];
        
        vector[N_ind] b0_ind = theta[1:N_ind];
        matrix[NvarsX, N_ind] b_ind = to_matrix(theta[ (N_ind + 1): num_elements(theta) ], NvarsX, N_ind );

        print(b0_ind);
        for (i in 1:N){
            sub_mu[i] = b0_ind[indiv[i]] +
                                    dot_product( to_vector( b_ind[1:NvarsX, indiv[i]] ), x[i, 1:NvarsX] );
        }

        return to_vector(sub_mu);
    }
}

data{
    int<lower=0> NvarsX;     // num independent variables
    int<lower=0> NvarsY;     // num dependent variables
    int<lower=0> N;          // Total number of rows
    int<lower=0> N_ind;      // Number of individuals/participants
    
    int<lower=1> indiv[N];   // data to identify individuals
    matrix[N, NvarsX] x;     // data for independent vars
    vector[NvarsY] y [N];    // data for dependent vars
}

transformed data{
    // Num shards = NvarsY
    real x_r[NvarsY, (NvarsX + 1)];
    int x_i[NvarsY, 4 + N];
    {
        for (k in 1:NvarsY) {
            x_r[k] = to_array_1d(append_col( to_vector(y[1:N, k]) , x));
            x_i[k, 1] = NvarsX;
            x_i[k, 2] = NvarsY;
            x_i[k, 3] = N;
            x_i[k, 4] = N_ind;
            x_i[k, 5: 4 + N] = indiv;
        }
    }
}

parameters{
    cholesky_factor_corr[NvarsY] L_Omega; //For the correlations between outcome variables
    vector<lower=0>[NvarsY] L_sigma;      //Residual SD of outcome variables
    
    matrix[NvarsY, N_ind] Zbeta0_ind;         //Intercepts at individual level
    matrix[NvarsX, N_ind] Zbeta_ind [NvarsY]; //Coefficients at the individual level
    vector<lower=0> [NvarsY] Tau0;           //Random effect for individual intercepts
    matrix<lower=0> [NvarsY, NvarsX] Tau;    //Random effect for indvidiual coefficients

    vector[NvarsY] Beta0;         // Level 2 intercepts for each Y
    vector[NvarsX] Beta [NvarsY]; // Level 2 coefficients for each X vs each Y
    
    vector[0] phi;
}

transformed parameters{
    vector[NvarsY] mu[N];
    matrix[NvarsY, NvarsY] L_Sigma;
    matrix[NvarsY, N_ind] beta0_ind;
    matrix[NvarsX, N_ind] beta_ind [NvarsY];
    
    vector[(NvarsX + 1) * N_ind] theta [NvarsY];

    L_Sigma = diag_pre_multiply(L_sigma, L_Omega);
    
    // Define Individual level betas - non parametric specification
    for (s in 1:N_ind){
        for (dv in 1:NvarsY){
            beta0_ind[dv, s] = Zbeta0_ind[dv, s] * Tau0[dv] + Beta0[dv];
            for (iv in 1:NvarsX){
                beta_ind[dv, iv, s] = Zbeta_ind[dv, iv, s] * Tau[dv, iv] + Beta[dv, iv];
            }
        }
    }

    // Define level 2 betas
    for (dv in 1:NvarsY){
        // pack up betas into thetas
        theta[dv] = append_row( to_vector(beta0_ind[dv, ]), to_vector( beta_ind[dv, 1:NvarsX, ] ));
        
        mu[dv] = map_rect(mu_map, phi, theta, x_r, x_i);
        //for (i in 1:N){
        //    mu[i, dv] = beta0_ind[dv, indiv[i]] +
        //                            dot_product( to_vector( beta_ind[dv, 1:NvarsX, indiv[i]] ), x[i, 1:NvarsX] ); 
        //}
    }
}

model{
    // Priors
    L_Omega ~ lkj_corr_cholesky(1);
    L_sigma ~ normal(0, 1);
    
    Tau0 ~ normal(0,1);
    to_vector(Tau) ~ normal(0,1);
    
    for (s in 1:N_ind){
        for (dv in 1:NvarsY){
            Zbeta0_ind[dv, s] ~ normal(0,1);
            Zbeta_ind[dv, 1:NvarsX, s] ~ normal(0,1);
        }
    }
    
    to_vector(Beta0) ~ cauchy(0, 5);
    //for( dv in 1:NvarsY){
    //    to_vector(Beta[dv, 1:NvarsX]) ~ cauchy(0, 5);
    //}

    // Likelihood
    for (i in 1:N){
        y[i, 1:NvarsY] ~ multi_normal_cholesky(mu[i, 1:NvarsY], L_Sigma);
    }
}

generated quantities{
    matrix[NvarsY, NvarsY] Omega;
    matrix[NvarsY, NvarsY] Sigma;
    Omega = multiply_lower_tri_self_transpose(L_Omega);
    Sigma = quad_form_diag(L_Omega, L_sigma);
}

Using the print statement in the functions block, I was able to confirm that x_r and x_i terms are imported into mu_map correctly. However printing the b0_ind term I see the following:

Chain 1: [-2.03963,2.02136,-3.26791,-1.98754,-6.09082,9.94062,5.08269,-7.39787,4.46367,6.32371,5.37206,1.02144,1.49216,9.39711,1.96056,-2.45818,-6.5234,7.84542,-7.85297,-7.81026,-6.64295,-7.96385,-4.25369,0.0278565,1.40671,3.51382,2.4045,9.83507,0.806054,0.00315201,-5.17814,-2.64519,4.59639,-0.348448,1.42004,-5.13239,-4.21424,1.60464,-5.758,-8.33049]
[nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan]
[-2.03963,2.02136,-3.26791,-1.98754,-6.09082,9.94062,5.08269,-7.39787,4.46367,6.32371,5.37206,1.02144,1.49216,9.39711,1.96056,-2.45818,-6.5234,7.84542,-7.85297,-7.81026,-6.64295,-7.96385,-4.25369,0.0278565,1.40671,3.51382,2.4045,9.83507,0.806054,0.00315201,-5.17814,-2.64519,4.59639,-0.348448,1.42004,-5.13239,-4.21424,1.60464,-5.758,-8.33049]
[-0.387661,0.219772,-0.100753,-0.00231079,-0.244318,-0.677263,-0.523533,-0.695745,-0.362962,0.0617946,-0.0995644,-0.46076,-0.416219,0.371938,-0.495103,-0.714952,0.300096,0.287193,0.308052,0.124873,0.156031,-0.643479,-0.0339391,0.117795,-0.598271,-0.459788,-0.743963,0.188224,0.272234,-0.480171,-0.550248,-0.515776,0.228236,-0.614419,-0.124229,-0.135257,-0.395676,-0.656919,-0.403698,-0.736666]

I don’t know where the row of nan's is coming from. I’ve been over the code a few times and cannot spot any indexing errors (although they might be there and I’m simply missing them). The length of each vector is 40 which is true to the data.

Can anyone help here ?

This will generate a toy dataset if anyone wants one: gen_data_y2_x3.R (793 Bytes)


#2

I would suggest to call expose_stan_functions and call mu_map from R until you can figure out what the problem is. Also, it can be a good idea to put <upper = positive_infinity()> in declarations at the top of the parameter block so that NaNs are caught there, rather than down in the model block.


#3

Thanks for those suggestions. I’ll look into them for future use.

For this problem I’ve fixed one issue, but I’m very confused by another. For everytime I’m calling the map_rect function, it is being run twice? Why is that happening ?


#4

No clue