Trouble getting Multi threading to work for a multi-variable linear model


#1

Hi,

I’m having trouble creating a multi-variable 2-level linear model using the map_rect function. I’ve used @bnicenboim’s example code from Multi threading version of a simple linear mixed model: y ~ cond + (cond||subj) (And a big thank you for posting it - it’s been a huge help).

My model has a number of population level effects and a couple of random effects y ~ age_function + mileage_function + month_of_year + production_year + condition + used_car_cpi_score + (age_function + mileage_function||vehicle_map)

I’ve managed to get it to work with the additional random effects and a couple of population effects (intercept and single variable) but when I run it with more than one population variable it fails.

The error message I get looks like the below

Unrecoverable error evaluating the log probability at the initial value.
Exception: Exception: array[multi,...] index: accessing element out of range. index 531 out of range;expecting index to be between 1 and 530  (in 'multithreading/two_level_model.stan' at line 12)
  (in 'multithreading/two_level_model.stan' at line 119)

Line 12 is part of the function block

 vector mt(vector phi , vector theta , real[] Y_X_r , int[] n_i ) {
    
    //data
    int N = n_i[1]; // number of items in each group (vehicle map) 
    int K = n_i[2]; // number of columns in the X matrix variables 
    int Kc = K - 1;  //re-creates Kc within the function
    int M_1 = rows(theta); // number of random effects/group level effects (r.e.)
    vector[N] Y = to_vector(Y_X_r[1:N]); //dependent variable ***LINE 12***
    matrix[N, Kc] Xc = to_matrix(Y_X_r[N + 1:N + N * Kc], N, Kc); // extract population level data and convert to a matrix  
    vector[N] Z_1_1 = to_vector(Y_X_r[N + N * Kc + 1: 2 * N + N * Kc]); //Predictor 1 for the r.e. model matrix (all 1s here)
    vector[N] Z_1_2 = to_vector(Y_X_r[2 * N + N * Kc + 1: 3 * N + N * Kc]); //Predictor 2 for the r.e. model matrix 
    vector[N] Z_1_3 = to_vector(Y_X_r[3 * N + N * Kc + 1: 4 * N + N * Kc]); //Predictor 3 for the r.e. model matrix 

And line 119 refers to the map_rect function in the model block

// likelihood including all constants 
    //using function "mt" with transformed parameter phi, parameter theta, data per shard, no of obs and pars per shard
    target += sum(map_rect(mt, phi, theta, Y_X_r, n_i));

The vector V_J_r is created in the transformed data block

{
    int pos = 1;
    int end = 0;
    for (n in 1:N_1) {  //for each group
    n_i[n] = {JN[n], K}; //use shard size anf no of variables for each group
    end = pos + n_i[n,1] - 1; // what position to stop
    //append row can only be done in pairs so will need to nest them
    Y_X_r[n] = to_array_1d(  // convert to a row_major format
    append_row(append_row(Y[pos:end], to_vector(Xc[pos:end, Kc])), //creates an array with the Y and Xc data for the shard
    append_row(Z_1_1[pos:end], append_row(Z_1_2[pos:end], Z_1_3[pos:end]))) // adds the data within each group
    );
    pos += n_i[n,1]; 
    } 
    }

It looks like the array I’m creating is too big for whatever I’ve indexed but for the life of me I can’t figure out what to change. I’ve been going a round in circles for the last couple of days.

Can anyone see where I’ve gone wrong? The full code and data are attached.

2_level_multi_variable.R (831.6 KB)
2_level_model.stan (5.1 KB)

Thanks,
Michael


#2

Solved it.

I needed to increase the number of columns in the Xc matrix when creating the shards. Eejit!!

    for (n in 1:N_1) {  //for each group
    
	n_i[n] = {JN[n], K}; //use shard size anf no of variables for each group

    	end = pos + n_i[n,1] - 1; // what position to stop
	
	
    //append row can only be done in pairs so will need to nest them
    Y_X_r[n] = to_array_1d(  // convert to a row_major format
    append_row(append_row(Y[pos:end], 
				to_vector(Xc[pos:end, 1:Kc])), // Originally only looking at one column (Kc) rather than columns 1 to Kc
					append_row(Z_1_1[pos:end], 
							append_row(Z_1_2[pos:end], 
											Z_1_3[pos:end]))) // adds the data within each group
    );
    pos += n_i[n,1]; 

    } 

#3

Just to clarify - this is a multivariable model as in multiple independent X variables - not a multivariate model as in multiple dependent Y variables /multiple outcomes) ?


#4

It is. Changing the title. Cheers