Regularized Factor Models

Hi,

I am trying to fit a factor model with a sparse, non-negative loading matrix. To induce sparsity while keeping the loadings non-negative, I am using a continuous spike-and-slab prior built from a mixture of two gamma distributions.
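To be explicit, the prior I have in mind for each loading (writing Gamma(shape, rate), matching Stan's parameterization) is

$$
W_{jk} = \theta_{jk}\, Z^{\text{slab}}_{jk} + (1 - \theta_{jk})\, Z^{\text{spike}}_{jk},
\qquad
\theta_{jk} \sim \text{Beta}(1, 1.5), \quad
Z^{\text{slab}}_{jk} \sim \text{Gamma}(2, 2), \quad
Z^{\text{spike}}_{jk} \sim \text{Gamma}(1/a, a),
$$

with likelihood $\Pr(Y_{ij} = 1) = \text{logit}^{-1}\big((Z_\eta W^\top)_{ij}\big)$. The Stan implementation is below.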

data {
  int<lower=0> N; //number of students
  int<lower=0> NY; //number of items
  int<lower=0> N_eta; //number of skills
  real a; //spike concentration; set to 10
  
  array[N, NY] int<lower = 0, upper = 1> Y; //dichotomously scored items in wide format
}

parameters {
  //Q-matrix
  matrix<lower = 0, upper = 1>[NY, N_eta] theta; //r_k, mixing weight: probability each loading comes from the slab
  matrix<lower = 0>[NY, N_eta] Z_spike; //spike component (concentrated near 0)
  matrix<lower = 0>[NY, N_eta] Z_slab;  //slab component
  
  //Concept Knowledge Map i.e. Factors
  matrix[N, N_eta] Z_eta; //MVN(0, I)
}

transformed parameters{
  
  matrix<lower = 0>[NY, N_eta] W;  
  {
    vector[NY * N_eta] vec_theta = to_vector(theta);
    vector[NY * N_eta] vec_slab = to_vector(Z_slab);
    vector[NY * N_eta] vec_spike = to_vector(Z_spike);   
    
    vector[NY * N_eta] W_vec = vec_theta .* vec_slab + (1 - vec_theta) .* vec_spike; 
    
    W = to_matrix(W_vec, NY, N_eta);
  }  
}


model {
  matrix[N, NY] mu = Z_eta*W';

  to_vector(theta) ~ beta(1, 1.5); //uniform(0, 1) is another option
  to_vector(Z_slab) ~ gamma(2, 2); //also tried gamma(5, 6)
  //gamma(shape = 1/a, rate = a) has mean 1/a^2, so larger a concentrates the spike near 0
  to_vector(Z_spike) ~ gamma(1/a, a);
  
  to_vector(Z_eta) ~ std_normal(); //factors, MVN(0, I)
  
  //to_array_1d() flattens the 2-D array Y row-major, while to_vector()
  //flattens a matrix column-major, so transpose mu to align the orderings
  target += bernoulli_logit_lupmf(to_array_1d(Y) | to_vector(mu'));

}

However, the posterior estimates of the loadings in W are inaccurate. Any recommendations on how to model the sparse loadings would be greatly appreciated.

Thank you!

I have included my data-generating code below:

factor_reg <- function(N_Student, N_Item, N_Skill, seed = 1234, 
                          perc_miss){
    
    set.seed(seed)
    # sparse, non-negative loading matrix; perc_miss is the proportion of zero entries
    Q <- Matrix::rsparsematrix(nrow = N_Item, ncol = N_Skill, density = 1 - perc_miss, 
                               rand.x = \(n) runif(n, min = 0.1, max = 1.5)) |> 
        as.matrix()
    
    
    # latent student proficiencies, MVN(0, I)
    student_prof <- mvtnorm::rmvnorm(n = N_Student, mean = rep(0, N_Skill), sigma = diag(1, N_Skill))
    
    # linear predictor on the logit scale: student_prof %*% t(Q)
    nu <- tcrossprod(x = student_prof, y = Q)
    Y <- matrix(NA_integer_, nrow = N_Student, ncol = N_Item)
    for(j in 1:N_Item){
        Y[, j] <- rbinom(n = N_Student, size = 1, prob = plogis(nu[, j]))
    }
    
    list(Y = Y, Q_matrix = Q, N_eta = N_Skill, NY = N_Item, N = N_Student)
    
}

I haven’t tried to run the model, but it looks like you are putting a spike-and-slab prior on every entry of W and hoping that enough entries stay at 0 to keep the model parameters identified. I think this will cause label switching, where the skills swap between columns of W across MCMC iterations, and there could also be rotational indeterminacy.
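To see the column-switching problem concretely, note that for any $N_\eta \times N_\eta$ permutation matrix $P$,

$$
Z_\eta W^\top = Z_\eta P P^\top W^\top = (Z_\eta P)(W P)^\top,
$$

so relabeling the skills (permuting the columns of $Z_\eta$ and $W$ together) leaves the likelihood unchanged, and your priors treat all columns exchangeably, so nothing breaks the tie.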

If I were coding this, I would first get the model working with traditional priors (not spike/slab) on the entries of W, then move to spikes and slabs after that. To identify the loadings, it is traditional to take the top N_eta × N_eta square of W and fix the entries in its upper triangle to 0 (see the sketch below). You might not want to do that, but I am guessing this is why you have problems with the estimates of W. The papers below describe some related issues and propose some new solutions.
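Here is a minimal, untested sketch of that identification constraint, with the spike/slab machinery stripped out and a gamma(2, 2) stand-in prior on the free loadings (you would tune or replace it). The upper-triangle zeros are never declared as parameters, so they are exact:

data {
  int<lower=0> N;
  int<lower=0> NY;
  int<lower=0> N_eta;
  array[N, NY] int<lower=0, upper=1> Y;
}
parameters {
  //all of W except the upper triangle of its leading N_eta x N_eta block,
  //which is fixed to 0 for identification
  vector<lower=0>[NY * N_eta - (N_eta * (N_eta - 1)) %/% 2] W_free;
  matrix[N, N_eta] Z_eta;
}
transformed parameters {
  matrix[NY, N_eta] W = rep_matrix(0, NY, N_eta);
  {
    int pos = 1;
    for (k in 1:N_eta) {
      for (j in k:NY) { //rows j < k of column k stay exactly 0
        W[j, k] = W_free[pos];
        pos += 1;
      }
    }
  }
}
model {
  matrix[N, NY] mu = Z_eta * W';
  W_free ~ gamma(2, 2); //stand-in prior; move to spike/slab once this works
  to_vector(Z_eta) ~ std_normal();
  target += bernoulli_logit_lupmf(to_array_1d(Y) | to_vector(mu'));
}

Since the loadings are constrained non-negative, sign flips are already ruled out, and the triangular zero pattern should take care of the rotations and column permutations.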

Thank you very much for the resources. I will go through them and make the necessary changes to the model.