Latent class model estimation with long-format data

bbbales2 · September 5, 2017, 8:24pm

I think the two for loops at the end are different. This (in the long form model) is not what you wrote in the wide format:

// Define model
  for (n in 1:N) {
    for (c in 1:2) {
      ps[c] = log_nu[c] + (y[n] * log(pi[ii[n],c]) + (1 - y[n]) * log(1 - pi[ii[n],c]));
    }
    target += log_sum_exp(ps);
  }

This is what the long-format code would look like if it were in the wide format, which seems wrong based on the number of times target gets incremented (the long-format probability is the product of I * J things, the wide-format is just J things). Maybe this is an easier way to see the difference?

for (j in 1:J) {
  for (i in 1:I) {
    for (c in 1:2) {
      ps[c] = log_nu[c] + y[j,i] * log(pi[i,c]) + (1 - y[j,i]) * log(1 - pi[i,c]);
    }
    target += log_sum_exp(ps);
  }
}

What you want is the other direction though, haha. There’s probably a ragged array way of working with this.

Probably start by sorting your y data first by blocks of j, and then building a couple new arrays (calling them s and l here) that are the start index and length of the blocks of j. For example:

jj: 1 1 1 1 2 2 2 3 3 3 3
ii: 1 2 5 7 1 2 3 2 5 7 8
s: 1 5 8
l:  4 3 4
y: ...

And then write something like:

for(k in 1:size(s)) {
  int j = jj[s];
  for(c in 1:2) {
    real log_items[l[k]];
    for(m in s[k] : s[k] + l[k]) {
      int i = ii[m];
      log_items[m] = y[k] * log(pi[i,c]) + (1 - y[k]) * log(1 - pi[i,c]);
    }
    ps[c] = log_nu[c] + sum(log_items);
  }
  target += log_sum_exp(ps);
}

Look for the ragged array examples in the manual. This stuff can be confusing. There’s always little annoying issues with things like the log_items array would be different sizes for each j (not 100% sure what I wrote would work – but you could make it an array of length max_size where max_size is the maximum length block of j indexes and just be careful about rezeroing it).

Hope that helps! Thanks for getting the format really nice on your question. Makes it a lot easier to parse what’s going on.

Topic		Replies	Views
Help with understanding my results Modeling fitting-issues , specification , performance	5	1188	February 8, 2019
Two questions about using rstan to estimate the Item response theory model parameters Modeling techniques	2	660	July 7, 2019
Identifying latent classes in longitudinal data Modeling techniques , fitting-issues , brms	2	84	October 25, 2024
Model Poll data (Categorical Likelihood with Dirichlet Prior) Modeling rstan	1	324	April 21, 2023
How to fit Latent Class Regression using STAN Modeling	5	465	December 14, 2023

Latent class model estimation with long-format data

Related topics