Identification problems: Unable to recover parameters in Multinomial IRT

I was curious and had a look – thank you for sharing a nice self-contained example.

There is a problem with the Python code that simulates the data! The data is accidentally generated by a somewhat different process than the intended model, which may account for some of the identifiability problems.

If we display the resulting shape (I, J) matrix DTM of word counts, we see:

array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0., 98.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0., 88.],
       [ 0., 70.,  0.,  0.,  0.,  0.,  3.,  0.,  3., 14.],
       [ 2.,  0.,  3.,  0.,  1.,  0.,  0.,  5.,  0., 86.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  1.,  0.,  0., 96.]])

The counts for the last word j=9 are much larger than those for the other words, which is strange. The counts for words j=3 and j=5 are uniformly zero across all documents, so that may be a cause of some of the non-identifiability.

The very high counts for j=9 are strange, as it doesn't seem like your script is trying to generate such asymmetric synthetic data. It turns out there's an error on this line:

pi  = softmax(np.transpose(psi) + np.multiply(theta,np.transpose(beta)))

The input argument to softmax is a shape (I, J) array, but by default scipy.special.softmax performs the operation over the entire flattened array of size I*J:

axis int or tuple of ints, optional
Axis to compute values along. Default is None and softmax will be computed over the entire array x .
scipy.special.softmax — SciPy v1.11.4 Manual
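To see the default behaviour concretely, here's a tiny sketch with a dummy array (the shapes and values are made up purely for illustration):

```python
import numpy as np
from scipy.special import softmax

x = np.zeros((2, 3))  # stand-in for the (I, J) matrix of logits

# default axis=None: softmax is taken over all 2*3 = 6 entries at once,
# so the WHOLE matrix sums to 1 and each row sums to much less than 1
flat = softmax(x)
print(flat.sum())        # ~1.0
print(flat.sum(axis=1))  # ~[0.5, 0.5], not [1, 1]

# axis=1: each row is normalised separately, which is what the model needs
rows = softmax(x, axis=1)
print(rows.sum(axis=1))  # ~[1.0, 1.0]
```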

This produces an output pi where the sum over both dimensions is 1, so each time the script samples from the multinomial distribution with np.random.multinomial(C[i], pi[i,:]), the pi[i,:] array has total probability much less than 1. The documentation of np.random.multinomial says this about the second argument pvals=pi[i,:]:

Probabilities of each of the p different outcomes. These must sum to 1 (however, the last element is always assumed to account for the remaining probability, as long as sum(pvals[:-1]) <= 1) .
numpy.random.multinomial — NumPy v1.26 Manual

So I think this explains how the softmax accidentally being computed over both dimensions I and J causes the symptom of very high counts when sampling the last word j=J-1.
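This remainder behaviour of the legacy sampler is easy to demonstrate directly (the pvals below are made up to mimic one row of the accidentally-flattened softmax):

```python
import numpy as np

np.random.seed(0)

# a row of probabilities summing to only 0.2, as happens when the softmax
# normalises over the whole flattened (I, J) array rather than each row
pvals = np.array([0.05, 0.05, 0.05, 0.05, 0.0])

# the legacy sampler silently assigns the missing 0.8 of probability mass
# to the LAST outcome, even though pvals[-1] == 0
counts = np.random.multinomial(100, pvals)
print(counts)  # the last entry gets roughly 80 of the 100 draws
```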

When we fix the softmax to operate over the word dimension J by explicitly adding the axis=1 argument, then the script produces a new matrix DTM of word counts like so:

array([[ 5., 73.,  0.,  0.,  0.,  1., 17.,  1.,  1.,  0.],
       [33., 26.,  2.,  5.,  0.,  1., 16.,  3.,  0.,  2.],
       [ 0., 90.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [13.,  0., 48.,  0.,  5.,  0.,  0., 31.,  0.,  0.],
       [40., 36.,  2.,  0.,  1.,  1., 14.,  3.,  0.,  1.]])

That looks better! Here is the fixed version of the data generating script, with some sanity checks added:

import numpy as np
from scipy.special import softmax

np.random.seed(1)

I = 5   # number of documents
J = 10  # number of words
C = np.random.poisson(lam = 100, size = I)  # total word count per document

theta = np.random.normal(size = I)
theta = np.expand_dims(theta, axis = -1)
beta  = np.random.normal(scale = 3, size = J)
beta  = np.expand_dims(beta, axis = -1)
psi   = np.random.normal(scale = 3, size = J)
psi   = np.expand_dims(psi, axis = -1)

# for each document i in I, softmax over the word axis J
pi = softmax(np.transpose(psi) + np.multiply(theta, np.transpose(beta)), axis=1)

# sanity check: for each document i in I, we expect sum_{j in J} pi[i, j] = 1
assert pi.shape == (I, J)
assert np.all(np.isclose(np.sum(pi, axis=1), np.ones(shape=(I, ))))
assert np.all(pi >= 0.0)

DTM = np.empty((I, J))
for i in range(I):
    DTM[i, :] = np.random.multinomial(C[i], pi[i, :])

Re: the Stan model, I haven’t tried running it, but a couple of quick suggestions:

  1. The docs[N] and word[N] data inputs appear to be unused. If you don't need them, the model would be easier to understand with the unused parts removed entirely.
  2. Re: the earlier thread on reflection invariance, there's a post from @Marko at the end attaching two different solutions. It looks like your model is based on the first solution. Have you tried the second solution, wordfish_ident.stan, to see if that perhaps works better? Wordfish model (aka poisson ideal point model, poisson or multinomial IRT model): Fixing reflection invariance - #11 by Marko