Help for setting parameters to solve non-identifiability problem

specification

#1

Hi,

I’m working on some logit models that share parameters. Because of the form of the utility function, this model has a non-identifiability problem, which is solved by fixing one element of the parameter vector beta (to one in this case) and constraining a second element to be greater than the fixed one. I tried to apply that, but it doesn’t seem to work for me.

Here is the model specification:

data {
  int<lower=1> NRUBROS; //number of independent variables
  int<lower=0> NCASOS; //number of observations
  int<lower=0,upper=1> y[NCASOS,NRUBROS]; //dependent variables
  vector[NRUBROS] x[NCASOS]; //covariates 
}

transformed data{
  vector[NCASOS] sumd;
  
  for (i in 1:NCASOS) {
    sumd[i] = fmax(sum(x[i]),1); //auxiliary data
  }
}

parameters {
  vector[NRUBROS-2] beta_free; //free parameters
  real<lower=1> beta_free_aux; //to constrain the second element
}

transformed parameters {
  vector[NRUBROS] beta; //to append vector of parameters
  
  beta = append_row(1,append_row(beta_free_aux,beta_free)); //create vector beta

}

model { 
  vector[NRUBROS] utilidad[NCASOS]; //utility matrix
  vector[NRUBROS] probabilidad[NCASOS]; //probability matrix 

  for (j in 1:NRUBROS){
    for (i in 1:NCASOS){ 
      utilidad[i, j] = fabs(beta[j] - dot_product(beta,x[i])/sumd[i]); //utility definition
      probabilidad[i, j] = inv_logit(utilidad[i, j]);
    }
  }     
  { 
    for(i in 1: NCASOS){
      y[i] ~ bernoulli(probabilidad[i]);
    }
  }

  to_vector(beta_free) ~ normal(0,5);
  beta_free_aux ~ normal(0,5);
}

The problem is this. For example, in a simulation with 4 categories and a true parameter value of beta = (-1,-2,1,0), there is only one solution that respects the fixed and constrained parameters: (1,2,-1,0). But looking at the traceplot, you can see that three chains converge to the desired solution, while the other one converges to a solution like (1,1,3,2). That looks very similar to the otherwise plausible solution (1,0,3,2), except that the second parameter is constrained to be greater than one. So I think the free parameters can still move through the solution space as if the first two elements of beta were not constrained, and I don’t want that. Any ideas on how to solve it?
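One way to see where the two candidate solutions come from is to write out the symmetries of the utility. The derivation below is a sketch that assumes every sum(x[i]) is at least 1, so that sumd[i] equals sum(x[i]); the symbols u, s and c are introduced here for illustration only.

```latex
% s_i = sumd[i] = max(sum_k x_{ik}, 1); assumption: sum_k x_{ik} >= 1, so s_i = sum_k x_{ik}
\begin{align*}
u_{ij}(\beta) &= \left| \beta_j - \frac{\beta^\top x_i}{s_i} \right| \\
u_{ij}(-\beta) &= \left| -\beta_j + \frac{\beta^\top x_i}{s_i} \right| = u_{ij}(\beta)
  && \text{(sign flip)} \\
u_{ij}(\beta + c\,\mathbf{1}) &= \left| \beta_j + c - \frac{\beta^\top x_i + c \sum_k x_{ik}}{s_i} \right| = u_{ij}(\beta)
  && \text{(shift by } c\text{)} \\
-(-1,-2,1,0) &= (1,2,-1,0) \\
(-1,-2,1,0) + 2\,(1,1,1,1) &= (1,0,3,2)
\end{align*}
```

Both transformed vectors have first element 1, but only (1,2,-1,0) has a second element above 1, which is what the lower=1 constraint is meant to select.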

[Image: Rplot beta 3 converge 1 not (traceplot of beta)]

Regards, and as always, sorry for my English!

simulation code: Simulation (MODELO 9B DM MEAN).R (1.7 KB)


#2

Just a quick idea: AFAIK Stan has problems not only when there are multiple equivalent solutions, but also when there are multiple local maxima of the posterior density. It seems that there is a local maximum at (1,1,3,2). You can check whether this is the case by plotting the betas against log_prob (manually or via shinystan’s explore - bivariate/trivariate plot function).

Note that the prior on beta_free_aux has its maximum at 1, and it is possible that this maximum is not dampened by the data. You may try setting a prior on beta_free_aux that has zero density at 1. I remember that Michael used an inverse gamma for a similar problem (https://betanalpha.github.io/assets/case_studies/gp_part3/part3.html see section 4), but a gamma might work as well. Alternatively, you may check whether the problem persists when you simulate a larger dataset.
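A minimal sketch of a prior with zero density at 1, assuming the rest of the model stays as in the original post and using the same array syntax. The parameter name gap and the gamma(2, 1) shape and rate are placeholders chosen for illustration, not values from this thread, and the lower bound on NRUBROS is raised to 2 here so that vector[NRUBROS - 2] is always valid.

```stan
data {
  int<lower=2> NRUBROS;                     // number of categories
  int<lower=0> NCASOS;                      // number of observations
  int<lower=0,upper=1> y[NCASOS, NRUBROS];  // binary outcomes
  vector[NRUBROS] x[NCASOS];                // covariates
}

transformed data {
  vector[NCASOS] sumd;
  for (i in 1:NCASOS) {
    sumd[i] = fmax(sum(x[i]), 1);
  }
}

parameters {
  vector[NRUBROS - 2] beta_free;
  real<lower=0> gap;  // hypothetical name: distance of beta[2] above the fixed value 1
}

transformed parameters {
  // beta[1] is fixed to 1 and beta[2] = 1 + gap is greater than 1
  vector[NRUBROS] beta = append_row(1, append_row(1 + gap, beta_free));
}

model {
  for (i in 1:NCASOS) {
    real m = dot_product(beta, x[i]) / sumd[i];
    for (j in 1:NRUBROS) {
      // likelihood unchanged from the original post, fabs included
      y[i, j] ~ bernoulli(inv_logit(fabs(beta[j] - m)));
    }
  }
  beta_free ~ normal(0, 5);
  gap ~ gamma(2, 1);  // assumed shape and rate; density is 0 at gap = 0, i.e. at beta[2] = 1
}
```

Because gap is declared as its own parameter and beta[2] is just 1 + gap, no Jacobian adjustment is needed. Whether this removes the mode near (1,1,3,2) would still have to be checked on the simulated data.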


#3

That’s correct.

This is almost certainly wrong, as it bounds probabilidad[i, j] below at 0.5 by bounding the log odds below at 0. You probably just want to get rid of that fabs. Whenever you see fabs applied to a (function of a) parameter, it’s going to be problematic for identifiability.
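The bound follows from fabs(u) >= 0, which gives inv_logit(fabs(u)) >= inv_logit(0) = 0.5. A hedged sketch of the model with the fabs removed is shown below, in the same array syntax as the original post. The use of bernoulli_logit and the vectorised likelihood are choices made for this sketch rather than something proposed in the thread, and the lower bound on NRUBROS is raised to 2 so that vector[NRUBROS - 2] is always valid.

```stan
data {
  int<lower=2> NRUBROS;                     // number of categories
  int<lower=0> NCASOS;                      // number of observations
  int<lower=0,upper=1> y[NCASOS, NRUBROS];  // binary outcomes
  vector[NRUBROS] x[NCASOS];                // covariates
}

transformed data {
  vector[NCASOS] sumd;
  for (i in 1:NCASOS) {
    sumd[i] = fmax(sum(x[i]), 1);
  }
}

parameters {
  vector[NRUBROS - 2] beta_free;
  real<lower=1> beta_free_aux;
}

transformed parameters {
  vector[NRUBROS] beta = append_row(1, append_row(beta_free_aux, beta_free));
}

model {
  for (i in 1:NCASOS) {
    // signed utility used directly as log odds, so probabilities can fall below 0.5
    y[i] ~ bernoulli_logit(beta - dot_product(beta, x[i]) / sumd[i]);
  }
  beta_free ~ normal(0, 5);
  beta_free_aux ~ normal(0, 5);
}
```

Without the fabs, flipping the sign of beta changes the likelihood, so that symmetry goes away. A common shift of all elements still cancels whenever sum(x[i]) is at least 1, so fixing one element of beta is still useful. Whether the lower=1 constraint on the second element is still wanted is a modelling decision this sketch does not settle.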