Help setting parameters to solve a non-identifiability problem

#1

Hi,

I'm working on some logit models that share parameters. Because of the form of the utility function, the model has a non-identifiability problem. It is supposed to be solved by fixing one element of the parameter vector beta (to one in this case) and constraining a second element to be greater than the fixed one. I tried to apply that, but it doesn't seem to work for me.

Here is the model specification:

```
data {
  int<lower=1> NRUBROS;                    // number of categories (rubros)
  int<lower=0> NCASOS;                     // number of observations
  int<lower=0,upper=1> y[NCASOS,NRUBROS];  // dependent variables
  vector[NRUBROS] x[NCASOS];               // covariates
}

transformed data {
  vector[NCASOS] sumd;

  for (i in 1:NCASOS) {
    sumd[i] = fmax(sum(x[i]), 1);  // auxiliary data
  }
}

parameters {
  vector[NRUBROS-2] beta_free;   // free parameters
  real<lower=1> beta_free_aux;   // constrains the second element
}

transformed parameters {
  vector[NRUBROS] beta;  // full parameter vector

  beta = append_row(1, append_row(beta_free_aux, beta_free));  // first element fixed to 1
}

model {
  // NOTE: the declarations and the lines marked "assumed" do not appear in
  // the post as it survives; they are inferred from the replies, which
  // refer to pbabilidad[i, j] as a probability built from the log odds.
  matrix[NCASOS, NRUBROS] utilidad;
  matrix[NCASOS, NRUBROS] pbabilidad;

  for (j in 1:NRUBROS) {
    for (i in 1:NCASOS) {
      utilidad[i, j] = fabs(beta[j] - dot_product(beta, x[i]) / sumd[i]);  // utility definition
      pbabilidad[i, j] = inv_logit(utilidad[i, j]);                        // assumed
    }
  }

  for (i in 1:NCASOS) {
    for (j in 1:NRUBROS) {
      y[i, j] ~ bernoulli(pbabilidad[i, j]);  // assumed
    }
  }

  beta_free ~ normal(0, 5);
  beta_free_aux ~ normal(0, 5);
}
```

The problem is this: in a simulation with 4 categories and a true parameter value of beta = (-1,-2,1,0), there is only one solution that respects the fixed and constrained parameters, namely (1,2,-1,0). But looking at the traceplot, three chains converge to the desired solution and the fourth converges to a solution like (1,1,3,2). That looks very similar to the plausible solution (1,0,3,2), except that the second parameter is constrained to be greater than one. So I think the free parameters are still able to move through the solution space as if the first two elements of beta were unconstrained, and I don't want that. Any ideas on how to solve it?

Regards, and as always, sorry for my English!
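To make the two competing solutions concrete, here is a minimal numerical sketch in plain Python (not part of the Stan model). It assumes binary covariate rows with at least one 1, so that `sumd[i]` equals `sum(x[i])`; the rows in `x_rows` and the helper names are made up for illustration. Under that assumption the utility is unchanged both by flipping the sign of beta and by adding a constant to every element, which is how (1,2,-1,0) and (1,0,3,2) both arise from (-1,-2,1,0).

```python
def utilities(beta, x_rows):
    """Return |beta[j] - dot(beta, x[i]) / sumd[i]| for every case i and category j."""
    out = []
    for x in x_rows:
        sumd = max(sum(x), 1)  # same guard as fmax(sum(x[i]), 1) in the model
        centre = sum(b * xi for b, xi in zip(beta, x)) / sumd
        out.append([abs(b - centre) for b in beta])
    return out


def all_close(a, b, tol=1e-12):
    """Element-wise comparison of two nested lists of floats."""
    return all(abs(p - q) <= tol for ra, rb in zip(a, b) for p, q in zip(ra, rb))


# Hypothetical binary covariate rows, each with at least one 1.
x_rows = [[1, 0, 0, 1], [0, 1, 1, 0], [1, 1, 1, 0], [0, 0, 1, 0]]

beta_true = [-1, -2, 1, 0]
beta_flipped = [-b for b in beta_true]     # (1, 2, -1, 0): sign flip
beta_shifted = [b + 2 for b in beta_true]  # (1, 0, 3, 2): shift by +2

u_true = utilities(beta_true, x_rows)
same_after_flip = all_close(u_true, utilities(beta_flipped, x_rows))
same_after_shift = all_close(u_true, utilities(beta_shifted, x_rows))

print(same_after_flip, same_after_shift)
```

Both checks come out true for these rows. Fixing the first element to 1 leaves both (1,2,-1,0) and (1,0,3,2) available, and the lower bound on the second element only cuts the shifted one off at the boundary, which would be consistent with a chain sitting near (1,1,3,2).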

simulation code: Simulation (MODELO 9B DM MEAN).R (1.7 KB)

#2

Just a quick idea: AFAIK Stan has problems not only when there are multiple equivalent solutions, but also when the posterior density has multiple local maxima. It seems that there is a local maximum at (1,1,3,2). You can check whether this is the case by plotting the betas against log_prob (manually or via shinystan's explore - bivariate/trivariate plot function).

Note that the prior on beta_free_aux has its maximum at 1, and it is possible that this maximum is not dampened by the data. You could try a prior on beta_free_aux that has zero density at 1. I remember that Michael used an inverse gamma for a similar problem (https://betanalpha.github.io/assets/case_studies/gp_part3/part3.html see section 4), but a gamma might work as well. Alternatively, you could check whether the problem persists when you simulate a larger dataset.
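A rough way to see the difference between the two kinds of prior, sketched in plain Python rather than Stan. The gamma here is placed on the offset `beta_free_aux - 1`, which is only one possible reading of the suggestion, and the shape and rate values (2 and 1) are arbitrary placeholders, not recommendations.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of normal(mu, sigma) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def gamma_pdf(x, shape, rate):
    """Density of gamma(shape, rate) at x; zero at and below 0."""
    if x <= 0:
        return 0.0
    return rate ** shape * x ** (shape - 1) * math.exp(-rate * x) / math.gamma(shape)


# normal(0, 5) restricted to [1, inf): truncation only rescales the density,
# so the shape on the constrained region is the same and it peaks at the bound.
normal_at = [normal_pdf(b, 0.0, 5.0) for b in (1.0, 2.0, 3.0)]

# Hypothetical alternative: gamma(2, 1) on the offset beta_free_aux - 1.
gamma_at = [gamma_pdf(b - 1.0, 2.0, 1.0) for b in (1.0, 2.0, 3.0)]

print("normal(0, 5) at 1, 2, 3:", normal_at)
print("shifted gamma(2, 1) at 1, 2, 3:", gamma_at)
```

The normal values decrease from the boundary outward, so the prior itself pulls towards 1. The shifted gamma is zero exactly at 1 and peaks away from the boundary.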

#3

That’s correct.

This is almost certainly wrong, as it bounds `pbabilidad[i, j]` below at 0.5 by bounding the log odds below at 0. You probably just want to get rid of that `fabs`. Whenever you see `fabs` applied to a (function of a) parameter, it’s going to be problematic for identifiability.
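A small plain-Python check of that point, assuming the probability is the inverse logit of the utility (the helper name and the sample values are made up for illustration):

```python
import math


def inv_logit(u):
    """Logistic function mapping log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-u))


# Hypothetical log odds on both sides of zero.
raw_log_odds = [-3.0, -0.5, 0.0, 0.5, 3.0]

p_without_abs = [inv_logit(u) for u in raw_log_odds]
p_with_abs = [inv_logit(abs(u)) for u in raw_log_odds]

print("without abs:", p_without_abs)
print("with abs:   ", p_with_abs)
```

With the absolute value, every probability is at least 0.5 and the values for `u` and `-u` coincide, so the sign of the log odds cannot be recovered from the data.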