# Modeling an ordered vector scaled by unknown parameter

This may be a silly question, but I have tried different reparametrizations and priors without any success.
I have an ordered data vector as input. This ordered data vector is scaled by an unknown parameter `theta`. When I give Stan the true value of `theta`, I can correctly infer the parameters `delta` and `T` associated with the distribution of the unscaled data vector (i.e., the data divided by `theta`).

``````
data {
  int<lower=2> n;
  vector<lower=0>[n] sorted_vector_scaled_by_theta;
}
parameters {
  real<lower=0.001> theta;
  real<lower=0> hyper_parameter;
  real<lower=0> delta;
  real<lower=0> T;
}
model {
  theta ~ exponential(10);

  // Other priors I have tried for theta:
  // log_theta = log(theta);
  // log_theta ~ uniform(1, 1000);
  // theta ~ lognormal(20, 100);
  // log_theta ~ student_t(3, 1, 3);
  // target += -log_theta;

  hyper_parameter ~ gamma(0.001, 0.001);
  delta ~ exponential(hyper_parameter);
  target += log_likelihood(n, delta, T, (1.0 / theta) * sorted_vector_scaled_by_theta);
}
``````

where `log_likelihood` is a custom likelihood function of `sorted_vector_scaled_by_theta` given `n`, `delta`, and `T`.

I have tried different priors for `theta`: uniform, log uniform, log normal, … but none of them can recover the true values of `delta` and `T`. Can anyone help me with this modeling problem?

P.S.: I have thought of modeling `log(sorted_vector_scaled_by_theta)` instead; then every element is shifted by the same value, `log(theta)`.
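As a quick numerical check of that identity (plain Python with made-up values, not part of the Stan model), multiplying the data by `theta` does become a common additive shift of `log(theta)` on the log scale:

```python
import math

# Hypothetical unscaled values and scale, chosen only for illustration.
theta = 200.0
x = [0.01, 0.05, 0.4, 2.0]
scaled = [theta * xi for xi in x]

# log(theta * x_i) - log(x_i) should equal log(theta) for every element.
shifts = [math.log(s) - math.log(xi) for s, xi in zip(scaled, x)]
same_shift = all(abs(d - math.log(theta)) < 1e-9 for d in shifts)
print(same_shift)  # True
```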

Assuming that domain expertise does not permit you to place a highly informative prior on `theta`, the solution almost surely depends on what your function `log_likelihood` is. Perhaps when `theta` is a free parameter the model is not well identified?


The function `log_likelihood` is just a loop over the unscaled `sorted_vector_scaled_by_theta` (that is why I divide by `theta` before passing the vector). The funny thing is that the model runs and converges:

``````
Inference for Stan model: stan_model_one_population_theta.
4 chains, each with iter=5000; warmup=2500; thin=1;
post-warmup draws per chain=2500, total post-warmup draws=10000.

                   mean se_mean     sd   2.5%    25%    50%    75%   97.5% n_eff Rhat
hyper_parameter  229.38    5.70 424.95   0.95  16.80  68.38 248.46 1446.66  5562    1
delta              0.02    0.00   0.05   0.00   0.00   0.01   0.03    0.16  4297    1
theta              7.84    0.01   0.67   6.58   7.38   7.82   8.28    9.24  4947    1
T                 56.24    2.71 167.52   5.31  12.46  23.17  48.05  320.94  3831    1
lp__             -87.94    0.02   1.47 -91.70 -88.64 -87.59 -86.86  -86.18  3691    1

Samples were drawn using NUTS(diag_e) at Tue May  4 16:36:48 2021.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
``````

But the values are wrong: the true values from my simulated data are theta = 200, delta = 100, T = 0.042.
With other prior choices I get parameter values like 10^30 and an Rhat of NaN.

Does this imply non identifiability?

I still don’t understand what `log_likelihood` is doing.

In general, if changing the priors causes huge swings in your inference, it suggests that you might have an identifiability problem.

Maybe I can illustrate the sort of non-identifiability I’m talking about with a very simplified example. Suppose we have a data vector `x`, and suppose that we wish to model `x` as normally distributed with mean `mu` and standard deviation `sigma`, where both `mu` and `sigma` are to be estimated from the data:

``````
data {
  int<lower=0> N;
  vector[N] x;
}

parameters {
  real mu;
  real<lower=0> sigma;
}

model {
  mu ~ ...    // prior for mu
  sigma ~ ... // prior for sigma
  x ~ normal(mu, sigma);
}
``````

Easy enough.

But now suppose that instead of data `x` I have data `x_scaled`, which are scaled by some unknown multiplicative constant `inv_theta`.

``````
data {
  int<lower=0> N;
  vector[N] x_scaled;
}

parameters {
  real mu;
  real<lower=0> sigma;
  real<lower=0> inv_theta;  // I parameterize by the inverse of theta
                            // to avoid a nontrivial Jacobian adjustment
}

model {
  vector[N] x = x_scaled * inv_theta;
  inv_theta ~ ... // prior for inv_theta
  mu ~ ...        // prior for mu
  sigma ~ ...     // prior for sigma
  x ~ normal(mu, sigma);
}
``````

Notice that for any value of `inv_theta`, I can get a likelihood equivalent to the previous model by scaling both `mu` and `sigma` by `inv_theta`. The likelihood is completely agnostic about what value of `inv_theta` I pick. ALL of the information in the model about the correct value to choose for `inv_theta` is contained in the priors on `mu`, `sigma`, and `inv_theta`. The data have absolutely nothing to say about this choice. And so for this model, you would never expect to be able to infer the correct values of `mu`, `sigma` or `inv_theta` from data. Maybe something similar is going on in your model, but again it depends on the actual likelihood function that you’re using.
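To make the ridge concrete, here is a small Python check (illustrative only, with made-up data): the normal log-likelihood of `inv_theta * x_scaled` under `(inv_theta * mu, inv_theta * sigma)` differs from the unscaled log-likelihood only by `-N * log(inv_theta)`, a term involving neither `mu`, `sigma`, nor the data values, so the data cannot prefer any particular `inv_theta`:

```python
import math

def normal_logpdf(x, mu, sigma):
    # Log density of Normal(mu, sigma) at x.
    return (-0.5 * math.log(2 * math.pi) - math.log(sigma)
            - 0.5 * ((x - mu) / sigma) ** 2)

def loglik(xs, mu, sigma):
    return sum(normal_logpdf(x, mu, sigma) for x in xs)

x_scaled = [1.2, 0.7, 2.3, 1.9, 0.4]  # made-up data
mu, sigma = 1.3, 0.8
base = loglik(x_scaled, mu, sigma)    # corresponds to inv_theta = 1

offsets = []
for inv_theta in (0.5, 2.0, 10.0):
    ll = loglik([inv_theta * x for x in x_scaled],
                inv_theta * mu, inv_theta * sigma)
    # ll - base should equal -N * log(inv_theta) exactly, for every inv_theta.
    offsets.append(ll - base + len(x_scaled) * math.log(inv_theta))

print(all(abs(o) < 1e-9 for o in offsets))  # True
```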


I fully understand your example: if `x ~ normal(mu, sigma)`, then this is equivalent to `x ~ (1.0/inv_theta) * normal(inv_theta*mu, inv_theta*sigma)`. My case is more complex, since there is a one-to-one function g(.) such that `g(sorted_vector_scaled_by_theta[i+1]) - g(sorted_vector_scaled_by_theta[i]) ~ exponential(.)`. I will check whether there is a change of variables that produces the same likelihood (a non-identifiability problem). Thanks.
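One special case is easy to check numerically: if g is the identity (an assumption on my part; with a nonlinear g the conclusion may differ), the spacings of the scaled vector are `theta` times exponential draws, so the density of an observed spacing depends on the exponential rate and `theta` only through their ratio. A Python sketch with hypothetical values:

```python
import math

def spacing_logpdf(s, rate, theta):
    # If s / theta ~ Exponential(rate), then including the 1/theta Jacobian
    # the density of s is (rate / theta) * exp(-(rate / theta) * s):
    # it depends on (rate, theta) only through rate / theta.
    lam = rate / theta
    return math.log(lam) - lam * s

spacings = [0.3, 1.1, 0.2, 0.8]  # hypothetical observed spacings

# Two very different (rate, theta) pairs with the same ratio rate / theta = 0.5:
ll_a = sum(spacing_logpdf(s, rate=100.0, theta=200.0) for s in spacings)
ll_b = sum(spacing_logpdf(s, rate=1.0, theta=2.0) for s in spacings)
print(abs(ll_a - ll_b) < 1e-9)  # True: only the ratio is identified
```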