About Jacobian adjustment on some complex models

I have a doubt about how to calculate the Jacobian adjustment for a change of variables. The problem is somewhat related to this post: https://groups.google.com/g/stan-users/c/04GSu-ql3vM.

I have a vector of parameters deltas and parameters max_time_{1}, max_time_{2}, … that are drawn from a custom conditional distribution given the corresponding delta:

max_time_{i} \sim conditionalDensityTime(deltas_{i}, total_sample_size) .

And max_time_{1} < max_time_{2} < … (I can have several max_times).

Now I have an array of positive_ordered vectors: inner_times. inner_times[1] is an ordered vector of times between 0 and max_time_{1} * x[1]/x[2], and inner_times[2] is an ordered vector of times between max_time_{1} * x[1]/x[2] and max_time_{2} * x[2]/x[2] = max_time_{2}.

I managed to do that using simplex vectors, one per interval between max_time_{i-1} * x[i-1]/x[2] and max_time_{i} * x[i]/x[2] (following this post: https://groups.google.com/g/stan-users/c/04GSu-ql3vM).

But in that post the limits of the intervals were constants, whereas in my case they are drawn from a distribution: conditionalDensityTime.

How should I include the Jacobian adjustment in this case? (In particular, for the extra terms I needed to add to pad the positive_ordered vectors inner_times to the same length.)

(The distribution for the inner times, conditionalDensityInnerTimes, also has a _rng function given deltas, [max_time1, max_time2]', max_times_in_oldest_population_scale, and sample_sizes, but it seems _rng functions are only for the generated quantities block and post-hoc analysis.)

```stan
functions {
  // pad a strictly increasing vector with extra increasing values
  // so that all padded vectors share the same length
  vector padd_vector(vector ordered_vector1, int total_padded) {
    vector[total_padded] result;
    real max_value = ordered_vector1[rows(ordered_vector1)];
    int pos = rows(ordered_vector1) + 1;
    result[1:rows(ordered_vector1)] = ordered_vector1;
    while (pos <= total_padded) {
      result[pos] = max_value + pos;
      pos = pos + 1;
    }
    return result;
  }
  real conditionalDensityTime_lpdf(real max_time, real delta, int total_sample_size) {
    ...........
  }
  real conditionalDensityInnerTimes_lpdf(vector[] inner_times, vector deltas, vector max_times,
                                         vector max_times_in_oldest_population_scale, vector x) {
    ...........
  }
  // some other functions
}
data {
  // some other data
  int<lower=2> total_sample_size;
  int<lower=0> sample_sizes[2];
  simplex[2] x;
}
parameters {
  vector<lower=0>[2] deltas;
  real<lower=0> max_time1;
  real<lower=(max_time1 * x[1] / x[2])> max_time2;
  simplex[sample_sizes[1]] simplex1;
  simplex[sample_sizes[2]] simplex2;
  // some other parameters
}
transformed parameters {
  positive_ordered[2] times_in_oldest_population_scale;
  positive_ordered[total_sample_size - 1] inner_times[2];

  times_in_oldest_population_scale[1] = (x[1] / x[2]) * max_time1;
  times_in_oldest_population_scale[2] = max_time2;

  // add some increasing values after the first sample_sizes[1] elements
  inner_times[1] = padd_vector(head(cumulative_sum(simplex1), sample_sizes[1]) * max_time1,
                               total_sample_size - 1);
  // add some increasing values after the first sample_sizes[2] elements
  inner_times[2] = padd_vector(max_time1 + head(cumulative_sum(simplex2), sample_sizes[2])
                                           * (max_time2 - max_time1),
                               total_sample_size - 1);
}
model {
  deltas ~ exponential(rep_vector(1, 2));
  max_time1 ~ conditionalDensityTime(deltas[1], total_sample_size);
  max_time2 ~ conditionalDensityTime(deltas[2], total_sample_size);
  inner_times ~ conditionalDensityInnerTimes(deltas, [max_time1, max_time2]',
                                             times_in_oldest_population_scale, x);

  // some other model statements
}
```
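To illustrate what padd_vector does outside Stan, here is the same padding logic as a plain Python sketch (the values below are arbitrary, just for checking the behaviour):

```python
def padd_vector(ordered, total_padded):
    """Mirror of the Stan padd_vector: pad a strictly increasing
    list to length total_padded with values that keep it increasing."""
    result = list(ordered)
    max_value = ordered[-1]
    pos = len(ordered) + 1  # 1-based position, as in the Stan code
    while pos <= total_padded:
        result.append(max_value + pos)
        pos += 1
    return result

print(padd_vector([1.0, 2.0], 4))  # [1.0, 2.0, 5.0, 6.0]
```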


Hi,
sorry for not getting back to you earlier; your question is relevant. It is a bit outside my expertise, but I will give it a try since nobody else has answered. Please double-check my reasoning to see if it all makes sense to you.

Just to clarify (you probably already know this): you only need a Jacobian adjustment when (very roughly speaking) a transformed value is on the left of a sampling statement. Stan takes care of all the Jacobians required by the parameters block. So we only need to care about the inner_times ~ conditionalDensityInnerTimes(...) statement.

Further, we need the transformation to be one-to-one, so that the Jacobian is a square matrix and its determinant is non-zero. This is where I think you'll run into trouble, because it appears to me that inner_times does not have a one-to-one mapping with the underlying parameters (which I assume are simplex1 and simplex2), but I might be wrong, as I definitely don't understand your model very well.

Best of luck with your model!

Hello Martin
Thanks for your comments.
There is indeed a one-to-one mapping between simplex1[1:(sample_sizes[1]-1)] and inner_times[1][1:(sample_sizes[1]-1)]:
inner_times[1,1] = simplex1[1] * max_time1
inner_times[1,2] = (simplex1[1] + simplex1[2]) * max_time1
inner_times[1,3] = (simplex1[1] + simplex1[2] + simplex1[3]) * max_time1
…
inner_times[1,sample_sizes[1]-1] = (simplex1[1] + … + simplex1[sample_sizes[1]-1]) * max_time1

And inner_times follows the distribution conditionalDensityInnerTimes (not the simplex vector simplex1).
The Jacobian matrix ((sample_sizes[1]-1) x (sample_sizes[1]-1)) is simply the lower triangular matrix

J = d inner_times[1][1:(sample_sizes[1]-1)] / d simplex1[1:(sample_sizes[1]-1)] =

max_time1      0           0          …    0
max_time1  max_time1       0          …    0
max_time1  max_time1   max_time1      …    0
…
max_time1  max_time1   max_time1      …  max_time1

so log(det(J)) = (sample_sizes[1]-1) * log(max_time1)
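This closed form is easy to check numerically. A small Python sketch (arbitrary illustrative numbers, with n standing in for sample_sizes[1]) that compares the finite-difference Jacobian of the cumulative-sum map against the formula:

```python
import numpy as np

max_time1 = 2.5
n = 5
s = np.array([0.1, 0.2, 0.25, 0.15])  # first n-1 coordinates of the simplex

def transform(s):
    # inner_times[1][i] = (s[1] + ... + s[i]) * max_time1
    return np.cumsum(s) * max_time1

# finite-difference Jacobian of the map: simplex coordinates -> inner times
eps = 1e-7
J = np.zeros((n - 1, n - 1))
for j in range(n - 1):
    sp = s.copy()
    sp[j] += eps
    J[:, j] = (transform(sp) - transform(s)) / eps

sign, logdet = np.linalg.slogdet(J)
closed_form = (n - 1) * np.log(max_time1)
print(np.isclose(logdet, closed_form))
```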
In my case max_time1 is also a parameter of the model (it follows the distribution conditionalDensityTime conditional on deltas). Then my questions are:

1 - can I apply the Jacobian adjustment to only a subset of the parameters?
2 - if we have 2 subsets of parameters that need a Jacobian adjustment (simplex1 and simplex2, mapping to inner_times[1] and inner_times[2] respectively) with different cardinalities, can I still use a

 positive_ordered[total_sample_size-1] inner_times[2];

to store both? (where I pad extra values into the positive_ordered vectors so that they all have the same size)


So I think (with emphasis on the fallibility of my understanding of this) that in this case you should IMHO put a prior only on the first sample_sizes[1] - 1 elements of inner_times (plus the Jacobian adjustment).

If you put a prior on all elements, it is IMHO a bit like putting two priors on the same parameter, and it might do weird things. Generally, it is often useful to think about how you would simulate data from the model - e.g. what would a random number generator for inner_times look like?
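To make that concrete, here is a hypothetical Python sketch of such a generator. The uniform draws are only a placeholder for whatever conditionalDensityInnerTimes actually implies; the point is the support (0, max_time1) and the ordering:

```python
import random

def simulate_inner_times(n_inner, max_time1, seed=0):
    """Hypothetical sketch: draw n_inner times in (0, max_time1)
    and sort them, matching the structure the model gives
    inner_times[1] via the cumulative sum of a simplex."""
    rng = random.Random(seed)
    return sorted(rng.uniform(0.0, max_time1) for _ in range(n_inner))

times = simulate_inner_times(4, 2.0)
# times is ordered and every element lies in (0, max_time1)
```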

If you put the prior only on the first elements, then since the Jacobian is a triangular matrix, I think your adjustment (the log determinant) would simply be target += log(max_time1) * (sample_sizes[1] - 1); (and the same for the second part of inner_times).

P.S. note that you can use triple backticks (```) to mark code blocks in your post (and single backticks for inline code)


The only reason I used the simplex vectors simplex1 and simplex2 is to enforce the constraints inner_times[1][1:(sample_sizes[1]-1)] < max_time1 and inner_times[2][1:(sample_sizes[2]-1)] < max_time2 respectively. simplex1 and simplex2 are just used to generate values smaller than max_time1 and max_time2.
I don't have any prior information on simplex1 and simplex2. There is a probability density for the vectors inner_times[1][1:(sample_sizes[1]-1)] and inner_times[2][1:(sample_sizes[2]-1)] given deltas[1], max_time1 and deltas[2], max_time2 respectively (the distribution conditionalDensityInnerTimes).
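That constraint is easy to check numerically: for any simplex vector, the first n-1 cumulative sums are strictly below 1, so scaling by max_time1 keeps the inner times below max_time1. A quick Python sketch with made-up numbers:

```python
simplex1 = [0.2, 0.1, 0.4, 0.3]  # sums to 1, like a Stan simplex
max_time1 = 3.0

# inner_times[1][i] = (simplex1[1] + ... + simplex1[i]) * max_time1
cumsum = []
total = 0.0
for s in simplex1:
    total += s
    cumsum.append(total)
inner = [c * max_time1 for c in cumsum[:-1]]  # last cumsum == 1, i.e. max_time1

print(all(t < max_time1 for t in inner))  # True: all strictly below max_time1
print(inner == sorted(inner))             # True: still increasing
```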
