Vectorizing prior sampling and normalizing prior samples after sampling

I want to normalize my parameters after sampling them from the prior distribution and before using them to perform posterior sampling. I do not need a second, normalized set of parameters; rather, I need to ensure that my parameters are always normalized, because otherwise they do not make sense. My code looks something like this:

data {
  real zero;
}

transformed data {
  matrix[5, 3] n = [[ 0.7660444, zero, 0.7660444],
                    [-0.7660444, zero, 0.6427876],
                    [zero, 0.7660444, -0.6427876],
                    [zero, -0.6427876, -0.7660444],
                    [zero, 0.6427876, 0.6427876]];
  for (i in 1:5) {
    n[i, ] = n[i, ]./norm2(n[i, ]);
  }
}

parameters {
  matrix[5, 3] m;
}

model {
  for (i in 1:3){
    for (j in 1:5) {
      m[j, i] ~ uniform(m[j, i]-0.01, m[j, i]+0.01);
    }
  }

  for (i in 1:5) {
    m[i, ] = m[i, ]./norm2(m[i, ]);
  }
  target += sum(m + n)^2;
}

The issue is that

  for (i in 1:5) {
    m[i, ] = m[i, ]./norm2(m[i, ]);
  }

doesn’t work, i.e. I cannot perform calculations like this in the model block. How can I solve this? I was considering using a transformed parameters block, but I don’t think that I understand it enough to implement it here. Specifically, I am not sure whether my normalized parameters would be used for posterior sampling.

A side issue I have here is using a for-loop to sample from a prior distribution for a matrix parameter. Is it possible to do this more efficiently, for example with some kind of vectorized prior sampling statement (for a uniform or normal distribution)?

The normalization in question converts each row vector to unit length by dividing it by its norm, i.e. m[i, ] / norm2(m[i, ]).

This won’t work in Stan because you’re reassigning to a parameter, and parameters are set from the outside by the sampling algorithm. Instead, you need to define a new variable m_norm and set it equal to the normalized value.
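
For instance, a minimal sketch of that approach (the name m_norm and the target term are purely illustrative, and the identification problem discussed next still applies):

parameters {
  matrix[5, 3] m;
}
transformed parameters {
  matrix[5, 3] m_norm;
  for (i in 1:5) {
    m_norm[i, ] = m[i, ] / norm2(m[i, ]);  // rescale each row to unit length
  }
}
model {
  target += sum(m_norm + n)^2;  // use the normalized copy; never reassign m itself
}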

This is going to be a problem for HMC in that it doesn’t identify the actual parameters, because (-100, 100) and (-1, 1) normalize to the same value, (-1/sqrt(2), 1/sqrt(2)). A better approach is to define m as an array of unit vectors,

array[5] unit_vector[3] m;

This will make sure m[k] is a unit vector of size 3 for k in 1:5.
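
For context, here is a minimal sketch of how that declaration slots into a program, with n as in your transformed data block (the model-block use of m[k] is only a placeholder):

parameters {
  array[5] unit_vector[3] m;  // each m[k] has unit length by construction
}
model {
  for (k in 1:5) {
    target += dot_product(m[k], n[k, ]);  // m[k] behaves like an ordinary vector[3]
  }
}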

I’m not sure what you’re trying to do with m[j, i] ~ uniform(m[j, i]-0.01, m[j, i]+0.01);, because m shows up on both the right and the left. Do you want to declare a new parameter that’s sampled very close to m? That would work in Stan’s language, though it’s an unusual approach (hard-coded boundaries that are not physical are usually a bad idea in Bayesian models on both statistical and computational grounds). Assuming you declare a new parameter matrix[5, 3] almost_m, you can write

to_vector(almost_m) ~ uniform(to_vector(m) - 0.01, to_vector(m) + 0.01);

The real problem here is that this will only work if you’ve declared almost_m with matching bounds, and we’re not quite set up to do that with matrices. What you’d really want is something like:

matrix<lower=m - 0.01, upper=m + 0.01>[5, 3] almost_m;

but I don’t think Stan supports that (you can try). If not, you need to declare almost_m as a vector and then convert back to a matrix.


Thanks for the normalization response! I didn’t know about the unit_vector declaration. That is useful; however, in the real code I need to perform a rows_dot_product() operation on n and m during posterior sampling, so would it be possible to enforce the constraint while using only linear-algebra data types? Matrices work great here because I can pass them to rows_dot_product() and Stan figures out what to do. I was thinking of just declaring 5 unit vectors separately, but that would require for-loops and the posterior function would need to take more arguments (10 instead of 2), and I am concerned that would significantly decrease computational efficiency. Is there any elegant way to both have unit-vector constraints and perform linear algebra operations?

Also, just to confirm, does subsetting array[5] unit_vector[3] m; work the same way as subsetting array[5, 3] real;, i.e. is m[3, 3] the third element of the third unit vector? (Sorry, I couldn’t find this specific case in the docs.)

What I meant by m[j, i] ~ uniform(m[j, i]-0.01, m[j, i]+0.01); is that the current iteration’s value of m is sampled from a uniform distribution centered on the previous iteration’s value of m. I hope my sampling statement achieves that; please correct me if I’m wrong!

What I wanted was to improve efficiency by removing the for-loops. Similar to your suggestion, I have tried m ~ uniform(m - 0.01, m + 0.01);, but it tells me that the arguments of uniform() must be reals. I was hoping there is something like a vectorized sampling statement, as I believe that would not be too niche to implement. I searched through the docs, but I could only find the “regular” uniform and Gaussian sampling statements.

Loops are super-fast in Stan if there’s no autodiff going on. So you can declare things this way:

array[5] unit_vector[3] m;

matrix[5, 3] mm;
for (k in 1:5) {
  mm[k] = m[k]';  // note that ' is used to transpose here to assign to a row vector
}

Now mm is a matrix with rows that are unit vectors. If you want columns that are unit vectors instead, that’d be:

matrix[3, 5] mm;
for (j in 1:5) {
  mm[ , j] = m[j];  // no transposition necessary to assign to column vector
}

With only 15 entries this is going to be super fast and won’t exert a lot of memory pressure.
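
And to connect this back to the rows_dot_product() question: once the unit vectors are packed into mm row-wise, mm can be passed to linear-algebra functions like any other matrix (the target increment here is only a placeholder):

model {
  target += sum(rows_dot_product(mm, n));  // dot product of each row of mm with the matching row of n
}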

As for m[j, i] ~ uniform(m[j, i]-0.01, m[j, i]+0.01);: this can’t be what you intend, because m shows up on both the left and the right. What I was suggesting is that if you really, really want a jittered version of m (I can’t see where that’d ever be useful, but there are a lot of models out there), then you need to declare a new parameter:

parameters {
  real<lower=m - 0.01, upper=m + 0.01> m_jittered;  // lower and upper are critical here
...
model {
  m_jittered ~ uniform(m - 0.01, m + 0.01);
...

Then you need to take this up to vectors with either a loop or vectorization. If your initial m is a matrix, then you can do this:

parameters {
  vector[A * B] m_flat;  // A and B are the dimensions of the original matrix, declared as data
  vector<lower=m_flat - 0.01, upper=m_flat + 0.01>[A * B] m_flat_jitter;
  ...
transformed parameters {
  matrix[A, B] m = to_matrix(m_flat, A, B);
  matrix[A, B] m_jitter = to_matrix(m_flat_jitter, A, B);
  ...
model {
  m_flat_jitter ~ uniform(m_flat - 0.01, m_flat + 0.01);
  ...
}