Better way of modelling stick-breaking process?

Hello,

I just started learning Stan so forgive me if this is asinine but I wrote the following program to model a stick-breaking process:

data {
	int<lower=0> N;
	int<lower=0> M;
	real<lower=0, upper=1> y[N,M];
}
parameters {
	real<lower=0> a;
	real<lower=0, upper=1> b[N,M];
}
transformed parameters {
	real<lower=0, upper=1> B[N,M];
	B = b;
	for (i in 1:N)
		for (j in 1:M)
			for (k in 1:j-1)
				B[i,j] *= 1 - b[i,k];
}
model {
	for (i in 1:N)
		b[i] ~ beta(1, a);
	for (i in 1:N)
		for (j in 1:M)
			y[i,j] - B[i,j] ~ normal(0, 0.001);
}

Originally I simply wanted to write y = B; in the model section but I realized that wasn’t allowed so I instead constrained their difference with a very tight normal. It works pretty well but I can’t help but feel this is some kind of hack.

I’m also getting the following warning from the compiler, further contributing to my sense of unease:

Warning (non-fatal):
Left-hand side of sampling statement (~) may contain a non-linear transform of a parameter or local variable.
If it does, you need to include a target += statement with the log absolute determinant of the Jacobian of the transform.
Left-hand-side of sampling statement:
    (y[i, j] - B[i, j]) ~ normal(...)

Here is a link to the Mathematica notebook file with all of my results and reports from the sampler:
https://drive.google.com/open?id=1kasphL57WAFAwIikaH-C4SCaiQNMC81E

Thanks for any input.
-Marcos

Why do you want to model a stick-breaking process? It’s built into our simplex data type when you do this:

simplex[K] theta;

That makes sure that theta[k] > 0 for all k in 1:K and makes sure sum(theta) = 1. It works under the hood as a stick-breaking process as outlined in the manual.

Sorry nobody answered this before, but you’re absolutely right. You do not want to be using this kind of hack.

As it should be. You have a non-linear transform on the left-hand side of a sampling statement, which is exactly what this warning is supposed to be doing!

You can’t assign to data. So I’m not even sure what you’re trying to do here.

You can vectorize your b sampling to just b ~ beta(1, a); and same for normal, but you don’t want to do that normal trick.

What you want is something like this to draw the breaks from a beta distribution:

parameters {
  vector<lower=0, upper=1>[N - 1] breaks;
  ...
transformed parameters {
  simplex[N] b;
  {
    real sum = 0;
    b[1] = breaks[1];
    sum = b[1];
    for (n in 2:(N - 1)) {
      b[n] = (1 - sum) * breaks[n];
      sum += b[n];
    }
    b[N] = 1 - sum;
  }
  ...
model {
  breaks ~ beta(1, a);

I didn’t test that, but it’s the right way to lay the code out for stick-breaking, with the result being written into the simplex b. You can just loop if you need to break more sticks.

1 Like

Hey thanks for the reply.

Why do you want to model the stick-breaking process?

Mostly just as an academic exercise in trying to write a program that would take a set of N samples from one truncated at length M and estimate the parameter used to generate them. I wasn’t aware of the simplex type but I will definitely read up on it.

You can’t assign to data. So I’m not even sure what you’re trying to do here.

Okay, I see that now, my bad. My basic understanding of Bayesian inference is that one comes up with a story about how their data was generated and then creates a model to replicate that, so I just kind of wrote how one gets to the broken sticks from a list of beta random variables (since that’s how I generated them) and wanted to “explain” the input data by saying “y = B”.

I’ll study the code you’ve posted, but I’d also like to ask about a revised version that I wrote, which is as follows:

data {
    int<lower=0> N;
    int<lower=0> M;
    real<lower=0, upper=1> y[N,M];
}
parameters {
    real<lower=0> a;
}
transformed parameters {
    real<lower=0, upper=1> y_[N,M];
    y_ = y;
    for (i in 1:N)
        for (j in 1:M)
            for (k in 1:j-1)
                y_[i,j] /= 1 - y_[i,k];
}
model {
    for (i in 1:N)
        y_[i] ~ beta(1, a);
}

This kind of explains the input data in reverse. working from the broken sticks back to the list of beta distributed RVs. This seems to do the job without any hacks and runs much more quickly. However, I’m still getting a warning about non-linear transformations for the line “y_[i] ~ beta(1, a);”, so is this still improper?

Thanks again for the reply.

I don’t understand this. the y_ should be defined as transformed data as they don’t involve the parameter a. Also, you can vectorize,

y_ ~ beta(1, a);
1 Like

Ah okay, I moved it to the transformed data block. No more compiler warnings! Thanks a lot.

Also when I try to vectorize I get the following:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:

No matches for: 

  real[,] ~ beta(int, real)

y and y_ are matrices that hold multiple samples from the SBP and corresponding reconstructed beta variables, respectively. I guess that can’t be vectorized.

Thanks again.

If it says real[,], you’re dealing with a 2D array, not a matrix. With a 2D array, you can use to_array_1d() to convert it to a 1D array, which you can use on the left-hand side. (It’s more efficient to do this conversion than to loop because we can compress the expression graph for autodiff if we get it all at once.)