Stochastic Block Model

Hi. I am new to stan and I am trying to implement a Stochastic Block Model as specified below:

Bayesian Stochastic Block models Note.pdf (118.8 KB)

I am trying to sample from the density given by the last expression on page 2. Here is my stan code:

// The input data is a vector 'y' of length 'N'.
data {
  int<lower=1> k; // number of categories or clusters
  int<lower=2> n; // number of vertices (nodes) in the network 
  int<lower=0, upper=1> A[n,n]; // the sample graph (network)
  vector<lower=1>[k] n_k; // number nodes that fall into a category ***Note*** n_k = n_r
  vector<lower=0>[k] alpha; // a vector of alphas that parameterize the Dirichlet distribution
  int<lower=1> z[n]; // class or cluster assignments 
  matrix<lower=0>[k,k] n_rs; // Constant used in the Gibbs sampling
  matrix<lower=0>[k,k] A_rs; // Constant used in the Gibbs sampling
}


parameters {
  simplex[k] p[n]; // membership probability per vertex
}

model {
  matrix[k,k] Qrs;
  for(r in 1:k){
    for(s in 1:k){
      Qrs[r,s] = beta(1 + A_rs[r,s],1 + n_rs[r,s] - A_rs[r,s]); 
    }
  }
  
  for(i in 1:n){
    for(j in 1:n){
      target += A[i,j]*log(Qrs[z[i],z[j]] + .01) + (1-A[i,j])*log(1-Qrs[z[i],z[j]] + .01);
    }
    for(l in 1:n){
      target += A[l,i]*log(Qrs[z[l],z[i]] + .01) + (1-A[l,i])*log(1-Qrs[z[l],z[i]] + .01);
    }
  }
}

I know that the following lines are not allowed in the model block because the _rng functions cannot be used in the model block, but how do I generate the beta values then?

  for(r in 1:k){
    for(s in 1:k){
      Qrs[r,s] = beta_rng(1 + A_rs[r,s],1 + n_rs[r,s] - A_rs[r,s]); 
    }
  }

The key problem is that I need to first draw from a beta distribution and then use the result of the random draws in the update. How do I do this in stan?

Welcome. One of the main things to recognize is that the MCMC algorithms in Stan are not Gibbs samplers, so you just have to specify the (log) kernel of the joint distribution of the parameters and the data (rather than any full-conditional distributions). That corresponds to equations 1 – 4 of the PDF you linked to.

In Stan code, I think that would be

data {
  ...
}
parameters {
  simplex[k] p[n]; // membership probability per vertex
  matrix[k, k] Q;  // has a implicit standard uniform prior
}

but this model seems have a discrete unknown z, which you would need to marginalize out in order to do this with Stan. See the chapter in the user guide on Hidden Markov Models.

2 Likes