RuntimeError: Something went wrong after call_sampler

Hi,
I am new to Stan, attempting to fit SBMs. I have successfully managed to do so for simple problems, but once I specify priors for some parameters the model fails. I am using pystan, you can see the model code below:

data {
  int<lower=1> K;
  int<lower=2> N;
  int<lower=0, upper=1> Adj[N,N];
  vector<lower=0>[K] alpha;
  vector<lower=0, upper=1>[K*(K+1)/2] pref_prob;
}

parameters {
  simplex[K] brk[N];
}

model {
  for (w in 1:N)
    brk[w] ~ dirichlet(alpha);
  
  for (u in 1:(N-1)) {
    for (v in (u+1):N) {
      real p = 0;
      for (i in 1:K) {
        for (j in 1:i) {
          p += brk[u][i]*brk[v][j]*pref_prob[(i*(i-1)/2)+j];
        }
      }
      for (i in 1:(K-1)) {
        for (j in (i+1):K) {
          p += brk[u][i]*brk[v][j]*pref_prob[(j*(j-1)/2)+i];
        }
      }
      
      Adj[u][v] ~ bernoulli(p);
    } 
  }
}
"""

The above model fails when I call pystan optimizing() to get the MAP, with the error below. However, when I remove the prior on brk above (the dirichlet distribution), it works fine. GCC version is 8.3.0.


RuntimeError Traceback (most recent call last)
in
4 #fit = sm3.optimizing(data=data3)
5 #fit = sm4.optimizing(data=data4, verbose=True)
----> 6 fit = sm5.optimizing(data=data5, verbose=True)

~/python/lib/python3.8/site-packages/pystan/model.py in optimizing(self, data, seed, init, sample_file, algorithm, verbose, as_vector, **kwargs)
579 stan_args = pystan.misc._get_valid_stan_args(stan_args)
580
–> 581 ret, sample = fit._call_sampler(stan_args)
582 pars = pystan.misc._par_vector2dict(sample[‘par’], m_pars, p_dims)
583 if not as_vector:

stanfit4anon_model_bcae1a414bf41039bdb99be5c39eead6_3996171161941223032.pyx in stanfit4anon_model_bcae1a414bf41039bdb99be5c39eead6_3996171161941223032.StanFit4Model._call_sampler()

stanfit4anon_model_bcae1a414bf41039bdb99be5c39eead6_3996171161941223032.pyx in stanfit4anon_model_bcae1a414bf41039bdb99be5c39eead6_3996171161941223032._call_sampler()

RuntimeError: Something went wrong after call_sampler.

It’s the lines here:

for (w in 1:N)
    brk[w] ~ dirichlet(alpha);

Right?

Can you save the input you pass to PyStan as a .json and post it here? Then I can take your model and try to reproduce the problem.

Yes, that’s it. Here is the full code, I also attached the data dictionary in a json.

import pystan
import igraph
import collections
import numpy as np
import json

model = """
data {
  int<lower=1> K;
  int<lower=2> N;
  int<lower=0, upper=1> Adj[N,N];
  vector<lower=0>[K] alpha;
  vector<lower=0, upper=1>[K*(K+1)/2] pref_prob;
}

parameters {
  simplex[K] brk[N];
}

model {
  for (w in 1:N)
    brk[w] ~ dirichlet(alpha);
  
  for (u in 1:(N-1)) {
    for (v in (u+1):N) {
      real p = 0;
      for (i in 1:K) {
        for (j in 1:i) {
          p += brk[u][i]*brk[v][j]*pref_prob[(i*(i-1)/2)+j];
        }
      }
      for (i in 1:(K-1)) {
        for (j in (i+1):K) {
          p += brk[u][i]*brk[v][j]*pref_prob[(j*(j-1)/2)+i];
        }
      }
      
      Adj[u][v] ~ bernoulli(p);
    } 
  }
}
"""

block_sizes = [30, 30]
pref_matrix=[[0.3,0.05], [0.05,0.6]]
network = igraph.Graph.SBM(n=np.sum(block_sizes),
                           pref_matrix=pref_matrix,
                           block_sizes=block_sizes,
                           directed=False, loops=False)
types = [1, 2]
type_list = []
for i, block_size in enumerate(block_sizes):
  type_list.extend([types[i]]*block_size)
network.vs['type'] = types
print(collections.Counter(network.vs['type']))

sm = pystan.StanModel(model_code=model)
data = {'K': len(set(network.vs['type'])),
        'N': len(network.vs),
        'Adj': network.get_adjacency().data,
        'pref_prob': [pref_matrix[0][0], pref_matrix[0][1], pref_matrix[1][1]],
        'alpha': [0.9,6]
       }
fit = sm.optimizing(data=data)

It seems the fit fails only for a specific set of alpha parameters. For example [0.9, 6] fails but [1,6] does not.
data.txt (10.7 KB)

Is the data file you attached one for which it fails? This is running for me.

My script is something like:

library(cmdstanr)

model = cmdstan_model("run.stan")

fit = model$sample(data = "data.json", chains = 4, cores = 4)

If it’s not working, maybe try the cmdstanpy interface and see if that works.

Optimization can fail because a global maximum does not always exist or it is at a boundary. This is something to consider when using Dirichlet priors. The Dirichlet distribution is zero-avoiding when alpha > 1, flat when alpha=1 and has a infinitely sharp peak at zero when alpha < 1. Your model works when alpha > 1 and there are several values in the brk array whose optimum behaves like this:

brk-opt

The optimization sometimes works (or at least returns a result) with alpha=0.9999 but often fails with something about non-finite gradient. The MAP estimate apparently goes to zero when alpha<1 but the Dirichlet prior has infinite density at zero so the optimizer fails.

Note that sampling still works even if the maximum doesn’t exist.

Yes, it does seem to be related to the gradient at zero. I think this solves the issue and I should avoid these priors. However, I am still confused why it works in R as bbbales2 suggests.

On sampling, you are correct that sampling does not fail. But I did try that and it looked like it never finished, even for 100 samples.

Note that @bbbales2’s code says model$sample. I doubt optimizing would have worked.

My laptop running this

with open("data.txt") as f:
    data=json.load(f)
data['alpha'] = [0.9,6]
fit = model.sampling(data=data)
print(fit)

Stan reports

 Elapsed Time: 44.7768 seconds (Warm-up)
               44.8073 seconds (Sampling)
               89.5841 seconds (Total)

Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 47.0158 seconds (Warm-up)
               44.654 seconds (Sampling)
               91.6698 seconds (Total)

Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 46.9109 seconds (Warm-up)
               44.1761 seconds (Sampling)
               91.087 seconds (Total)

Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 47.7848 seconds (Warm-up)
               43.9007 seconds (Sampling)
               91.6856 seconds (Total)

and the printed results are

Inference for Stan model: anon_model_1403a019614079f6ac040b8ba87acbcd.
4 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

            mean se_mean     sd   2.5%    25%    50%    75%  97.5%  n_eff   Rhat
brk[1,1]    0.62  1.5e-3   0.13   0.35   0.54   0.63   0.71   0.83   7088    1.0
brk[2,1]    0.53  1.6e-3   0.14   0.23   0.44   0.54   0.63   0.78   7827    1.0
brk[3,1]    0.56  1.5e-3   0.13   0.27   0.47   0.57   0.65   0.78   7343    1.0
brk[4,1]    0.51  1.6e-3   0.14   0.22   0.42   0.52   0.61   0.76   7216    1.0
brk[5,1]    0.66  1.3e-3   0.12    0.4   0.58   0.67   0.75   0.86   7996    1.0
brk[6,1]    0.56  1.6e-3   0.14   0.27   0.47   0.57   0.65    0.8   6942    1.0
brk[7,1]    0.66  1.2e-3   0.12   0.39   0.58   0.67   0.74   0.86   9616    1.0
brk[8,1]    0.56  1.5e-3   0.14   0.28   0.47   0.57   0.65   0.79   8343    1.0
brk[9,1]    0.62  1.4e-3   0.13   0.35   0.54   0.63   0.71   0.84   8284    1.0
brk[10,1]   0.58  1.8e-3   0.14   0.29    0.5   0.59   0.68   0.82   5972    1.0
brk[11,1]   0.69  1.3e-3   0.11   0.44   0.62    0.7   0.77   0.88   7290    1.0
brk[12,1]   0.55  1.6e-3   0.14   0.25   0.47   0.56   0.66    0.8   7345    1.0
brk[13,1]   0.54  1.5e-3   0.13   0.25   0.46   0.55   0.63   0.78   8338    1.0
brk[14,1]   0.57  1.5e-3   0.13   0.28   0.48   0.58   0.66    0.8   7591    1.0
brk[15,1]   0.63  1.3e-3   0.13   0.35   0.55   0.64   0.72   0.85   9096    1.0
brk[16,1]   0.48  1.9e-3   0.14   0.17   0.39   0.49   0.58   0.73   5522    1.0
brk[17,1]   0.65  1.4e-3   0.12   0.39   0.57   0.66   0.73   0.85   7401    1.0
brk[18,1]    0.6  1.6e-3   0.13   0.31   0.52   0.61   0.69   0.83   7077    1.0
brk[19,1]   0.57  1.4e-3   0.13   0.29   0.48   0.58   0.66   0.79   7971    1.0
brk[20,1]   0.65  1.3e-3   0.12   0.38   0.57   0.66   0.73   0.85   8833    1.0
brk[21,1]   0.63  1.4e-3   0.13   0.35   0.55   0.64   0.72   0.84   8331    1.0
brk[22,1]   0.63  1.3e-3   0.13   0.34   0.55   0.64   0.72   0.85   9614    1.0
brk[23,1]   0.65  1.4e-3   0.12   0.38   0.57   0.66   0.74   0.86   8293    1.0
brk[24,1]   0.65  1.3e-3   0.12   0.39   0.58   0.67   0.74   0.86   8908    1.0
brk[25,1]   0.58  1.5e-3   0.13   0.29   0.49   0.59   0.68   0.81   7523    1.0
brk[26,1]   0.68  1.3e-3   0.11   0.44   0.61   0.69   0.76   0.87   7495    1.0
brk[27,1]   0.49  2.2e-3   0.15   0.18   0.41    0.5    0.6   0.76   4586    1.0
brk[28,1]   0.57  1.6e-3   0.13   0.28   0.49   0.57   0.66    0.8   7067    1.0
brk[29,1]   0.58  1.7e-3   0.14   0.26   0.49   0.59   0.68   0.82   6552    1.0
brk[30,1]   0.33  2.2e-3   0.15   0.04   0.22   0.32   0.43   0.61   4404    1.0
brk[31,1]   0.07  7.6e-4   0.07 1.4e-3   0.02   0.05    0.1   0.24   7422    1.0
brk[32,1]   0.09  9.4e-4   0.08 2.6e-3   0.03   0.07   0.13   0.29   6777    1.0
brk[33,1]   0.06  6.8e-4   0.06 1.1e-3   0.02   0.04   0.09   0.22   7393    1.0
brk[34,1]   0.05  5.9e-4   0.05 1.0e-3   0.01   0.04   0.07   0.18   6869    1.0
brk[35,1]   0.05  5.7e-4   0.05 8.1e-4   0.01   0.03   0.07   0.17   6429    1.0
brk[36,1]   0.07  7.6e-4   0.07 1.7e-3   0.02   0.05    0.1   0.24   7693    1.0
brk[37,1]   0.09  8.9e-4   0.08 2.0e-3   0.03   0.06   0.13   0.28   7588    1.0
brk[38,1]   0.06  6.9e-4   0.06 1.6e-3   0.02   0.05   0.09   0.21   6888    1.0
brk[39,1]   0.13  1.3e-3    0.1 4.0e-3   0.05   0.11    0.2   0.38   5906    1.0
brk[40,1]   0.06  6.6e-4   0.06 1.3e-3   0.02   0.04   0.09   0.22   7508    1.0
brk[41,1]   0.06  7.0e-4   0.06 1.2e-3   0.02   0.05   0.09   0.22   7334    1.0
brk[42,1]    0.1  9.9e-4   0.08 2.6e-3   0.03   0.08   0.15   0.31   7208    1.0
brk[43,1]   0.07  7.7e-4   0.06 1.5e-3   0.02   0.05    0.1   0.23   6766    1.0
brk[44,1]   0.11  1.1e-3   0.09 2.8e-3   0.04   0.09   0.17   0.35   7210    1.0
brk[45,1]   0.08  8.4e-4   0.07 1.3e-3   0.02   0.05   0.11   0.26   7043    1.0
brk[46,1]    0.1  9.9e-4   0.08 2.8e-3   0.04   0.08   0.15   0.31   7339    1.0
brk[47,1]   0.09  9.0e-4   0.08 2.4e-3   0.03   0.07   0.13   0.29   7548    1.0
brk[48,1]    0.1  9.7e-4   0.08 2.2e-3   0.03   0.08   0.14   0.31   7581    1.0
brk[49,1]   0.09  8.8e-4   0.08 1.7e-3   0.03   0.07   0.12   0.28   7252    1.0
brk[50,1]   0.07  7.7e-4   0.06 1.4e-3   0.02   0.05    0.1   0.23   6767    1.0
brk[51,1]   0.06  7.2e-4   0.06 1.0e-3   0.02   0.04   0.08    0.2   5972    1.0
brk[52,1]   0.16  1.4e-3   0.12 5.3e-3   0.07   0.14   0.23   0.42   6433    1.0
brk[53,1]   0.07  7.2e-4   0.07 1.8e-3   0.02   0.06   0.11   0.24   8163    1.0
brk[54,1]   0.07  7.4e-4   0.06 1.8e-3   0.02   0.05   0.09   0.23   6724    1.0
brk[55,1]   0.06  6.3e-4   0.05 1.6e-3   0.02   0.04   0.08   0.19   6599    1.0
brk[56,1]   0.07  8.8e-4   0.07 1.2e-3   0.02   0.05   0.11   0.25   6228    1.0
brk[57,1]    0.1  9.9e-4   0.08 2.6e-3   0.03   0.08   0.14    0.3   6775    1.0
brk[58,1]   0.09  9.3e-4   0.08 2.4e-3   0.03   0.07   0.13   0.29   7067    1.0
brk[59,1]   0.11  1.2e-3   0.09 2.8e-3   0.04   0.09   0.17   0.34   6286    1.0
brk[60,1]   0.07  8.4e-4   0.07 1.5e-3   0.02   0.05   0.11   0.25   6371    1.0
brk[1,2]    0.38  1.5e-3   0.13   0.17   0.29   0.37   0.46   0.65   7088    1.0
brk[2,2]    0.47  1.6e-3   0.14   0.22   0.37   0.46   0.56   0.77   7827    1.0
brk[3,2]    0.44  1.5e-3   0.13   0.22   0.35   0.43   0.53   0.73   7343    1.0
brk[4,2]    0.49  1.6e-3   0.14   0.24   0.39   0.48   0.58   0.78   7216    1.0
brk[5,2]    0.34  1.3e-3   0.12   0.14   0.25   0.33   0.42    0.6   7996    1.0
brk[6,2]    0.44  1.6e-3   0.14    0.2   0.35   0.43   0.53   0.73   6942    1.0
brk[7,2]    0.34  1.2e-3   0.12   0.14   0.26   0.33   0.42   0.61   9616    1.0
brk[8,2]    0.44  1.5e-3   0.14   0.21   0.35   0.43   0.53   0.72   8343    1.0
brk[9,2]    0.38  1.4e-3   0.13   0.16   0.29   0.37   0.46   0.65   8284    1.0
brk[10,2]   0.42  1.8e-3   0.14   0.18   0.32   0.41    0.5   0.71   5972    1.0
brk[11,2]   0.31  1.3e-3   0.11   0.12   0.23    0.3   0.38   0.56   7290    1.0
brk[12,2]   0.45  1.6e-3   0.14    0.2   0.34   0.44   0.53   0.75   7345    1.0
brk[13,2]   0.46  1.5e-3   0.13   0.22   0.37   0.45   0.54   0.75   8338    1.0
brk[14,2]   0.43  1.5e-3   0.13    0.2   0.34   0.42   0.52   0.72   7591    1.0
brk[15,2]   0.37  1.3e-3   0.13   0.15   0.28   0.36   0.45   0.65   9096    1.0
brk[16,2]   0.52  1.9e-3   0.14   0.27   0.42   0.51   0.61   0.83   5522    1.0
brk[17,2]   0.35  1.4e-3   0.12   0.15   0.27   0.34   0.43   0.61   7401    1.0
brk[18,2]    0.4  1.6e-3   0.13   0.17   0.31   0.39   0.48   0.69   7077    1.0
brk[19,2]   0.43  1.4e-3   0.13   0.21   0.34   0.42   0.52   0.71   7971    1.0
brk[20,2]   0.35  1.3e-3   0.12   0.15   0.27   0.34   0.43   0.62   8833    1.0
brk[21,2]   0.37  1.4e-3   0.13   0.16   0.28   0.36   0.45   0.65   8331    1.0
brk[22,2]   0.37  1.3e-3   0.13   0.15   0.28   0.36   0.45   0.66   9614    1.0
brk[23,2]   0.35  1.4e-3   0.12   0.14   0.26   0.34   0.43   0.62   8293    1.0
brk[24,2]   0.35  1.3e-3   0.12   0.14   0.26   0.33   0.42   0.61   8908    1.0
brk[25,2]   0.42  1.5e-3   0.13   0.19   0.32   0.41   0.51   0.71   7523    1.0
brk[26,2]   0.32  1.3e-3   0.11   0.13   0.24   0.31   0.39   0.56   7495    1.0
brk[27,2]   0.51  2.2e-3   0.15   0.24    0.4    0.5   0.59   0.82   4586    1.0
brk[28,2]   0.43  1.6e-3   0.13    0.2   0.34   0.43   0.51   0.72   7067    1.0
brk[29,2]   0.42  1.7e-3   0.14   0.18   0.32   0.41   0.51   0.74   6552    1.0
brk[30,2]   0.67  2.2e-3   0.15   0.39   0.57   0.68   0.78   0.96   4404    1.0
brk[31,2]   0.93  7.6e-4   0.07   0.76    0.9   0.95   0.98    1.0   7422    1.0
brk[32,2]   0.91  9.4e-4   0.08   0.71   0.87   0.93   0.97    1.0   6777    1.0
brk[33,2]   0.94  6.8e-4   0.06   0.78   0.91   0.96   0.98    1.0   7393    1.0
brk[34,2]   0.95  5.9e-4   0.05   0.82   0.93   0.96   0.99    1.0   6869    1.0
brk[35,2]   0.95  5.7e-4   0.05   0.83   0.93   0.97   0.99    1.0   6429    1.0
brk[36,2]   0.93  7.6e-4   0.07   0.76    0.9   0.95   0.98    1.0   7693    1.0
brk[37,2]   0.91  8.9e-4   0.08   0.72   0.87   0.94   0.97    1.0   7588    1.0
brk[38,2]   0.94  6.9e-4   0.06   0.79   0.91   0.95   0.98    1.0   6888    1.0
brk[39,2]   0.87  1.3e-3    0.1   0.62    0.8   0.89   0.95    1.0   5906    1.0
brk[40,2]   0.94  6.6e-4   0.06   0.78   0.91   0.96   0.98    1.0   7508    1.0
brk[41,2]   0.94  7.0e-4   0.06   0.78   0.91   0.95   0.98    1.0   7334    1.0
brk[42,2]    0.9  9.9e-4   0.08   0.69   0.85   0.92   0.97    1.0   7208    1.0
brk[43,2]   0.93  7.7e-4   0.06   0.77    0.9   0.95   0.98    1.0   6766    1.0
brk[44,2]   0.89  1.1e-3   0.09   0.65   0.83   0.91   0.96    1.0   7210    1.0
brk[45,2]   0.92  8.4e-4   0.07   0.74   0.89   0.95   0.98    1.0   7043    1.0
brk[46,2]    0.9  9.9e-4   0.08   0.69   0.85   0.92   0.96    1.0   7339    1.0
brk[47,2]   0.91  9.0e-4   0.08   0.71   0.87   0.93   0.97    1.0   7548    1.0
brk[48,2]    0.9  9.7e-4   0.08   0.69   0.86   0.92   0.97    1.0   7581    1.0
brk[49,2]   0.91  8.8e-4   0.08   0.72   0.88   0.93   0.97    1.0   7252    1.0
brk[50,2]   0.93  7.7e-4   0.06   0.77    0.9   0.95   0.98    1.0   6767    1.0
brk[51,2]   0.94  7.2e-4   0.06    0.8   0.92   0.96   0.98    1.0   5972    1.0
brk[52,2]   0.84  1.4e-3   0.12   0.58   0.77   0.86   0.93   0.99   6433    1.0
brk[53,2]   0.93  7.2e-4   0.07   0.76   0.89   0.94   0.98    1.0   8163    1.0
brk[54,2]   0.93  7.4e-4   0.06   0.77   0.91   0.95   0.98    1.0   6724    1.0
brk[55,2]   0.94  6.3e-4   0.05   0.81   0.92   0.96   0.98    1.0   6599    1.0
brk[56,2]   0.93  8.8e-4   0.07   0.75   0.89   0.95   0.98    1.0   6228    1.0
brk[57,2]    0.9  9.9e-4   0.08    0.7   0.86   0.92   0.97    1.0   6775    1.0
brk[58,2]   0.91  9.3e-4   0.08   0.71   0.87   0.93   0.97    1.0   7067    1.0
brk[59,2]   0.89  1.2e-3   0.09   0.66   0.83   0.91   0.96    1.0   6286    1.0
brk[60,2]   0.93  8.4e-4   0.07   0.75   0.89   0.95   0.98    1.0   6371    1.0
lp__       -1207    0.15   5.97  -1220  -1211  -1207  -1203  -1197   1604    1.0

Samples were drawn using NUTS at Wed Jun  3 18:52:56 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

You are right, it does finish with [0.9, 6]. But it seems it does not for [0.1,6], at least for the time I let it run (5000 samples). Again, I think this has to do with the gradient being too sharp for [0.1, 6]. I think this again proves the point that if there is a computational problem, there is a modeling problem. This is happening because the choice of prior is inappropriate. Thanks for help