RuntimeError: Something went wrong after call_sampler

Eaman · June 2, 2020, 5:14pm

Hi,
I am new to Stan, attempting to fit SBMs. I have successfully managed to do so for simple problems, but once I specify priors for some parameters the model fails. I am using pystan, you can see the model code below:

data {
  int<lower=1> K;
  int<lower=2> N;
  int<lower=0, upper=1> Adj[N,N];
  vector<lower=0>[K] alpha;
  vector<lower=0, upper=1>[K*(K+1)/2] pref_prob;
}

parameters {
  simplex[K] brk[N];
}

model {
  for (w in 1:N)
    brk[w] ~ dirichlet(alpha);
  
  for (u in 1:(N-1)) {
    for (v in (u+1):N) {
      real p = 0;
      for (i in 1:K) {
        for (j in 1:i) {
          p += brk[u][i]*brk[v][j]*pref_prob[(i*(i-1)/2)+j];
        }
      }
      for (i in 1:(K-1)) {
        for (j in (i+1):K) {
          p += brk[u][i]*brk[v][j]*pref_prob[(j*(j-1)/2)+i];
        }
      }
      
      Adj[u][v] ~ bernoulli(p);
    } 
  }
}
"""

The above model fails when I call pystan optimizing() to get the MAP, with the error below. However, when I remove the prior on brk above (the dirichlet distribution), it works fine. GCC version is 8.3.0.

RuntimeError Traceback (most recent call last)
in
4 #fit = sm3.optimizing(data=data3)
5 #fit = sm4.optimizing(data=data4, verbose=True)
----> 6 fit = sm5.optimizing(data=data5, verbose=True)

~/python/lib/python3.8/site-packages/pystan/model.py in optimizing(self, data, seed, init, sample_file, algorithm, verbose, as_vector, **kwargs)
579 stan_args = pystan.misc._get_valid_stan_args(stan_args)
580
–> 581 ret, sample = fit._call_sampler(stan_args)
582 pars = pystan.misc._par_vector2dict(sample[‘par’], m_pars, p_dims)
583 if not as_vector:

stanfit4anon_model_bcae1a414bf41039bdb99be5c39eead6_3996171161941223032.pyx in stanfit4anon_model_bcae1a414bf41039bdb99be5c39eead6_3996171161941223032.StanFit4Model._call_sampler()

stanfit4anon_model_bcae1a414bf41039bdb99be5c39eead6_3996171161941223032.pyx in stanfit4anon_model_bcae1a414bf41039bdb99be5c39eead6_3996171161941223032._call_sampler()

RuntimeError: Something went wrong after call_sampler.

bbbales2 · June 3, 2020, 1:00am

It’s the lines here:

for (w in 1:N)
    brk[w] ~ dirichlet(alpha);

Right?

Can you save the input you pass to PyStan as a .json and post it here? Then I can take your model and try to reproduce the problem.

Eaman · June 3, 2020, 3:38am

Yes, that’s it. Here is the full code, I also attached the data dictionary in a json.

import pystan
import igraph
import collections
import numpy as np
import json

model = """
data {
  int<lower=1> K;
  int<lower=2> N;
  int<lower=0, upper=1> Adj[N,N];
  vector<lower=0>[K] alpha;
  vector<lower=0, upper=1>[K*(K+1)/2] pref_prob;
}

parameters {
  simplex[K] brk[N];
}

model {
  for (w in 1:N)
    brk[w] ~ dirichlet(alpha);
  
  for (u in 1:(N-1)) {
    for (v in (u+1):N) {
      real p = 0;
      for (i in 1:K) {
        for (j in 1:i) {
          p += brk[u][i]*brk[v][j]*pref_prob[(i*(i-1)/2)+j];
        }
      }
      for (i in 1:(K-1)) {
        for (j in (i+1):K) {
          p += brk[u][i]*brk[v][j]*pref_prob[(j*(j-1)/2)+i];
        }
      }
      
      Adj[u][v] ~ bernoulli(p);
    } 
  }
}
"""

block_sizes = [30, 30]
pref_matrix=[[0.3,0.05], [0.05,0.6]]
network = igraph.Graph.SBM(n=np.sum(block_sizes),
                           pref_matrix=pref_matrix,
                           block_sizes=block_sizes,
                           directed=False, loops=False)
types = [1, 2]
type_list = []
for i, block_size in enumerate(block_sizes):
  type_list.extend([types[i]]*block_size)
network.vs['type'] = types
print(collections.Counter(network.vs['type']))

sm = pystan.StanModel(model_code=model)
data = {'K': len(set(network.vs['type'])),
        'N': len(network.vs),
        'Adj': network.get_adjacency().data,
        'pref_prob': [pref_matrix[0][0], pref_matrix[0][1], pref_matrix[1][1]],
        'alpha': [0.9,6]
       }
fit = sm.optimizing(data=data)

It seems the fit fails only for a specific set of alpha parameters. For example [0.9, 6] fails but [1,6] does not.
data.txt (10.7 KB)

bbbales2 · June 3, 2020, 1:56pm

Is the data file you attached one for which it fails? This is running for me.

My script is something like:

library(cmdstanr)

model = cmdstan_model("run.stan")

fit = model$sample(data = "data.json", chains = 4, cores = 4)

If it’s not working, maybe try the cmdstanpy interface and see if that works.

nhuurre · June 3, 2020, 3:12pm

Optimization can fail because a global maximum does not always exist or it is at a boundary. This is something to consider when using Dirichlet priors. The Dirichlet distribution is zero-avoiding when alpha > 1, flat when alpha=1 and has a infinitely sharp peak at zero when alpha < 1. Your model works when alpha > 1 and there are several values in the brk array whose optimum behaves like this:

brk-opt

The optimization sometimes works (or at least returns a result) with alpha=0.9999 but often fails with something about non-finite gradient. The MAP estimate apparently goes to zero when alpha<1 but the Dirichlet prior has infinite density at zero so the optimizer fails.

Note that sampling still works even if the maximum doesn’t exist.

Eaman · June 3, 2020, 3:43pm

Yes, it does seem to be related to the gradient at zero. I think this solves the issue and I should avoid these priors. However, I am still confused why it works in R as bbbales2 suggests.

On sampling, you are correct that sampling does not fail. But I did try that and it looked like it never finished, even for 100 samples.

nhuurre · June 3, 2020, 3:56pm

Note that @bbbales2’s code says model$sample. I doubt optimizing would have worked.

My laptop running this

with open("data.txt") as f:
    data=json.load(f)
data['alpha'] = [0.9,6]
fit = model.sampling(data=data)
print(fit)

Stan reports

 Elapsed Time: 44.7768 seconds (Warm-up)
               44.8073 seconds (Sampling)
               89.5841 seconds (Total)

Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 47.0158 seconds (Warm-up)
               44.654 seconds (Sampling)
               91.6698 seconds (Total)

Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 46.9109 seconds (Warm-up)
               44.1761 seconds (Sampling)
               91.087 seconds (Total)

Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 47.7848 seconds (Warm-up)
               43.9007 seconds (Sampling)
               91.6856 seconds (Total)

and the printed results are

Inference for Stan model: anon_model_1403a019614079f6ac040b8ba87acbcd.
4 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

            mean se_mean     sd   2.5%    25%    50%    75%  97.5%  n_eff   Rhat
brk[1,1]    0.62  1.5e-3   0.13   0.35   0.54   0.63   0.71   0.83   7088    1.0
brk[2,1]    0.53  1.6e-3   0.14   0.23   0.44   0.54   0.63   0.78   7827    1.0
brk[3,1]    0.56  1.5e-3   0.13   0.27   0.47   0.57   0.65   0.78   7343    1.0
brk[4,1]    0.51  1.6e-3   0.14   0.22   0.42   0.52   0.61   0.76   7216    1.0
brk[5,1]    0.66  1.3e-3   0.12    0.4   0.58   0.67   0.75   0.86   7996    1.0
brk[6,1]    0.56  1.6e-3   0.14   0.27   0.47   0.57   0.65    0.8   6942    1.0
brk[7,1]    0.66  1.2e-3   0.12   0.39   0.58   0.67   0.74   0.86   9616    1.0
brk[8,1]    0.56  1.5e-3   0.14   0.28   0.47   0.57   0.65   0.79   8343    1.0
brk[9,1]    0.62  1.4e-3   0.13   0.35   0.54   0.63   0.71   0.84   8284    1.0
brk[10,1]   0.58  1.8e-3   0.14   0.29    0.5   0.59   0.68   0.82   5972    1.0
brk[11,1]   0.69  1.3e-3   0.11   0.44   0.62    0.7   0.77   0.88   7290    1.0
brk[12,1]   0.55  1.6e-3   0.14   0.25   0.47   0.56   0.66    0.8   7345    1.0
brk[13,1]   0.54  1.5e-3   0.13   0.25   0.46   0.55   0.63   0.78   8338    1.0
brk[14,1]   0.57  1.5e-3   0.13   0.28   0.48   0.58   0.66    0.8   7591    1.0
brk[15,1]   0.63  1.3e-3   0.13   0.35   0.55   0.64   0.72   0.85   9096    1.0
brk[16,1]   0.48  1.9e-3   0.14   0.17   0.39   0.49   0.58   0.73   5522    1.0
brk[17,1]   0.65  1.4e-3   0.12   0.39   0.57   0.66   0.73   0.85   7401    1.0
brk[18,1]    0.6  1.6e-3   0.13   0.31   0.52   0.61   0.69   0.83   7077    1.0
brk[19,1]   0.57  1.4e-3   0.13   0.29   0.48   0.58   0.66   0.79   7971    1.0
brk[20,1]   0.65  1.3e-3   0.12   0.38   0.57   0.66   0.73   0.85   8833    1.0
brk[21,1]   0.63  1.4e-3   0.13   0.35   0.55   0.64   0.72   0.84   8331    1.0
brk[22,1]   0.63  1.3e-3   0.13   0.34   0.55   0.64   0.72   0.85   9614    1.0
brk[23,1]   0.65  1.4e-3   0.12   0.38   0.57   0.66   0.74   0.86   8293    1.0
brk[24,1]   0.65  1.3e-3   0.12   0.39   0.58   0.67   0.74   0.86   8908    1.0
brk[25,1]   0.58  1.5e-3   0.13   0.29   0.49   0.59   0.68   0.81   7523    1.0
brk[26,1]   0.68  1.3e-3   0.11   0.44   0.61   0.69   0.76   0.87   7495    1.0
brk[27,1]   0.49  2.2e-3   0.15   0.18   0.41    0.5    0.6   0.76   4586    1.0
brk[28,1]   0.57  1.6e-3   0.13   0.28   0.49   0.57   0.66    0.8   7067    1.0
brk[29,1]   0.58  1.7e-3   0.14   0.26   0.49   0.59   0.68   0.82   6552    1.0
brk[30,1]   0.33  2.2e-3   0.15   0.04   0.22   0.32   0.43   0.61   4404    1.0
brk[31,1]   0.07  7.6e-4   0.07 1.4e-3   0.02   0.05    0.1   0.24   7422    1.0
brk[32,1]   0.09  9.4e-4   0.08 2.6e-3   0.03   0.07   0.13   0.29   6777    1.0
brk[33,1]   0.06  6.8e-4   0.06 1.1e-3   0.02   0.04   0.09   0.22   7393    1.0
brk[34,1]   0.05  5.9e-4   0.05 1.0e-3   0.01   0.04   0.07   0.18   6869    1.0
brk[35,1]   0.05  5.7e-4   0.05 8.1e-4   0.01   0.03   0.07   0.17   6429    1.0
brk[36,1]   0.07  7.6e-4   0.07 1.7e-3   0.02   0.05    0.1   0.24   7693    1.0
brk[37,1]   0.09  8.9e-4   0.08 2.0e-3   0.03   0.06   0.13   0.28   7588    1.0
brk[38,1]   0.06  6.9e-4   0.06 1.6e-3   0.02   0.05   0.09   0.21   6888    1.0
brk[39,1]   0.13  1.3e-3    0.1 4.0e-3   0.05   0.11    0.2   0.38   5906    1.0
brk[40,1]   0.06  6.6e-4   0.06 1.3e-3   0.02   0.04   0.09   0.22   7508    1.0
brk[41,1]   0.06  7.0e-4   0.06 1.2e-3   0.02   0.05   0.09   0.22   7334    1.0
brk[42,1]    0.1  9.9e-4   0.08 2.6e-3   0.03   0.08   0.15   0.31   7208    1.0
brk[43,1]   0.07  7.7e-4   0.06 1.5e-3   0.02   0.05    0.1   0.23   6766    1.0
brk[44,1]   0.11  1.1e-3   0.09 2.8e-3   0.04   0.09   0.17   0.35   7210    1.0
brk[45,1]   0.08  8.4e-4   0.07 1.3e-3   0.02   0.05   0.11   0.26   7043    1.0
brk[46,1]    0.1  9.9e-4   0.08 2.8e-3   0.04   0.08   0.15   0.31   7339    1.0
brk[47,1]   0.09  9.0e-4   0.08 2.4e-3   0.03   0.07   0.13   0.29   7548    1.0
brk[48,1]    0.1  9.7e-4   0.08 2.2e-3   0.03   0.08   0.14   0.31   7581    1.0
brk[49,1]   0.09  8.8e-4   0.08 1.7e-3   0.03   0.07   0.12   0.28   7252    1.0
brk[50,1]   0.07  7.7e-4   0.06 1.4e-3   0.02   0.05    0.1   0.23   6767    1.0
brk[51,1]   0.06  7.2e-4   0.06 1.0e-3   0.02   0.04   0.08    0.2   5972    1.0
brk[52,1]   0.16  1.4e-3   0.12 5.3e-3   0.07   0.14   0.23   0.42   6433    1.0
brk[53,1]   0.07  7.2e-4   0.07 1.8e-3   0.02   0.06   0.11   0.24   8163    1.0
brk[54,1]   0.07  7.4e-4   0.06 1.8e-3   0.02   0.05   0.09   0.23   6724    1.0
brk[55,1]   0.06  6.3e-4   0.05 1.6e-3   0.02   0.04   0.08   0.19   6599    1.0
brk[56,1]   0.07  8.8e-4   0.07 1.2e-3   0.02   0.05   0.11   0.25   6228    1.0
brk[57,1]    0.1  9.9e-4   0.08 2.6e-3   0.03   0.08   0.14    0.3   6775    1.0
brk[58,1]   0.09  9.3e-4   0.08 2.4e-3   0.03   0.07   0.13   0.29   7067    1.0
brk[59,1]   0.11  1.2e-3   0.09 2.8e-3   0.04   0.09   0.17   0.34   6286    1.0
brk[60,1]   0.07  8.4e-4   0.07 1.5e-3   0.02   0.05   0.11   0.25   6371    1.0
brk[1,2]    0.38  1.5e-3   0.13   0.17   0.29   0.37   0.46   0.65   7088    1.0
brk[2,2]    0.47  1.6e-3   0.14   0.22   0.37   0.46   0.56   0.77   7827    1.0
brk[3,2]    0.44  1.5e-3   0.13   0.22   0.35   0.43   0.53   0.73   7343    1.0
brk[4,2]    0.49  1.6e-3   0.14   0.24   0.39   0.48   0.58   0.78   7216    1.0
brk[5,2]    0.34  1.3e-3   0.12   0.14   0.25   0.33   0.42    0.6   7996    1.0
brk[6,2]    0.44  1.6e-3   0.14    0.2   0.35   0.43   0.53   0.73   6942    1.0
brk[7,2]    0.34  1.2e-3   0.12   0.14   0.26   0.33   0.42   0.61   9616    1.0
brk[8,2]    0.44  1.5e-3   0.14   0.21   0.35   0.43   0.53   0.72   8343    1.0
brk[9,2]    0.38  1.4e-3   0.13   0.16   0.29   0.37   0.46   0.65   8284    1.0
brk[10,2]   0.42  1.8e-3   0.14   0.18   0.32   0.41    0.5   0.71   5972    1.0
brk[11,2]   0.31  1.3e-3   0.11   0.12   0.23    0.3   0.38   0.56   7290    1.0
brk[12,2]   0.45  1.6e-3   0.14    0.2   0.34   0.44   0.53   0.75   7345    1.0
brk[13,2]   0.46  1.5e-3   0.13   0.22   0.37   0.45   0.54   0.75   8338    1.0
brk[14,2]   0.43  1.5e-3   0.13    0.2   0.34   0.42   0.52   0.72   7591    1.0
brk[15,2]   0.37  1.3e-3   0.13   0.15   0.28   0.36   0.45   0.65   9096    1.0
brk[16,2]   0.52  1.9e-3   0.14   0.27   0.42   0.51   0.61   0.83   5522    1.0
brk[17,2]   0.35  1.4e-3   0.12   0.15   0.27   0.34   0.43   0.61   7401    1.0
brk[18,2]    0.4  1.6e-3   0.13   0.17   0.31   0.39   0.48   0.69   7077    1.0
brk[19,2]   0.43  1.4e-3   0.13   0.21   0.34   0.42   0.52   0.71   7971    1.0
brk[20,2]   0.35  1.3e-3   0.12   0.15   0.27   0.34   0.43   0.62   8833    1.0
brk[21,2]   0.37  1.4e-3   0.13   0.16   0.28   0.36   0.45   0.65   8331    1.0
brk[22,2]   0.37  1.3e-3   0.13   0.15   0.28   0.36   0.45   0.66   9614    1.0
brk[23,2]   0.35  1.4e-3   0.12   0.14   0.26   0.34   0.43   0.62   8293    1.0
brk[24,2]   0.35  1.3e-3   0.12   0.14   0.26   0.33   0.42   0.61   8908    1.0
brk[25,2]   0.42  1.5e-3   0.13   0.19   0.32   0.41   0.51   0.71   7523    1.0
brk[26,2]   0.32  1.3e-3   0.11   0.13   0.24   0.31   0.39   0.56   7495    1.0
brk[27,2]   0.51  2.2e-3   0.15   0.24    0.4    0.5   0.59   0.82   4586    1.0
brk[28,2]   0.43  1.6e-3   0.13    0.2   0.34   0.43   0.51   0.72   7067    1.0
brk[29,2]   0.42  1.7e-3   0.14   0.18   0.32   0.41   0.51   0.74   6552    1.0
brk[30,2]   0.67  2.2e-3   0.15   0.39   0.57   0.68   0.78   0.96   4404    1.0
brk[31,2]   0.93  7.6e-4   0.07   0.76    0.9   0.95   0.98    1.0   7422    1.0
brk[32,2]   0.91  9.4e-4   0.08   0.71   0.87   0.93   0.97    1.0   6777    1.0
brk[33,2]   0.94  6.8e-4   0.06   0.78   0.91   0.96   0.98    1.0   7393    1.0
brk[34,2]   0.95  5.9e-4   0.05   0.82   0.93   0.96   0.99    1.0   6869    1.0
brk[35,2]   0.95  5.7e-4   0.05   0.83   0.93   0.97   0.99    1.0   6429    1.0
brk[36,2]   0.93  7.6e-4   0.07   0.76    0.9   0.95   0.98    1.0   7693    1.0
brk[37,2]   0.91  8.9e-4   0.08   0.72   0.87   0.94   0.97    1.0   7588    1.0
brk[38,2]   0.94  6.9e-4   0.06   0.79   0.91   0.95   0.98    1.0   6888    1.0
brk[39,2]   0.87  1.3e-3    0.1   0.62    0.8   0.89   0.95    1.0   5906    1.0
brk[40,2]   0.94  6.6e-4   0.06   0.78   0.91   0.96   0.98    1.0   7508    1.0
brk[41,2]   0.94  7.0e-4   0.06   0.78   0.91   0.95   0.98    1.0   7334    1.0
brk[42,2]    0.9  9.9e-4   0.08   0.69   0.85   0.92   0.97    1.0   7208    1.0
brk[43,2]   0.93  7.7e-4   0.06   0.77    0.9   0.95   0.98    1.0   6766    1.0
brk[44,2]   0.89  1.1e-3   0.09   0.65   0.83   0.91   0.96    1.0   7210    1.0
brk[45,2]   0.92  8.4e-4   0.07   0.74   0.89   0.95   0.98    1.0   7043    1.0
brk[46,2]    0.9  9.9e-4   0.08   0.69   0.85   0.92   0.96    1.0   7339    1.0
brk[47,2]   0.91  9.0e-4   0.08   0.71   0.87   0.93   0.97    1.0   7548    1.0
brk[48,2]    0.9  9.7e-4   0.08   0.69   0.86   0.92   0.97    1.0   7581    1.0
brk[49,2]   0.91  8.8e-4   0.08   0.72   0.88   0.93   0.97    1.0   7252    1.0
brk[50,2]   0.93  7.7e-4   0.06   0.77    0.9   0.95   0.98    1.0   6767    1.0
brk[51,2]   0.94  7.2e-4   0.06    0.8   0.92   0.96   0.98    1.0   5972    1.0
brk[52,2]   0.84  1.4e-3   0.12   0.58   0.77   0.86   0.93   0.99   6433    1.0
brk[53,2]   0.93  7.2e-4   0.07   0.76   0.89   0.94   0.98    1.0   8163    1.0
brk[54,2]   0.93  7.4e-4   0.06   0.77   0.91   0.95   0.98    1.0   6724    1.0
brk[55,2]   0.94  6.3e-4   0.05   0.81   0.92   0.96   0.98    1.0   6599    1.0
brk[56,2]   0.93  8.8e-4   0.07   0.75   0.89   0.95   0.98    1.0   6228    1.0
brk[57,2]    0.9  9.9e-4   0.08    0.7   0.86   0.92   0.97    1.0   6775    1.0
brk[58,2]   0.91  9.3e-4   0.08   0.71   0.87   0.93   0.97    1.0   7067    1.0
brk[59,2]   0.89  1.2e-3   0.09   0.66   0.83   0.91   0.96    1.0   6286    1.0
brk[60,2]   0.93  8.4e-4   0.07   0.75   0.89   0.95   0.98    1.0   6371    1.0
lp__       -1207    0.15   5.97  -1220  -1211  -1207  -1203  -1197   1604    1.0

Samples were drawn using NUTS at Wed Jun  3 18:52:56 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

Eaman · June 3, 2020, 5:55pm

You are right, it does finish with [0.9, 6]. But it seems it does not for [0.1,6], at least for the time I let it run (5000 samples). Again, I think this has to do with the gradient being too sharp for [0.1, 6]. I think this again proves the point that if there is a computational problem, there is a modeling problem. This is happening because the choice of prior is inappropriate. Thanks for help

Topic		Replies	Views
RuntimeError: Initialization failed Modeling	10	813	July 28, 2020
Issues with running survival model in Stan Modeling rstan , techniques , fitting-issues	4	469	May 24, 2021
pystan.StanModel RuntimeError: Initialization failed when trying to fit Modeling pystan , fitting-issues	2	887	May 4, 2020
Analyzing the posterior prediction samples Modeling	26	2033	October 27, 2021
Issue with Beta Priors in ODE model Modeling	2	656	February 14, 2020

RuntimeError: Something went wrong after call_sampler

Related topics