Partially missing data Rhat warning

I’m new to Stan (and to Bayesian analysis in general) and am trying to build a model with some partially missing data. I’m basically following the guide: I create a matrix in the transformed parameters block that combines the known values with parameters for the unknown values. However, I get lots of warnings indicating that Rhat is NaN. I read in this thread that values that don’t change across chains (or something to that effect) can cause this, which would make sense given that many of the values are known and therefore fixed. Is it safe to assume that I can ignore these warnings in this case?

Regardless of whether it is ok to ignore the warnings, this also causes a problem when trying to plot the fit (in pystan), because of a singular matrix error, which I am assuming is caused by the fixed values in the transformed parameters. Any advice on this?

Yes, if the NaNs correspond to the fixed elements of the larger matrix. I would tend to declare and define the larger matrix in the model block, in which case it won’t show up in the output.
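To see why a fixed element produces a NaN, here is a minimal sketch of the classic (non-split) Rhat formula in plain Python. It is an illustration only, not Stan's or PyStan's actual implementation, and the names `rhat` and `is_fixed` are hypothetical. When every draw of a quantity is identical, both the within-chain and between-chain variances are zero, so the ratio is 0/0.

```python
import math
from statistics import mean, variance


def rhat(chains):
    """Classic potential scale reduction for equal-length chains (sketch)."""
    n = len(chains[0])                             # draws per chain
    chain_means = [mean(c) for c in chains]
    between = n * variance(chain_means)            # B: between-chain variance
    within = mean([variance(c) for c in chains])   # W: mean within-chain variance
    if within == 0.0:
        # Plain Python raises on division by zero, so return what an
        # array library would report: NaN for 0/0, inf otherwise.
        return math.nan if between == 0.0 else math.inf
    var_plus = (n - 1) / n * within + between / n
    return math.sqrt(var_plus / within)


def is_fixed(chains):
    """True if every draw in every chain has the same value."""
    return len({x for c in chains for x in c}) == 1


# A known (fixed) matrix element: identical in every draw of every chain.
fixed = [[2.5, 2.5, 2.5, 2.5], [2.5, 2.5, 2.5, 2.5]]
# A genuinely sampled element (made-up numbers for illustration).
free = [[0.1, 0.4, -0.2, 0.3], [0.0, 0.5, -0.1, 0.2]]

print(rhat(fixed), is_fixed(fixed))
print(rhat(free), is_fixed(free))
```

A NaN that comes with `is_fixed` being true is the harmless case described above. A NaN on a quantity that is supposed to vary would deserve a closer look.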

I doubt it, but we need to see the error message and ideally the code.

Without seeing the code, I assume that the singular matrix error comes from the KDE code, which breaks if there is only one unique value.
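The failure mode can be sketched in plain Python. A one-dimensional Gaussian KDE needs the inverse of the (scaled) sample variance, and that variance is zero when all draws are identical. The code below is a simplified stand-in for what a KDE routine does, not the SciPy source, and the names `kde_precision` and `plottable` are hypothetical.

```python
from statistics import variance


def kde_precision(draws):
    """Inverse of the bandwidth-scaled variance a 1-D Gaussian KDE needs."""
    n = len(draws)
    scott = n ** (-1.0 / 5.0)            # Scott's rule factor in one dimension
    cov = variance(draws) * scott ** 2   # scalar stand-in for the covariance matrix
    if cov == 0.0:
        raise ValueError("singular covariance: all draws are identical")
    return 1.0 / cov


def plottable(draws_by_name):
    """Names of quantities that have more than one unique draw."""
    return [name for name, d in draws_by_name.items() if len(set(d)) > 1]


# Made-up draws: one fixed matrix element and one sampled element.
draws = {
    "x[1,1]": [3.0, 3.0, 3.0, 3.0],
    "x[1,2]": [0.1, 0.4, 0.2, 0.3],
}

print(plottable(draws))
```

Filtering out constant quantities before plotting, or keeping them out of the output altogether, avoids handing a zero-variance series to the density estimator.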

ArviZ should have that fixed.

Hi Reid,

Although I can’t offer much help with this, I am also trying to build a model that deals with some missing data, and I have not yet gotten as far as you have. I’ll keep an eye on this thread in case you post the model. If you are willing to share it, I would be very interested in seeing how you got it running.

Good luck!

-Alex

Hi, thanks for the help. Moving the combined matrix to the model block solves both problems. For reference, here is the stack trace from when the combined matrix was defined in the transformed parameters block:

2018-12-29 11:13:15,464 [pystan]:16 WARNING Deprecation warning. In future, use ArviZ library (`pip install arviz`)
Traceback (most recent call last):
  File "/home/reid/git/worthy/spa/cli/stan/model.py", line 246, in <module>
    train(cli.parse_args())
  File "/home/reid/git/worthy/spa/cli/stan/model.py", line 223, in train
    fit.plot()
  File "stanfit4anon_model_d46feaaf38d27c2a4f036c4e851e48e4_2212204427526507839.pyx", line 518, in stanfit4anon_model_d46feaaf38d27c2a4f036c4e851e48e4_2212204427526507839.StanFit4Model.plot
  File "/home/reid/git/worthy/.venv/lib64/python3.6/site-packages/pystan/plots.py", line 27, in traceplot
    return plots.traceplot(values, pars, **kwargs)
  File "/home/reid/git/worthy/.venv/lib64/python3.6/site-packages/pystan/external/pymc/plots.py", line 41, in traceplot
    kdeplot_op(ax[i, 0], d)
  File "/home/reid/git/worthy/.venv/lib64/python3.6/site-packages/pystan/external/pymc/plots.py", line 55, in kdeplot_op
    density = kde.gaussian_kde(d)
  File "/home/reid/git/worthy/.venv/lib64/python3.6/site-packages/scipy/stats/kde.py", line 172, in __init__
    self.set_bandwidth(bw_method=bw_method)
  File "/home/reid/git/worthy/.venv/lib64/python3.6/site-packages/scipy/stats/kde.py", line 499, in set_bandwidth
    self._compute_covariance()
  File "/home/reid/git/worthy/.venv/lib64/python3.6/site-packages/scipy/stats/kde.py", line 510, in _compute_covariance
    self._data_inv_cov = linalg.inv(self._data_covariance)
  File "/home/reid/git/worthy/.venv/lib64/python3.6/site-packages/scipy/linalg/basic.py", line 975, in inv
    raise LinAlgError("singular matrix")
numpy.linalg.linalg.LinAlgError: singular matrix

I tried installing ArviZ, but I still get the same error if I leave the combined matrix in the transformed parameters block.

Edit: I installed ArviZ through pip, but I still get the deprecation warning, so I’m not sure about the previous statement anymore.

Hi Alex,

I don’t think my model is much to learn from, but you might find these resources useful, which walk through some of the details and helped me.

  1. The section on partially known parameters from the Stan Guide.
  2. This blog post.

Use

import arviz as az
az.plot_trace(fit)

`fit.plot()` uses the old PyMC plotting code bundled with PyStan.