Trying to make a posterior predictive plot using az.from_cmdstanpy()

Hello, I am using CmdStanModel from Python.

The model ran well and I have all the summaries I need. Here is the code:

import logging
from cmdstanpy import CmdStanModel

LVR_model = CmdStanModel(stan_file=stan_file)
cmdstanpy_logger = logging.getLogger("cmdstanpy")
cmdstanpy_logger.disabled = True  # silence cmdstanpy log output
LVR_model.compile()

fit_1 = LVR_model.sample(data=LVR_data, chains=4, iter_sampling=200, iter_warmup=200, adapt_delta=0.80, max_treedepth=10)

Fit_Summary.loc[['beta_pop', 'beta_gdp', 'beta_rate', 'y_sigma', 'local_error_sigma', 'slope_error_sigma', 'season_error_sigma', 'nu_y', 'nu_slope', 'nu_trend', 'nu_season'], :]


So far so good, but when I try to run this:

idata_from_cmdstanpy = az.from_cmdstanpy(fit_1, posterior_predictive=["y_predict"])

I am getting the following error:


ValueError Traceback (most recent call last)
File ~\anaconda3\lib\site-packages\cmdstanpy\stanfit\mcmc.py:127, in CmdStanMCMC.__getattr__(self, attr)
126 try:
--> 127 return self.stan_variable(attr)
128 except ValueError as e:
129 # pylint: disable=raise-missing-from

File ~\anaconda3\lib\site-packages\cmdstanpy\stanfit\mcmc.py:733, in CmdStanMCMC.stan_variable(self, var, inc_warmup)
732 if var not in self._metadata.stan_vars_dims:
--> 733 raise ValueError(
734 f'Unknown variable name: {var}\n'
735 'Available variables are '
736 + ", ".join(self._metadata.stan_vars_dims)
737 )
738 if self._draws.shape == (0,):

ValueError: Unknown variable name: sample
Available variables are y_sigma, local_error_sigma, slope_error_sigma, season_error_sigma, season, nu, trend_slope, local_trend, nu_y, beta_pop, beta_gdp, beta_rate, nu_slope, nu_trend, nu_season, y_forecast, local_trend_forecast, trend_slope_forecast, season_forecast, y_predict, log_likelihood

During handling of the above exception, another exception occurred:

AttributeError Traceback (most recent call last)
Input In [57], in <cell line: 1>()
----> 1 idata_from_cmdstanpy = az.from_cmdstanpy(fit_1, posterior_predictive = ["y_predict"])

File ~\anaconda3\lib\site-packages\arviz\data\io_cmdstanpy.py:380, in from_cmdstanpy(posterior, posterior_predictive, predictions, prior, prior_predictive, observed_data, constant_data, predictions_constant_data, log_likelihood, coords, dims)
334 def from_cmdstanpy(
335 posterior=None,
336 *,
(…)
346 dims=None
347 ):
348 """Convert CmdStanPy data into an InferenceData object.
349
350 Parameters
(…)
378 InferenceData object
379 """
--> 380 return CmdStanPyConverter(
381 posterior=posterior,
382 posterior_predictive=posterior_predictive,
383 predictions=predictions,
384 prior=prior,
385 prior_predictive=prior_predictive,
386 observed_data=observed_data,
387 constant_data=constant_data,
388 predictions_constant_data=predictions_constant_data,
389 log_likelihood=log_likelihood,
390 coords=coords,
391 dims=dims,
392 ).to_inference_data()

File ~\anaconda3\lib\site-packages\arviz\data\io_cmdstanpy.py:261, in CmdStanPyConverter.to_inference_data(self)
252 def to_inference_data(self):
253 """Convert all available data to an InferenceData object.
254
255 Note that if groups can not be created (i.e., there is no output, so
256 the posterior and sample_stats can not be extracted), then the InferenceData
257 will not have those groups.
258 """
259 return InferenceData(
260 **{
--> 261 "posterior": self.posterior_to_xarray(),
262 "sample_stats": self.sample_stats_to_xarray(),
263 "posterior_predictive": self.posterior_predictive_to_xarray(),
264 "predictions": self.predictions_to_xarray(),
265 "prior": self.prior_to_xarray(),
266 "sample_stats_prior": self.sample_stats_prior_to_xarray(),
267 "prior_predictive": self.prior_predictive_to_xarray(),
268 "observed_data": self.observed_data_to_xarray(),
269 "constant_data": self.constant_data_to_xarray(),
270 "predictions_constant_data": self.predictions_constant_data_to_xarray(),
271 "log_likelihood": self.log_likelihood_to_xarray(),
272 }
273 )

File ~\anaconda3\lib\site-packages\arviz\data\base.py:37, in requires.__call__.<locals>.wrapped(cls, *args, **kwargs)
35 if all([getattr(cls, prop_i) is None for prop_i in prop]):
36 return None
---> 37 return func(cls, *args, **kwargs)

File ~\anaconda3\lib\site-packages\arviz\data\io_cmdstanpy.py:100, in CmdStanPyConverter.posterior_to_xarray(self)
93 invalid_cols = (
94 posterior_predictive
95 + predictions
96 + log_likelihood
97 + [col for col in columns if col.endswith("__")]
98 )
99 valid_cols = [col for col in columns if col not in invalid_cols]
--> 100 data = _unpack_frame(self.posterior.sample, columns, valid_cols)
101 return dict_to_dataset(data, library=self.cmdstanpy, coords=self.coords, dims=self.dims)

File ~\anaconda3\lib\site-packages\cmdstanpy\stanfit\mcmc.py:130, in CmdStanMCMC.__getattr__(self, attr)
127 return self.stan_variable(attr)
128 except ValueError as e:
129 # pylint: disable=raise-missing-from
--> 130 raise AttributeError(*e.args)

AttributeError: Unknown variable name: sample
Available variables are y_sigma, local_error_sigma, slope_error_sigma, season_error_sigma, season, nu, trend_slope, local_trend, nu_y, beta_pop, beta_gdp, beta_rate, nu_slope, nu_trend, nu_season, y_forecast, local_trend_forecast, trend_slope_forecast, season_forecast, y_predict, log_likelihood

Can anybody help please?

Thanks
Ants

@WardBrian @mitzimorris have these changed?

if hasattr(fit, "metadata") or hasattr(fit, "stan_vars_cols"):

What cmdstanpy and arviz versions do you have?
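One way to answer the version question is sketched below, using only the standard library. The helper name `installed_version` is made up for this illustration; it reports what the current Python environment sees, which matters when conda and pip installs coexist.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the versions visible to this interpreter.
for name in ("cmdstanpy", "arviz"):
    print(name, installed_version(name))
```

Running this in the same notebook kernel that raised the error shows which installs are actually being imported.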

@ahartikainen To my knowledge those attributes still exist. Depending on what exactly ArviZ is doing, it might run afoul of the fact that fit.a is a synonym for fit.stan_variable("a"), though?
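To illustrate the attribute fallback described here, below is a toy sketch. `ToyFit` is a hypothetical stand-in, not cmdstanpy's actual class; it only mimics the pattern visible in the traceback, where `__getattr__` forwards unknown attribute names to `stan_variable` and converts the `ValueError` into an `AttributeError`.

```python
class ToyFit:
    """Toy stand-in for a fit object; not cmdstanpy's real implementation."""

    def __init__(self, draws):
        self._draws = draws  # maps Stan variable name -> draws

    def stan_variable(self, var):
        if var not in self._draws:
            raise ValueError(f"Unknown variable name: {var}")
        return self._draws[var]

    def __getattr__(self, attr):
        # Only called when normal attribute lookup fails.
        try:
            return self.stan_variable(attr)
        except ValueError as e:
            raise AttributeError(*e.args)


fit = ToyFit({"y_predict": [1.0, 2.0]})
print(fit.y_predict)  # same as fit.stan_variable("y_predict")

try:
    fit.sample  # an attribute that is not a Stan variable
except AttributeError as err:
    print(err)
```

Under this pattern, any code that reads an attribute the fit object no longer has (such as `sample` in the traceback above) ends up with an "Unknown variable name" message instead of the usual "object has no attribute" one.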

Looking at the ArviZ source, that code was added on October 5th 2020 and then changed on October 14th 2020.

No release was made between those commits, so is this a custom install?

Edit: Actually, 1409 did change the code away from that one. So you probably have a version from the anaconda default/main channel?

Hello, thanks for responding. I am on the latest cmdstanpy; I just installed it from the "cmdstanpy – Python interface to CmdStan" page of the CmdStanPy 1.0.7 documentation.

Same with arviz, which I installed yesterday.

Any reason why I am getting this error? Sorry, I do not understand all the programming code in this thread.

Any help will be appreciated.

Regards
Ants.

Can you do

pip install arviz -U

Hello, thanks. I have just done this through pip and re-ran the code.

When I run just this:

idata = az.from_cmdstanpy(posterior=fit_1, posterior_predictive=["y_predict"])

it worked :)

But when I ran this code:

idata = az.from_cmdstanpy(posterior=fit_1, posterior_predictive=["y_predict"], observed_data=["y"])

It came up with the following error:

---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Input In [9], in <cell line: 2>()
1 import arviz as az # do this in cmd via pip install arviz -U
----> 2 idata = az.from_cmdstanpy(posterior=fit_1,posterior_predictive = ["y_predict"], observed_data = ['y'])
3 idata

File ~\AppData\Roaming\Python\Python39\site-packages\arviz\data\io_cmdstanpy.py:831, in from_cmdstanpy(posterior, posterior_predictive, predictions, prior, prior_predictive, observed_data, constant_data, predictions_constant_data, log_likelihood, index_origin, coords, dims, save_warmup, dtypes)
765 def from_cmdstanpy(
766 posterior=None,
767 *,
(…)
780 dtypes=None,
781 ):
782 """Convert CmdStanPy data into an InferenceData object.
783
784 For a usage example read the
(…)
829 InferenceData object
830 """
--> 831 return CmdStanPyConverter(
832 posterior=posterior,
833 posterior_predictive=posterior_predictive,
834 predictions=predictions,
835 prior=prior,
836 prior_predictive=prior_predictive,
837 observed_data=observed_data,
838 constant_data=constant_data,
839 predictions_constant_data=predictions_constant_data,
840 log_likelihood=log_likelihood,
841 index_origin=index_origin,
842 coords=coords,
843 dims=dims,
844 save_warmup=save_warmup,
845 dtypes=dtypes,
846 ).to_inference_data()

File ~\AppData\Roaming\Python\Python39\site-packages\arviz\data\io_cmdstanpy.py:463, in CmdStanPyConverter.to_inference_data(self)
446 def to_inference_data(self):
447 """Convert all available data to an InferenceData object.
448
449 Note that if groups can not be created (i.e., there is no output, so
450 the posterior and sample_stats can not be extracted), then the InferenceData
451 will not have those groups.
452 """
453 return InferenceData(
454 save_warmup=self.save_warmup,
455 **{
456 "posterior": self.posterior_to_xarray(),
457 "sample_stats": self.sample_stats_to_xarray(),
458 "posterior_predictive": self.posterior_predictive_to_xarray(),
459 "predictions": self.predictions_to_xarray(),
460 "prior": self.prior_to_xarray(),
461 "sample_stats_prior": self.sample_stats_prior_to_xarray(),
462 "prior_predictive": self.prior_predictive_to_xarray(),
--> 463 "observed_data": self.observed_data_to_xarray(),
464 "constant_data": self.constant_data_to_xarray(),
465 "predictions_constant_data": self.predictions_constant_data_to_xarray(),
466 "log_likelihood": self.log_likelihood_to_xarray(),
467 },
468 )

File ~\AppData\Roaming\Python\Python39\site-packages\arviz\data\base.py:65, in requires.__call__.<locals>.wrapped(cls)
63 if all((getattr(cls, prop_i) is None for prop_i in prop)):
64 return None
---> 65 return func(cls)

File ~\AppData\Roaming\Python\Python39\site-packages\arviz\data\io_cmdstanpy.py:412, in CmdStanPyConverter.observed_data_to_xarray(self)
409 @requires("observed_data")
410 def observed_data_to_xarray(self):
411 """Convert observed data to xarray."""
--> 412 return dict_to_dataset(
413 self.observed_data,
414 library=self.cmdstanpy,
415 coords=self.coords,
416 dims=self.dims,
417 default_dims=[],
418 index_origin=self.index_origin,
419 )

File ~\AppData\Roaming\Python\Python39\site-packages\arviz\data\base.py:306, in dict_to_dataset(data, attrs, library, coords, dims, default_dims, index_origin, skip_event_dims)
303 dims = {}
305 data_vars = {}
--> 306 for key, values in data.items():
307 data_vars[key] = numpy_to_data_array(
308 values,
309 var_name=key,
(…)
314 skip_event_dims=skip_event_dims,
315 )
316 return xr.Dataset(data_vars=data_vars, attrs=make_attrs(attrs=attrs, library=library))

AttributeError: 'list' object has no attribute 'items'

https://python.arviz.org/en/latest/api/generated/arviz.from_cmdstanpy.html#arviz.from_cmdstanpy

For the data arguments you need to pass a dictionary containing the data that was used. I don't think the cmdstanpy fit object holds these.
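As a rough sketch of what "a dictionary containing the used data" can look like: the helper below (the name `select_data` and the example values are made up for illustration) turns a list of variable names into a name-to-values dictionary taken from the data that was passed to the sampler.

```python
def select_data(stan_data, names):
    """Build a {name: values} dict for the requested data variables."""
    missing = [name for name in names if name not in stan_data]
    if missing:
        raise KeyError(f"not found in data: {missing}")
    return {name: stan_data[name] for name in names}


# Made-up data standing in for the dictionary passed to sample(data=...).
example_data = {"N": 3, "y": [1.2, 0.7, 1.9]}
observed = select_data(example_data, ["y"])
print(observed)
```

A dictionary of this form is what the `observed_data` argument expects, in contrast to the plain list of names accepted by `posterior_predictive`.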

Thanks once again.

So I made these changes:

observed_data = LVR_data['y']
idata = az.from_cmdstanpy(posterior=fit_1, posterior_predictive=["y_predict"], observed_data=observed_data)

The above code ran okay, but when I tried to run

az.plot_ppc(idata, num_pp_samples=10, data_pairs={"y": "y_predict"}, mean=False)

it came up with the error message below. I might be doing something very silly here. Please guide me, I will appreciate it.


KeyError Traceback (most recent call last)
File ~\AppData\Roaming\Python\Python39\site-packages\arviz\utils.py:71, in _var_names(var_names, data, filter_vars)
70 try:
---> 71 var_names = _subset_list(var_names, all_vars, filter_items=filter_vars, warn=False)
72 except KeyError as err:

File ~\AppData\Roaming\Python\Python39\site-packages\arviz\utils.py:149, in _subset_list(subset, whole_list, filter_items, warn)
148 if not np.all(existing_items):
--> 149 raise KeyError(f"{np.array(subset)[~existing_items]} are not present")
151 return subset

KeyError: '[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23\n 24 25 26] are not present'

The above exception was the direct cause of the following exception:

KeyError Traceback (most recent call last)
Input In [35], in <cell line: 1>()
----> 1 az.plot_ppc(idata, num_pp_samples=10,data_pairs={"y": "y_predict"},mean=False)

File ~\AppData\Roaming\Python\Python39\site-packages\arviz\plots\ppcplot.py:261, in plot_ppc(data, kind, alpha, mean, observed, color, colors, grid, figsize, textsize, data_pairs, var_names, filter_vars, coords, flatten, flatten_pp, num_pp_samples, random_seed, jitter, animated, animation_kwargs, legend, labeller, ax, backend, backend_kwargs, group, show)
259 var_names = _var_names(var_names, observed_data, filter_vars)
260 pp_var_names = [data_pairs.get(var, var) for var in var_names]
--> 261 pp_var_names = _var_names(pp_var_names, predictive_dataset, filter_vars)
263 if flatten_pp is None and flatten is None:
264 flatten_pp = list(predictive_dataset.dims.keys())

File ~\AppData\Roaming\Python\Python39\site-packages\arviz\utils.py:74, in _var_names(var_names, data, filter_vars)
72 except KeyError as err:
73 msg = " ".join(("var names:", f"{err}", "in dataset"))
---> 74 raise KeyError(msg) from err
75 return var_names

KeyError: "var names: '[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23\n 24 25 26] are not present' in dataset"

Not sure. Make sure that the observed data has the correct shape in idata.
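One possible reading of the KeyError, sketched below under a loud assumption: that `LVR_data['y']` is a pandas Series (the thread does not say). The earlier traceback shows the converter looping over `data.items()`, and the plot_ppc traceback shows names being mapped with `data_pairs.get(var, var)`. A Series yields (index, value) pairs from `.items()`, so each of the 27 observations would become its own variable named 0 to 26 instead of one variable named y. `ToySeries` and the helper functions are hypothetical stand-ins written for this illustration, not ArviZ or pandas code.

```python
class ToySeries:
    """Hypothetical stand-in for a pandas Series with a default integer index."""

    def __init__(self, values):
        self.values = list(values)

    def items(self):
        # Like a Series, yield (index label, value) pairs.
        return enumerate(self.values)


def observed_names(observed_data):
    """Variable names a converter looping over .items() would create."""
    return [name for name, _ in observed_data.items()]


def missing_pp_names(observed_data, data_pairs, pp_names):
    """Names that a plot_ppc-style lookup would fail to find."""
    wanted = [data_pairs.get(name, name) for name in observed_names(observed_data)]
    return [name for name in wanted if name not in pp_names]


y = ToySeries([0.5, 1.5, 2.5])
pairs = {"y": "y_predict"}

# Passing the bare series: each element becomes its own variable.
print(missing_pp_names(y, pairs, ["y_predict"]))
# Wrapping it in a dict: one variable named "y", which maps to "y_predict".
print(missing_pp_names({"y": y}, pairs, ["y_predict"]))
```

If that reading is right, passing `observed_data={"y": LVR_data["y"]}` to `az.from_cmdstanpy` should give a single observed variable `y`, and printing `idata.observed_data` would confirm its name and shape before calling `az.plot_ppc`.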