How to feed data into row_vector[K] from pystan

I am running this tutorial using pystan
http://data.princeton.edu/pop510/hospStan.html

One of the input data declarations is row_vector[K] x[N]; // predictors

I read the txt file with numpy.genfromtxt and converted the 4 designated columns to an array of tuples, just to try it out, knowing it is probably not correct. There is no obvious way to convert to an array of row_vectors.

The pystan.stan() call fails with
accessing element out of range. index 0 out of range;

I can't find anywhere in the manual that explains how to feed a row_vector[K] from a Python client. Can someone please explain?

After you have read the data from hospital.txt with np.genfromtxt, you have an NxD numpy array of floats.

import numpy as np
hosp = np.genfromtxt("./hospital.txt", skip_header=1, skip_footer=1)

Then you can slice the data into a dictionary in almost the same way as the R example in the tutorial does.

# R (list)
hosp_data <- list(N=nrow(hosp),M=501,K=4,y=hosp[,1],x=hosp[,2:5],g=hosp[,6])

# Python (dictionary)
hosp_data = dict(N=len(hosp), M=501, K=4, y=hosp[:, 0].astype(int), x=hosp[:, 1:5], g=hosp[:, 5].astype(int))

Here x is an NxK numpy array, which is what row_vector[K] x[N] expects.
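A minimal sketch of the shape requirement, using a small made-up array in place of hospital.txt (the values, and the toy M=2, are assumptions for illustration only). The point is that row_vector[K] x[N] is fed as a 2-D array of shape (N, K), and that the integer-typed entries are cast to int:

```python
import numpy as np

# Made-up stand-in for the parsed file (not real data):
# column 0 = y, columns 1-4 = predictors, column 5 = group id.
hosp = np.array([
    [1.0, 0.5, 1.2, 0.0, 3.1, 1.0],
    [0.0, 1.5, 0.2, 1.0, 2.7, 1.0],
    [1.0, 0.9, 0.8, 0.0, 1.4, 2.0],
])

K = 4
hosp_data = dict(
    N=hosp.shape[0],
    M=2,                        # number of groups in this toy array (501 in the tutorial)
    K=K,
    y=hosp[:, 0].astype(int),   # integer outcome
    x=hosp[:, 1:1 + K],         # (N, K) float array for row_vector[K] x[N]
    g=hosp[:, 5].astype(int),   # integer group index
)

# Sanity checks worth running before handing the dictionary to pystan.
assert hosp_data["x"].shape == (hosp_data["N"], hosp_data["K"])
assert hosp_data["g"].max() <= hosp_data["M"]
```

If the first assertion fails, the predictors are not in the (N, K) layout that the model's data block declares.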

Thank you sir. Works like a charm.

Could you also help me with
print(hfit, pars=c("alpha","beta[1]","beta[2]","beta[3]","beta[4]","sigma"),
      probs = c(0.025, 0.50, 0.975), digits_summary=3)

and
traceplot(hfit, c("alpha","beta[1]","beta[2]","beta[3]","beta[4]","sigma"),
          ncol=1, nrow=6, inc_warmup=F)

I can't find the pystan equivalent. I tried hfit.summary() and hfit.traceplot(); nothing gets printed.
This is apparently one of the challenges of working with pystan instead of rstan: it's hard to find references.

You can do print(hfit). The problem is that you cannot choose which parameters are printed.
Also, hfit.plot(pars=('alpha', 'beta')) gives you the traceplots, but again you cannot set the number of columns or rows, so all the 'beta' variables are plotted in the same figure.
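As a possible workaround for the layout limitation, here is a rough matplotlib sketch that draws one trace panel per parameter in a single column. The helper name trace_grid is hypothetical, and the draws here are random numbers standing in for real samples. With a fitted model, I would expect an array of this shape (iterations, chains, parameters) to come from hfit.extract(permuted=False), but treat that as an assumption and check it against your pystan version.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt


def trace_grid(draws, names):
    """Plot one trace panel per parameter, stacked in a single column.

    draws: array of shape (iterations, chains, parameters)
    names: one label per parameter, in the same order as the last axis
    """
    fig, axes = plt.subplots(nrows=len(names), ncols=1, sharex=True,
                             figsize=(8, 1.8 * len(names)), squeeze=False)
    for j, name in enumerate(names):
        ax = axes[j, 0]
        ax.plot(draws[:, :, j], linewidth=0.6)  # one line per chain
        ax.set_ylabel(name)
    axes[-1, 0].set_xlabel("iteration")
    fig.tight_layout()
    return fig


# Random stand-in for post-warmup draws: 200 iterations, 4 chains, 6 parameters.
rng = np.random.default_rng(0)
names = ["alpha", "beta[1]", "beta[2]", "beta[3]", "beta[4]", "sigma"]
draws = rng.normal(size=(200, 4, len(names)))

fig = trace_grid(draws, names)
# fig.savefig("traceplot.png")  # or plt.show() with an interactive backend
```

This only mirrors the nrow=6, ncol=1 layout of the R call; it does not reproduce the other options of rstan's traceplot.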

These are known issues: see https://github.com/stan-dev/pystan/issues/357 and https://github.com/stan-dev/pystan/issues/201 for print(fit) and the plotting problem. The idea is that we update our plotting code and move to mcmcplotlib.

Hi, I made PR #359 (see link) to enable selecting parameters in print (or something close to that).

print(pystan.misc._print_stanfit(fit, pars=['alpha', 'beta[0]', 'sigma'], probs=(0.025, 0.50, 0.975), digits_summary=3))

For the current situation, you can create a pandas DataFrame from the summary.

import pandas as pd
summary_dict = hfit.summary()
summary_df = pd.DataFrame(data=summary_dict['summary'], 
                          index=summary_dict['summary_rownames'], 
                          columns=summary_dict['summary_colnames'])
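Once the summary is in a DataFrame, something close to the R pars/probs/digits_summary arguments can be had with ordinary pandas indexing. The sketch below uses a made-up dictionary that only mimics the three keys used above, so it runs without a fitted model. The numbers and the row and column labels are assumptions; check summary_df.index and summary_df.columns for the exact labels your pystan version produces.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for hfit.summary(); values and labels are illustrative only.
summary_dict = {
    "summary": np.array([
        [0.10, 0.01, 0.20, -0.30, 0.10, 0.50],
        [1.20, 0.02, 0.30, 0.60, 1.20, 1.80],
        [0.80, 0.01, 0.10, 0.60, 0.80, 1.00],
    ]),
    "summary_rownames": ["alpha", "beta[1]", "sigma"],
    "summary_colnames": ["mean", "se_mean", "sd", "2.5%", "50%", "97.5%"],
}

summary_df = pd.DataFrame(data=summary_dict["summary"],
                          index=summary_dict["summary_rownames"],
                          columns=summary_dict["summary_colnames"])

pars = ["alpha", "sigma"]                      # rows to keep, like pars= in R
cols = ["mean", "sd", "2.5%", "50%", "97.5%"]  # columns to keep, like probs= in R
subset = summary_df.loc[pars, cols].round(3)   # round(3) plays the role of digits_summary=3
print(subset)
```

Note that this only selects quantile columns that are already in the summary; it does not compute new ones.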

Or, for each chain:

# Build one summary table per chain; the last axis of 'c_summary' indexes the chains
chain_summary_list = []
for chain in range(summary_dict['c_summary'].shape[-1]):
    table_chain = pd.DataFrame(data=summary_dict['c_summary'][:, :, chain],
                               index=["{}_{}".format(name, chain) for name in summary_dict['c_summary_rownames']],
                               columns=summary_dict['c_summary_colnames'])
    chain_summary_list.append(table_chain)

# Look at the summary for a specific chain
chain_summary_list[0]
# or create a single dataframe from the per-chain dataframes
summary_ = pd.concat(chain_summary_list, axis=0)

I hope these examples will work for you.

I am going to back off pyStan in favor of RStan until pyStan catches up with the plot and print functions a bit.