Hi, I am trying to create my own custom distribution function in PyStan (version 2.19.1.1). This is just a toy example for me to learn how to write custom distribution functions, so I didn't put much thought into the choice of prior, etc. I feel like it should be simple, but I keep getting an error when I try to compile. The error I see is as follows:
ValueError: Failed to parse Stan model ‘anon_model_bf95be6829e9dc127f6cab4591949241’. Error message:
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
No matches for:
  real ~ newexp(real)
Available argument signatures for newexp:
  vector ~ newexp(real)
Real return type required for probability function.
I believe I should be returning a real at the end of my custom function… but maybe I just don't understand what is actually happening. Can anyone help?
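The error seems to mention a real ~ newexp(real) signature, so for reference this is the kind of scalar-argument version I have been wondering about (purely a guess on my part; the variable name scalar_code and the one-line density are my own and untested):
scalar_code = """
functions{
  real newexp_lpdf(real x, real lam){
    return log(lam) - lam * x;  // guess: log of lam * exp(-lam * x) for a single real x
  }
}
"""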
Code used to generate a toy dataset:
import numpy as np
import scipy.stats as stats

# Hyperparameters:
lmbda = [0.01, 0.1, 1]
num_states = len(lmbda)  # total number of states our model will have

# Parameters
toy_scale = []
for l in lmbda:
    # draw a random value from an exponential to determine the state's scale parameter
    toy_scale.append(stats.expon.rvs(loc=0, scale=1/l))
toy_shape = [0.5, 0.5, 0.5]  # all the shapes are assumed to be the same (0.5)

# Generate a toy dataset and state indicators
toy_data = []        # data
toy_data_state = []  # state indicator
len_state = 100      # number of data points belonging to each state
for i in range(num_states):
    # each state's parameters
    temp_shape = toy_shape[i]  # shape of the Weibull
    temp_scale = toy_scale[i]  # scale of the Weibull
    # generate a sequence of random variables drawn from the Weibull
    toy_data.extend(stats.weibull_min.rvs(temp_shape, loc=0, scale=temp_scale, size=len_state))
    # label the states
    toy_data_state.extend(np.repeat(i, len_state))

# Convert from lists to arrays (to pass into Stan)
toy_data = np.array(toy_data)
toy_data_state = np.array(toy_data_state)
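For reference, a quick check of the generated arrays (both should contain num_states * len_state = 300 entries, and the state labels should be 0, 1, 2):
print(toy_data.shape)             # expected: (300,)
print(toy_data_state.shape)       # expected: (300,)
print(np.unique(toy_data_state))  # expected: [0 1 2]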
Code used for the PyStan model (this is what gives me the error message):
# PyStan modeling
import pystan

example_code = """
functions{
  real newexp_lpdf(vector x, real lam){
    vector[num_elements(x)] prob;
    real lprob;
    for (i in 1:num_elements(x)){
      prob[i] = lam * exp(-lam * x[i]);
    }
    lprob = sum(log(prob));
    return lprob;
  }
}
data {
  int<lower=0> N;                      // number of observations
  int<lower=0> K;                      // number of states
  int<lower=0, upper=K> state_id[N];   // each observation's state membership
  vector<lower=0>[N] y;                // observed data
  vector<lower=0>[K] lam1;             // known hyperparameters
}
parameters {
  vector<lower=0>[K] scale;
}
model {
  for (k in 1:K){
    scale[k] ~ newexp(lam1[k]);  // hyperparameter depends on the state
  }
  for (n in 1:N){
    y[n] ~ weibull(0.5, scale[state_id[n]]);  // likelihood depends on the state (state_id)
  }
}
"""

example_dat = {'N': len(toy_data),
               'K': num_states,
               'state_id': toy_data_state + 1,  # +1 because Stan indexes from 1, while Python indexes from 0
               'y': toy_data,
               'lam1': lmbda}

sm = pystan.StanModel(model_code=example_code)
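Once the model compiles, I intend to fit it with an ordinary PyStan sampling call along these lines (the iteration, chain, and seed settings are just placeholders):
fit = sm.sampling(data=example_dat, iter=2000, chains=4, seed=1)
print(fit)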
Additional information:
- Operating System: Windows 10
- Python Version: 3.7.5
- PyStan Version: 2.19.1.1
- Compiler/Toolkit: Databricks