Dear all,
I generated data X from a hierarchical model using some true parameter \theta^*, then fitted the same hierarchical model with PyStan on the generated data X.
I initialized the starting parameter (\theta_{stan}) to the true parameter \theta^*, but after the PyStan model converged, the parameter had moved to some wrong value \theta_{stan}.
When I compute the log probability with the fit.log_prob() function, I find that the true parameter \theta^* indeed has a higher log prob, yet the model still converges to the lower-log-prob parameter \theta_{stan}.
Does anyone know why? It seems that if PyStan uses log_prob as its maximization objective, it should find some \theta_{stan} close to the true \theta^*.
Here is the convergence diagnosis; the true sigma2 is 0.0155868 and the true v[0] is 1.76456398.
Here is the log_prob comparison
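Roughly, the comparison was computed like this (a sketch; fit is the PyStan sampling fit, and theta_star_dict / theta_stan_dict are hypothetical dicts holding the constrained parameter values):

# fit.log_prob() works on the *unconstrained* scale, so the constrained
# parameter dicts are first mapped through fit.unconstrain_pars().
upar_star = fit.unconstrain_pars(theta_star_dict)   # true theta*
upar_stan = fit.unconstrain_pars(theta_stan_dict)   # fitted theta_stan

lp_star = fit.log_prob(upar_star)
lp_stan = fit.log_prob(upar_stan)
print("log_prob at theta*:    ", lp_star)
print("log_prob at theta_stan:", lp_stan)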
Hi,
HMC/NUTS do not maximize log_prob, but our optimize method does.
See for example 15.1 Hamiltonian Monte Carlo | Stan Reference Manual
and
tl;dr
The typical set (at some level of coverage) is the set of parameter values for which the log density (the target function) is close to its expected value. As has been much discussed, this is not the same as the posterior mode. In a d-dimensional unit normal distribution with a high value of d, the 99% typical set looks like an annulus or sphere or doughnut, corresponding to the points whose distance from the origin is approximately \sqrt{d}. This intuition is important because it helps un…
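A quick numpy sketch of this effect (sizes are illustrative): in d = 100 dimensions, draws from a unit normal concentrate at radius about \sqrt{d} = 10, where the log density is far below its value at the mode.

import numpy as np

d, n = 100, 10000
rng = np.random.default_rng(0)
x = rng.standard_normal((n, d))        # draws from a d-dimensional unit normal

radii = np.linalg.norm(x, axis=1)
print(radii.mean())                    # ~ sqrt(d) = 10, not 0

# Unnormalized log density -0.5 * ||x||^2:
print((-0.5 * radii**2).mean())        # ~ -d/2 = -50 for typical draws
print(0.0)                             # log density at the mode (origin)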
Thanks for the reply. I just tried sm.optimizing(data=…, init=init_list),
but it still seems to find the sub-optimal point. Isn't that weird, given that \theta^* is my init_list?
Sorry, it's hard to tell without the model (Stan code).
How did you define the init?
Sure,
here is my Stan code:
ppca_code = """
data {
int D; //number of dimensions
int N; //number of data
int Q; //number of principle components
vector[D] x[N]; //data
real a_vj; // w_j prior
real epsilon;// w_j mean
real a_sigma2; // sigma2 prior
real beta_sigma2;// sigma2 mean
}
transformed data {
matrix[D,D] S;
S = x[1] * x[1]';
for (n in 2:N){
S += x[n] * x[n]';
}
S = S/N;
}
parameters {
ordered[Q] v; // v_j
real<lower=0> sigma2; //data sigma2
matrix[Q,D] W; //projection matrix
}
model {
matrix[D,D] C; //covaraince matrix
for(j in 1:Q){
v[j] ~ inv_gamma(a_vj, epsilon * (a_vj -1));
W[j] ~ multi_normal(rep_vector(0,D), v[j] * diag_matrix(rep_vector(1, D)));
}
sigma2 ~ inv_gamma(a_sigma2, beta_sigma2);
C = crossprod(W)+ sigma2 * diag_matrix(rep_vector(1, D));
target += - N/2 *(log_determinant (C) + trace( C\S));
}
generated quantities {
vector[D] y_hat[N]; //predictive
for (n in 1:N) {
y_hat[n] = multi_normal_rng(rep_vector(0,D), crossprod(W)+ sigma2 * diag_matrix(rep_vector(1, D)));
}
}
"""
And here is how I initialize the parameters:
init_list = []
for i_ in range(n_chains):
    temp_dict = {
        'v': sorted(v_star_list),
        'sigma2': sigma2_star,
        'W': W_star.T
    }
    init_list.append(temp_dict)
And after I ran
fit_standard = sm.optimizing(data=ppca_dat_standard, init=init_list)
fit_standard found values different from my init_list, even though the init is the true \theta^* that has the higher log_prob.
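For reference, the same call with the init passed as a single dict, which (as far as I can tell from the PyStan 2 API) is what optimizing expects, since there are no chains:

# optimizing performs one optimization run, so init is a single dict
# rather than a per-chain list (my reading of the PyStan 2 API).
fit_standard = sm.optimizing(data=ppca_dat_standard, init=init_list[0])
print(fit_standard['sigma2'])   # point estimates, keyed by parameter name
print(fit_standard['v'])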
btw, I just wanted to be a bit more specific about optimize:
Technically, our optimizer finds penalized maximum likelihood estimates. They're not MAP estimates because we throw away the Jacobian in the implicit priors on the transformed parameters. We'll probably be adding a proper MAP estimator soon.
Andrew likes to think of all these methods as approximate Bayes. The penalized max likelihood gives you a Laplace approximation from which you can gauge uncertainty in the same way as variational inference, with the difference being centering on the …