Chains don't converge with extremely low effective sample size and location parameter is inf, but must be finite

Dear Stanners:

I am urgently working on an asset (stock) price model known as the Log-Periodic Power Law (LPPL, LPPLS or JLS) model, but I get the following exception warnings at the beginning of sampling:

The current Metropolis proposal is about to be rejected because of the following issue:
Exception: lognormal_lpdf: Location/Scale parameter is inf, but must be finite!
If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine, but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

I am not sure whether this warning is a serious issue, since it only occurs at the beginning of sampling. However, the Markov chains don’t converge and the effective sample size is extremely low. I don’t know what causes this. Could anyone kindly help me identify what is wrong in my model specification?

Many thanks!
Alan


The LPPL model is formulated as:

\mathrm{ln}p_t = A + B(t_c - t)^m + C(t_c - t)^m \cos(\omega \mathrm{ln}(t_c - t) - \phi) + \varepsilon_t

  • \mathrm{ln}p_t: log price at time t before t_c
  • t_c > \max(t): critical time (i.e., time of transition to a new regime or bubble termination)
  • A > 0: expected log price at the critical time t_c
  • B < 0: amplitude of the power law acceleration
  • -1<C<1: amplitude of the log-periodic oscillations
  • 0 < m < 1: degree of the power law growth of prices
  • \omega: frequency of oscillations during a bubble
  • 0 < \phi < 2\pi: time scale of the oscillations, a phase parameter
  • \varepsilon_t is the noise to the log price at time t
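
As a quick sanity check on the formula, the deterministic part of the model (everything except \varepsilon_t) can be evaluated directly. The sketch below is a minimal Python version, assuming scalar inputs; the function name lppl_log_price is made up for illustration and is not part of any library.

```python
import math

def lppl_log_price(t, tc, A, B, C, m, omega, phi):
    """Expected log price under LPPL at time t (hypothetical helper)."""
    dt = tc - t
    if dt <= 0:
        # The formula is only defined before the critical time.
        raise ValueError("t must be strictly less than tc")
    power = dt ** m
    return A + B * power + C * power * math.cos(omega * math.log(dt) - phi)
```

With C = 0 the oscillation term vanishes and only the power-law part A + B(t_c - t)^m remains, which makes the function easy to check by hand.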

The Stan model is:

data{
    int<lower=0> N;           // number of observations (trading days)
    vector[N] t;              // time index
    vector<lower=0>[N] p;     // price
}

parameters{
    real<lower=0.001> A;
    real<upper=-0.001> B;
    real<lower=-0.999, upper=0.999> C;
    real<lower=max(t)+1> tc;
    real<lower=0.001, upper=0.999> m;
    real<lower=0.001> omega;
    real<lower=0, upper=2*pi()> phi;
    
    real<lower=0> sigma;  // standard deviation of noise of log price
}

model{
     // priors
    A ~ uniform(0.001, 50);
    B ~ uniform(-50, -0.001);
    C ~ uniform(-0.999, 0.999);
    tc ~ uniform(0, 200);
    m ~ uniform(0.001, 0.999);
    omega ~ uniform(0, 20);
    phi ~ uniform(0, 2*pi());
    
    sigma ~ normal(0, 50);
    
    // likelihood
    for (n in 1: N){
        p[n] ~ lognormal(A + B*(tc-t[n])^m + C*(tc-t[n])^m*cos(omega*log(tc-t[n])-phi), sigma);
    }
}

Here is the data file of the S&P 500 index price: SP500.csv (20.2 KB)

My Pystan code:

import pandas as pd
import pystan

# Data for Stan model
data = pd.read_csv('SP500.csv')
data_stan = {'N': len(data),
             't': data['time_idx'],
             'p': data['price']
            }
# Compile the Stan model (model_code is the Stan model)
model = pystan.StanModel(model_code=model_code)

# Fit the Stan model
fit = model.sampling(data=data_stan,
                     chains=3, 
                     iter=4000,
                     warmup=2000,
                     seed=123,
                     n_jobs= 6,
                     control={'adapt_delta': 0.95, 'max_treedepth': 15}
                    )

Sounds like some bit of:

A + B*(tc-t[n])^m + C*(tc-t[n])^m*cos(omega*log(tc-t[n])-phi)

is blowing up.

The easiest way to debug this will be to print the different terms here and see if you can find the NaNs.

Like:

print(A);
print(B*(tc-t[n])^m);
...

Once you find a term that is giving you NaNs, dig into that. Maybe there’s a NaN in the inputs or something? Or maybe something is blowing up numerically.
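
The same term-by-term check can also be done outside Stan for a given set of parameter values. The following is a minimal Python sketch, not the sampler's own arithmetic: the helper names (safe_cos, lppl_terms, non_finite_terms) are hypothetical, it assumes 0 < m < 1, and it simplifies by treating any tc - t <= 0 as NaN.

```python
import math

def safe_cos(x):
    # math.cos raises ValueError on +/-inf; return NaN instead,
    # which is what C-style floating-point cos produces.
    return math.cos(x) if math.isfinite(x) else math.nan

def lppl_terms(t, tc, A, B, C, m, omega, phi):
    """Split the location expression into its three additive terms."""
    dt = tc - t
    if dt > 0:
        power = dt ** m
        log_dt = math.log(dt)
    else:
        # Simplification: a non-positive time gap is reported as NaN.
        power = math.nan
        log_dt = math.nan
    return {
        "A": A,
        "B*(tc-t)^m": B * power,
        "C*(tc-t)^m*cos(...)": C * power * safe_cos(omega * log_dt - phi),
    }

def non_finite_terms(terms):
    """Names of the terms that are NaN or +/-inf."""
    return [name for name, value in terms.items() if not math.isfinite(value)]
```

Calling non_finite_terms(lppl_terms(...)) with a suspicious draw shows which term, if any, is responsible for a non-finite location.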


The prime suspect is the “uniform” distribution in the likelihood, which may return inf.

Thanks so much for your answer! I tried to print out one by one but didn’t find any problematic values.

However, I found a more serious issue: the multiple chains don’t converge. I have played around with the priors and parameter constraints, but nothing worked.

Thanks for your reply.

Do you mean the uniform prior distribution? How can uniform priors lead to this issue?

Technically, declaring real<lower=0.001> omega with only a lower bound doesn’t prevent a proposal of omega=inf. When that happens (more generally, when the proposed value is outside the lower or upper bound), the uniform distribution returns inf, and the power function or cos function then gives NaN. For a proper model this can happen in the initialization window, so I wouldn’t worry about it. I don’t see any issue in your model, except that you may have meant p[n] instead of y[n].
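
One reading of the point above, offered as an assumption rather than a diagnosis confirmed in the thread: several declared constraints in the posted model are wider than the supports of their uniform priors, so a value can satisfy the constraint and still have zero prior density (log density of negative infinity). The Python sketch below just tabulates the bounds from the posted parameters and model blocks. T_MAX is a hypothetical stand-in for max(t), and the helper names are made up.

```python
import math

def uniform_lpdf(x, lo, hi):
    """Log density of Uniform(lo, hi); negative infinity outside the support."""
    if lo <= x <= hi:
        return -math.log(hi - lo)
    return -math.inf

T_MAX = 100.0  # hypothetical value of max(t); the real one depends on SP500.csv

# Bounds declared in the parameters block, as posted.
DECLARED = {
    "A": (0.001, math.inf),
    "B": (-math.inf, -0.001),
    "C": (-0.999, 0.999),
    "tc": (T_MAX + 1, math.inf),
    "m": (0.001, 0.999),
    "omega": (0.001, math.inf),
    "phi": (0.0, 2 * math.pi),
}

# Supports of the uniform priors in the model block, as posted.
PRIOR = {
    "A": (0.001, 50.0),
    "B": (-50.0, -0.001),
    "C": (-0.999, 0.999),
    "tc": (0.0, 200.0),
    "m": (0.001, 0.999),
    "omega": (0.0, 20.0),
    "phi": (0.0, 2 * math.pi),
}

def mismatched(declared, prior):
    """Parameters whose declared bounds differ from the prior support."""
    return sorted(name for name in declared if declared[name] != prior[name])
```

For example, omega = 25 satisfies the declared lower bound of 0.001, yet uniform_lpdf(25.0, 0.0, 20.0) is negative infinity, so any such proposal is rejected.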

Thanks for pointing out the typo. So I guess this inf/NaN during initialization is not an issue, yet I don’t know why the multiple chains don’t converge and the sampling efficiency is extremely low given the low effective sample size.

I would be really grateful if you could take a look at the convergence issue.
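
For readers unfamiliar with how non-convergence across chains is usually quantified, here is a minimal sketch of the basic Gelman-Rubin potential scale reduction factor in plain Python. It is the textbook version, not the split variant that Stan reports, and the function name is an assumption made for illustration. It expects equal-length chains of draws for a single parameter.

```python
import math
import statistics

def gelman_rubin_rhat(chains):
    """Basic R-hat for one parameter; `chains` is a list of equal-length lists."""
    n = len(chains[0])
    # W: average of the within-chain sample variances.
    within = statistics.mean([statistics.variance(c) for c in chains])
    # B / n: sample variance of the chain means.
    between_over_n = statistics.variance([statistics.mean(c) for c in chains])
    # Pooled estimate of the posterior variance.
    var_plus = (n - 1) / n * within + between_over_n
    return math.sqrt(var_plus / within)
```

Values far above 1 indicate that the chains are exploring different regions, which is the symptom described here.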

The way you use uniform priors is not idiomatic Stan; see
https://statmodeling.stat.columbia.edu/2017/11/28/computational-statistical-issues-uniform-interval-priors/
and


Thank you very much. The model worked after I reparameterized it by taking out the angle parameter \phi and switched to normal priors throughout. I also found that performance depends significantly on the initial values, so I have to specify reasonable initial values manually to get faster sampling and convergence.
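
The final model and the initial values actually used are not shown in the thread, so the following is only a sketch of how per-chain initial values could be built in Python. The parameter names are taken from the originally posted model with phi dropped, and every numeric range is a hypothetical placeholder that would need to be adapted to the data.

```python
import random

def make_inits(n_chains, t_max, seed=123):
    """One dict of starting values per chain (all ranges are illustrative)."""
    rng = random.Random(seed)
    inits = []
    for _ in range(n_chains):
        inits.append({
            "A": rng.uniform(5.0, 9.0),
            "B": rng.uniform(-2.0, -0.1),
            "C": rng.uniform(-0.5, 0.5),
            "tc": t_max + rng.uniform(5.0, 50.0),  # strictly after the last observation
            "m": rng.uniform(0.2, 0.8),
            "omega": rng.uniform(4.0, 12.0),
            "sigma": rng.uniform(0.01, 0.2),
        })
    return inits
```

In PyStan 2 a list like this, with one dict per chain, can be passed through the init argument of model.sampling, provided the keys match the parameter names in the compiled model.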