Stochastic Volatility Model does not always converge

Hi guys,

I have issues with a stochastic volatility model I wrote. I run the model on different subsets of the data (i.e. different markets) and different frequencies (yearly vs quarterly). I have the following observations;

  1. Yearly gives me no issues.
  2. Quarterly converges nicely for some markets, but not all (about a 1/3).

The stochastic volatility is in a state Eq. of a repeat sales model. With a repeat sales model I subtract the price when the good is sold from the price when it is bought (in logs, so this results in a log return). The change in price is modeled in a state Eq. as a random walk with time-varying signal. The signal is also modeled as a random walk. Moreover, I also model the noise in the measurement Eq. as a random walk. I take the exponent of the signal/noise to ensure no negative variance parameters. Here is the code;

model_string = "
data {
	int<lower=0> N;    // number of observations
	int<lower=0> Nt;   // number of time periods
	vector[N] yVar;     // log returns of object i
	int sell[N];             // index of when is sold (1,...,T)
	int buy[N];            // index of when is bought (1,...,T), with buy < sell
parameters {
	real<lower=0> 	        sigEpsH; // sigma time-varying noise
	real<lower=0> 	        sigEtaH; // sigma time-varying signal
	vector[Nt-1] 	        innovMu;        // innovations of index
	vector[Nt-2] 	        innovEta;       // innovations of signal
	vector[Nt-1] 	        innovEps;      // innovations of noise
	real	                pEta;                     // signal[t=1]
	real	                pEps;                    // noise[t=1]
transformed parameters {
	vector[Nt] 	mu;
	vector[Nt-1]	sigEta;
	vector[Nt]	sigEps;

	sigEta[1]	= 	pEta;
	sigEps[1]	= 	pEps;
	for(t in 2:(Nt-1))
	 sigEta[t]	=	sigEta[t-1] + innovEta[t-1]*sigEtaH;
        for(t in 2:Nt)
	 sigEps[t]	=	sigEps[t-1] + innovEps[t-1]*sigEpsH;

	mu[1]	  = 0;
	for(t in 2:Nt)
	 mu[t]	  = mu[t-1] + innovMu[t-1]*exp(sigEta[t-1]);
model {
	sigEtaH	        ~	normal(0,1);
	sigEpsH	        ~	normal(0,1);
	innovMu		~	normal(0,1);
	innovEta	        ~	normal(0,1);
	innovEps	        ~	normal(0,1);
	pEta	 	        ~	cauchy(0,5);
	pEps		~	cauchy(0,5);
	yVar		        ~ 	normal(mu[sell] - mu[buy],
				        exp(sigEps[sell]) + exp(sigEps[buy]) );
generated quantities {
        vector[Nt-1] signal;
        vector[Nt]   noise;

	vector[Nt] index;
	vector[N]  log_lik;  

        signal = exp(sigEta);
        noise  = exp(sigEps);

	index = exp(mu)*100;
	for(i in 1:N)
	log_lik[i] = 	normal_lpdf(yVar[i]|mu[sell[i]] - mu[buy[i]],
				exp(sigEps[sell[i]]) + exp(sigEps[buy[i]]) );

The amount of iterations / chains and adapt_delta / max_tree_dept are all fine and probably have nothing to do with it, as the estimates (and Rhat) really explode (I sometimes get estimates of in the 1M, where it should be something like 0.05!). My guess is that there are too many parameters going over the same index, and all are modeled random walks. In essence a collinearity issue. Problem is that I do not really want to change the model itself. (And it does work on a yearly frequency.) Is there any transformation trick that you guys know that could help me out? I also noticed that if I put more stringent priors (or truncation) in the model block (parameter block), it can also converge, but again, I don’t want to do this…

As usual, many thanks,

Hi avdminne,

One of the issues might be that you’re taking the exponent of the signal/noise. If you don’t want the numbers to be negative you can just add a <lower=0> to the parameter declaration. Additionally, because you’re doing the subtraction of the price variables on the log scale, this is a division of the prices: \ln(x) - \ln(y) = x/y. So make sure that’s consistent with your aim

Also, from my (limited) experience with time series models in Bayes, one of the side-effects of a non-stationary time series is that the estimates ‘explode’ and get extremely large. I’m not familiar enough with these types of models to say whether its the case here or how to debug, but @Charles_Driver is the creator of the excellent ctsem package and he might have a tip or two for you here

Not sure off hand sorry, but multiple exponentiated random walks for one observed process does kind of give me the feeling that things might need to be tightened down… simple thing to try would be to use log1p(exp(x)) instead of exp(x)

1 Like

Dear Andrew,

It was my understanding that it is also ill-advised to truncate variables in the transformed parameter block? (Because that is where this parameter resides.) That it impacts mixing / effective sample sizes? Otherwise, I would obviously love to do this.

Also about the ‘explosion’; I do think this might occur more often if you model the trend non-stationary, like taking up a (time-varying or not) drift. I don’t think I have ever had this issue with ``simple’’ random walks.

Dear Charles,

Ah great. Never knew this function existed, but it might work indeed.
I will let you know if this (or the previous by @andrjohns) works. It’s a bit annoying, because most of the times the model does converge, so I get a lot of ``false positive’’ or so to speak. But I’ll run a couple of markets.

Edit: I just realized that even with this transformation the values can still be negative… So not sure if this works without truncating the transformed parameter after all…

While truncating parameters is discouraged, the <lower=0> isn’t a truncation, it just means that it samples a transformation of the parameters that does not allow the sampled values to be smaller than zero. See this section of the manual for more info:

Truncations are implemented using the T[,] syntax, like so:

y ~ normal(mu, sigma) T[0,];

But as you know, are discouraged

1 Like

Wow, I recognize some BUGS/JAGS code there (with the T[,] function).
I never completely realized this difference, even though I have used Stan for a few years now. This solves my issue, so many thanks!

log1p(exp(x)) cannot be negative.

Oops, you are right, I misread the function.
Its ln(1 + exp(x) ) in my case, which indeed cannot be less than 0.
My apologies, will try this one out now as well, as I keep having issues at times.

Hi Andrew,

one last quick question. Running the model with constraints works, but sometimes it causes;

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: validate transformed params: sigEta[2] is -2.46424, but must be greater than or equal to 0  (in 'model3c1813a71fc_416990c9f9e0a485e6932ff7f254ca8d' at line 20)

If the iterations are long enough it will typically “catch on”, but sometimes it doesn’t. Is there a way around it? Should I hardwire positive initial values?

That’s because while the output of the transformation (sigEta[t-1] + innovEta[t-1]*sigEtaH) is constrained to be positive, the parameters that make up the transformation are not constrained to values that will result in a positive number.

Given that sigEta is constrained to positivity, you need to constrain the product of (innovEta[t-1]*sigEtaH) to be positive, which means that you would either need to constrain both to be positive or both to be negative