AR1 Error: cannot allocate vector

I am trying to run an AR1 model and I get no errors from the code below until I run the last line of code, which is running the stan model. The error I get is “Error: cannot allocate vector of size 62 Kb.” I have tried to find solutions on this forum and it seems that RTools35 can be a problem, but I don’t seem to have it.

AR1 <- "
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real alpha;
  real rho;
  real<lower=0> sigma;
}
model {
  for (n in 2:N)
    y[n] ~ normal(alpha + rho * y[n-1], sigma);
}
generated quantities {
  int<lower=0, upper=1> mean_gt;
  int<lower=0, upper=1> sd_gt;
  int<lower=0, upper=1> max_gt;
  vector[N] log_lik;
  vector[N] y_rep;
  log_lik[1] = y[1];
  for (n in 2:N) {
    log_lik[n] = normal_lpdf(y[n] | alpha + rho * y[n-1], sigma);
  }
  y_rep[1] = y[1];
  for (n in 2:N) {
    y_rep[n] = normal_rng(alpha + rho * y[n-1], sigma);
  }
  mean_gt = mean(y_rep) > mean(y); // posterior predictive p-value for mean
  sd_gt   = sd(y_rep) > sd(y);     // posterior predictive p-value for sd
  max_gt  = max(y_rep) > max(y);   // posterior predictive p-value for max
}
"
y <- log(data$sale / data$xbar)
N <- nrow(data)
dat <- list(N = N, y = y)

# Running the model and examining the results
options(mc.cores = parallel::detectCores())
fit.1 <- stan(model_code = AR1, data = dat, iter = 10000, warmup = 2000, chains = 3)

When it’s an RTools error, the reported vector size tends to be impossibly large (like 6000 GB), so I think this might be an actual RAM error.

Do you get the same error if you drop the number of iterations (try 5000)? You generally don’t need as many iterations with Stan (HMC) as you would with Gibbs/MH, so you can safely drop the number of iterations down unless it’s a pretty complex likelihood.

Also, what size is N and how much RAM are you working with?


I dropped it down to 5000 iterations with 1000 warmup, and I still get the error: Error: cannot allocate vector of size 94 Kb
Maybe this is worth mentioning: when I run the 3 chains and they all finish, I see the lines below, and then it gets stuck there for a long time.
N = 31292 and I have 16 GB of RAM.

Chain 1: Elapsed Time: 37.645 seconds (Warm-up)
Chain 1: 146.849 seconds (Sampling)
Chain 1: 184.494 seconds (Total)
Chain 1:
Chain 3: Iteration: 4500 / 5000 [ 90%] (Sampling)
Chain 2: Iteration: 4500 / 5000 [ 90%] (Sampling)
Chain 3: Iteration: 5000 / 5000 [100%] (Sampling)
Chain 3:
Chain 3: Elapsed Time: 39.021 seconds (Warm-up)
Chain 3: 164.994 seconds (Sampling)
Chain 3: 204.015 seconds (Total)
Chain 3:
Chain 2: Iteration: 5000 / 5000 [100%] (Sampling)
Chain 2:
Chain 2: Elapsed Time: 39.254 seconds (Warm-up)
Chain 2: 169.388 seconds (Sampling)
Chain 2: 208.642 seconds (Total)
Chain 2:

fit.1 <- stan(model_code=AR1, data=dat, iter=5000, warmup=1000, chains=3)

Ah, that sounds like a slightly different issue. I believe this could be due to the generated quantities block in that case. Can you try commenting out or removing the generated quantities block and re-running?


Removing the generated quantities block got rid of that error, and the model runs much faster than before.


Yep, so that would indicate that it is a RAM issue. The generated quantities block in RStan is (usually) run after all sampling has completed, because it’s numerically much simpler than the other blocks (no gradients/autodiff required). With an N of 31292, R is trying to store 31292 * 2 (y_rep and log_lik) * 5000 iterations (in RStan, iter already includes the warmup, and warmup draws are saved by default) * 3 chains = 938,760,000 GQ values in RAM, which at 8 bytes per double is roughly 7 GB, so pretty heavy.
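As a rough sanity check, the footprint can be computed directly in R (this sketch assumes RStan’s defaults: iter counts warmup iterations, warmup draws are saved, and each stored value is an 8-byte double):

```r
# Rough RAM footprint of the stored generated quantities
N      <- 31292  # number of observations
n_gq   <- 2      # y_rep and log_lik, each of length N
iters  <- 5000   # in RStan, iter includes warmup; warmup draws are saved by default
chains <- 3

n_values <- N * n_gq * iters * chains  # 938,760,000 stored doubles
n_values * 8 / 1024^3                  # ~7 GiB, before R makes any copies
```

And that is just the raw storage; extracting or summarizing the fit typically copies these arrays, which can double the peak usage.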

You can try running with fewer chains (i.e., using 2 chains will only require 2/3 of the estimates to be stored), or you can switch to the cmdstanr interface, which writes the estimates directly to disk rather than trying to hold them all in RAM during estimation. However, this comes with the caveat that you might still run out of RAM when reading the estimates back into R.
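For reference, a minimal cmdstanr sketch (assuming the model string above has been saved to a file — the name ar1.stan here is just a placeholder):

```r
library(cmdstanr)

# cmdstanr compiles models from .stan files on disk
mod <- cmdstan_model("ar1.stan")

# Draws are streamed to CSV files on disk rather than held in RAM
fit <- mod$sample(
  data = dat,
  chains = 3,
  iter_warmup = 1000,
  iter_sampling = 4000
)

# Read back only the variables you need, not the full y_rep/log_lik arrays
draws <- fit$draws(variables = c("alpha", "rho", "sigma"))
```

Pulling back only the model parameters, as in the last line, is what keeps the read-back step from blowing up RAM the same way.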


So, I took your advice and lowered the iterations and chains, but I am still running into the same problem. The only way I have been able to get the code to run is by lowering the iterations to 1000 with a warmup of 500, and even then, when I run the code

S <- ggs(fit.1)
and try to get a traceplot, I get the error: Error in mcmc.list(x) : Arguments must be mcmc objects

I recommend you put a prior on rho: when it leaves the stationary region (|rho| < 1) your model can misbehave, maybe with some overflow. You can use a uniform(-1, 1) prior, or a transform of a beta prior.
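A minimal sketch of one way to do this: in Stan, declaring bounds on a parameter with no explicit prior statement implies a uniform prior over the bounded interval, so constraining rho gives the uniform(-1, 1) prior and keeps the sampler inside the stationary region.

```stan
parameters {
  real alpha;
  real<lower=-1, upper=1> rho;  // bounds imply a uniform(-1, 1) prior on rho
  real<lower=0> sigma;
}
model {
  for (n in 2:N)
    y[n] ~ normal(alpha + rho * y[n-1], sigma);
}
```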