Exception: std::bad_alloc during execution

cesare · November 27, 2018, 12:16pm

I have a model in which the covariance matrix is defined by 2 parameters that I want to infer. To do so,
I wrote the following Stan code (which I’m running with PyStan):

data {
int<lower=1> Mn;
int<lower=1> Mm;
int<lower=0> N;
row_vector[Mm] y[N];
matrix[Mm,3*Mn] Umodes;
vector<lower=0>[Mm] singval;
matrix<lower=0>[Mn,Mn] dij;
}
transformed data {
matrix[Mm,Mm] diagSigma;
matrix[Mm,Mn] Umodes1;
matrix[Mm,Mn] Umodes2;
matrix[Mm,Mn] Umodes3;
diagSigma = diag_matrix(singval);
Umodes1[1:Mm,1:Mn] = Umodes[1:Mm,1:Mn];
Umodes2[1:Mm,1:Mn] = Umodes[1:Mm,(Mn+1):(2*Mn)];
Umodes3[1:Mm,1:Mn] = Umodes[1:Mm,(2*Mn+1):(3*Mn)];
}
parameters {
row_vector[Mm] lambdaHat;
real var1;
real var2;
}
transformed parameters {
row_vector[Mm] lambdaCoeff=lambdaHat*diagSigma;
cov_matrix[Mm] sigmareduced; //reduced covariance
{
real l=exp(var1);
real nusq=exp(var2);
matrix[Mn,Mn] pcordsigma=exp(-(1.0/l)*dij); // Full covariance for a single coordinate
matrix[Mm,Mn] V1=Umodes1*pcordsigma;
matrix[Mm,Mn] V2=Umodes2*pcordsigma;
matrix[Mm,Mn] V3=Umodes3*pcordsigma;
sigmareduced=nusq*(V1*(Umodes1')+V2*(Umodes2')+V3*(Umodes3'));
}
}
model {
lambdaHat ~ std_normal();
y ~ multi_normal(lambdaCoeff, sigmareduced);
}
generated quantities{
real nu=exp(var2/2);
real l=exp(var1);
}

When tested with a dataset of N=14 samples, Mn=655 and Mm=55, everything worked fine. When I tried a dataset that still had N=14 samples and Mm=55, but with Mn=13569, I got the following error:

Exception: std::bad_alloc (in 'shapeInferenceReducedUnboundedLog.stan' at line 31) [origin: bad_alloc]

I guess the problem arises when Stan tries to allocate memory for the matrix pcordsigma. However, with Mn=13569 I expect that matrix to need at most 1.4 GB, and the other data no more than 3 GB,
for a total of less than 5 GB. I would rule out my desktop running out of memory, since it has 31.4 GB of physical memory (and swap space of the same size).
Are there workarounds to run the code on my desktop machine with that problem size?

Thanks,

Cesare

jjramsey · November 27, 2018, 1:31pm

That’s not a safe assumption, since there can be memory overhead from the data structures used to store the data, memory used to store temporary copies or partial results, etc. Also, even if not all your memory is used up, there may not be a single contiguous chunk of 1.4 GB of free memory available.

~~Offhand, I’d suggest looking for approximations to your algorithm that are less memory hungry. Chapter 8 of Gaussian Processes for Machine Learning may be a good start.~~

ETA: Another place to start may be the article “Understanding Probabilistic Sparse Gaussian Process Approximations”. The negative of equation (5) in that article can be used to define a custom log-likelihood that you can use in place of multi_normal_lpdf. However, you’ll need the Woodbury identities (i.e., eqs. (A.9) and (A.10) in Gaussian Processes for Machine Learning) in order to evaluate the second and third terms of equation (5) without building an Mn × Mn matrix.
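
For reference, here are the two identities as I recall them from Appendix A.3 of Gaussian Processes for Machine Learning. The symbols follow the book, not the Stan code above: Z is n × n, W is m × m, and U and V are n × m.

$$
(Z + U W V^\top)^{-1} = Z^{-1} - Z^{-1} U \left(W^{-1} + V^\top Z^{-1} U\right)^{-1} V^\top Z^{-1}
$$

$$
\left|Z + U W V^\top\right| = |Z| \, |W| \, \left|W^{-1} + V^\top Z^{-1} U\right|
$$

When m is much smaller than n, the right-hand sides only need an m × m inverse or determinant plus operations on Z, which are cheap if Z is diagonal.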

ETA: After looking at your Stan code more closely, it looks like you already are using some kind of reduced covariance matrix, but you are still creating a very large temporary matrix pcordsigma in the process. See if you can rework your algorithm to avoid explicitly creating that matrix.
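
To show what avoiding the explicit matrix could look like, here is a minimal plain-Python sketch of the algebra only. It accumulates U K Uᵀ one row of K at a time, where K = exp(-dij / l), so the full K is never stored. The function names and the tiny demo data are made up for illustration. I have not checked whether the equivalent loop in Stan actually lowers memory use, since autodiff may still record every intermediate.

```python
import math

def reduced_cov_direct(U, D, length_scale):
    """Reference version: build the full K = exp(-D / length_scale), then U K U^T."""
    n, m = len(D), len(U)
    K = [[math.exp(-D[i][j] / length_scale) for j in range(n)] for i in range(n)]
    UK = [[sum(U[a][i] * K[i][j] for i in range(n)) for j in range(n)] for a in range(m)]
    return [[sum(UK[a][j] * U[b][j] for j in range(n)) for b in range(m)] for a in range(m)]

def reduced_cov_streamed(U, D, length_scale):
    """Same result, but only one row of K exists at any time."""
    n, m = len(D), len(U)
    S = [[0.0] * m for _ in range(m)]
    for i in range(n):
        # Row i of K, computed on the fly and discarded after this iteration.
        k_row = [math.exp(-D[i][j] / length_scale) for j in range(n)]
        # w[b] = sum_j K[i][j] * U[b][j]
        w = [sum(k_row[j] * U[b][j] for j in range(n)) for b in range(m)]
        for a in range(m):
            for b in range(m):
                S[a][b] += U[a][i] * w[b]
    return S

# Hypothetical toy data: 2 modes, 3 points on a line at positions 0, 1, 3.
U_demo = [[1.0, 0.5, -0.25],
          [0.0, 2.0, 1.0]]
D_demo = [[0.0, 1.0, 3.0],
          [1.0, 0.0, 2.0],
          [3.0, 2.0, 0.0]]

direct = reduced_cov_direct(U_demo, D_demo, 1.5)
streamed = reduced_cov_streamed(U_demo, D_demo, 1.5)
```

Note that dij itself is still an Mn × Mn matrix of data, so this only removes pcordsigma and the temporaries derived from it. In the original model the same accumulation would be repeated for Umodes1, Umodes2 and Umodes3 and the sum scaled by nusq.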

sakrejda · November 27, 2018, 7:10pm

Unfortunately, once you account for the stack used for autodiff and the tree built when creating the HMC trajectory within each iteration, you end up using several times more memory than you expected.
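
As a rough back-of-the-envelope illustration, the sketch below compares the plain-double size of an Mn × Mn matrix with a guessed autodiff cost. The per-node byte count and the number of intermediates are assumptions picked for illustration, not measured values from Stan, so treat the second figure as an order-of-magnitude guess at best.

```python
def dense_bytes(rows, cols, bytes_per_element=8):
    """Bytes for a dense rows x cols matrix with the given element size."""
    return rows * cols * bytes_per_element

Mn = 13569  # problem size from the original post

# Plain doubles, as for data such as dij: 8 bytes per element.
plain_bytes = dense_bytes(Mn, Mn)

# ASSUMPTIONS (hypothetical, not measured):
#  - each autodiff element costs about 24 bytes (value, adjoint, bookkeeping)
#  - exp(-(1.0/l)*dij) leaves about 3 Mn x Mn intermediates on the
#    autodiff stack (scaling, negation, exp)
ASSUMED_BYTES_PER_NODE = 24
ASSUMED_INTERMEDIATES = 3
autodiff_bytes = ASSUMED_INTERMEDIATES * dense_bytes(Mn, Mn, ASSUMED_BYTES_PER_NODE)

print(f"plain doubles:         {plain_bytes / 1e9:.2f} GB")
print(f"assumed autodiff cost: {autodiff_bytes / 1e9:.2f} GB")
```

Under these assumptions the autodiff figure is nine times the plain-double figure for this one expression alone, before counting the V1, V2, V3 products or anything the sampler keeps per iteration.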
