Strange STAN error

[Please include Stan program and accompanying data if possible]
Hi everyone,
I am trying to run the following model in STAN, but I keep getting the following error:

SAMPLING FOR MODEL ‘0a1537c56d8525befcb90d37601c1368’ NOW (CHAIN 1).
[1] "Error in sampler$call_sampler(args_list[[i]]) : std::bad_alloc"
error occurred during calling the sampler; sampling not done

Please help.
Thank you.

Below is my code:

Model

library(rstan)

ModelNew <- "
data {
int<lower=1> N; // sample size
int<lower=1> n1;// n1 features in dat1
int<lower=1> n2;// n1 features in dat1
int<lower=1> d; // lower dimens.
int X[N,n1]; // dat1
int Y[N,n2]; // dat2
real<lower=0> scale_global ; // scale for the half -t prior for tau // ( tau0 = scale_global * sigma )
real<lower=1> nu_global; // degrees of freedom for the half-t prior for tau
real<lower=1> nu_local; // degrees of freedom for the half-t priors for lambdas // ( nu_local = 1 corresponds to the horseshoe )
}

parameters {
real<lower=0> sig_te; // sd errors
real<lower=0> sig_lb; // sd errors
matrix[d,N] Z;
matrix[N,n1] teta;
matrix[N,n2] lambd;
cholesky_factor_cov[d,n1] A;
cholesky_factor_cov[d,n2] B;

row_vector[n1] mux;
row_vector[n2] muy;

real<lower=0> taua[d]; // global shrinkage parameter
matrix<lower=0>[d,n1] lamba ; // local shrinkage parameters

real<lower=0> taub[d]; // global shrinkage parameter
matrix<lower=0>[d,n2] lambb; // local shrinkage parameters
}

// transformed param. parts
transformed parameters {
matrix[N,n1] mu_te;
matrix[N,n2] mu_lb;

mu_te = Z’*A;
mu_lb = Z’*B;

}

// Model Part
model{

// half-t prior for tau
taua ~ cauchy(0, scale_global);
taub ~ cauchy(0, scale_global);

sig_lb ~ cauchy(0, 2.5); //
sig_te ~ cauchy(0, 2.5); //

mux ~ normal(0, 10);
muy ~ normal(0, 10);
//lamba ~ cauchy(0, 1);
//lambb ~ cauchy(0, 1);

for(i in 1:d) {
Z[i] ~ normal(0, 1);
lamba[i] ~ cauchy(0, 1);
lambb[i] ~ cauchy(0, 1);
A[i] ~ normal(0, lamba[i] * taua[i]);
B[i] ~ normal(0, lambb[i] * taua[i]);
}

for(i in 1:N)
{
teta[i] ~ normal(mu_te[i] + mux, sig_te);
X[i] ~ poisson_log(teta[i]);

lambd[i] ~ normal(mu_lb[i] + muy, sig_lb);
Y[i] ~ poisson_log(lambd[i]);
}
}
"

compile the model

CompModel <- stan_model(model_code=ModelNew)

#Simulate data

N =n= 30 # sample size
p1 = n1= 10 # Number of features in data set 1
p2 = n2= 10 #Number of features in data set 2
d = 2 # lower dimension - projection dimension

We adopt a factorization of the joint covariance matrix

Only 2 component have high

A = matrix(0,ncol=d,nrow=p1)
A[2:3,] = rnorm(n=4,mean=4)

Only 2 component have high

B = matrix(0,ncol=d,nrow=p2)
B[c(2,4),] = rnorm(n=4,mean=4)

sigte=1
siglb=1
X_fac=rep(1,p1)
Y_fac=rep(1,p2)

Simulate count data

XY_Simulate <- function(n, d, A, B,sige,sige){

Z = matrix(rnorm(nd),nrow=d)
tet = 2 + A%
%Z + matrix(rnorm(n=p1n,sd=sige),ncol=n)
lamb = 2+ B%
%Z + matrix(rnorm(n=p2*n,sd=sige),ncol=n)

X=apply(exp(tet),c(1,2),rpois, n=1) #rpois(n=1,lambda = 10*exp(tet))
Y=apply(exp(lamb),c(1,2),rpois, n=1)

return(list(“X”=X, “Y”=Y))
}

fit the model to the data

set.seed(12134)
out <- XY_Simulate(n, d, A, B,sige,siglb)
X = out$X
Y = out$Y
cat("\n Make the data sets! \n")
bcca_dat<- list(“X”=t(X),“Y”=t(Y),“n1”=p1,“n2”=p2,“N”=N,“d”=d,“scale_global”=scale_global,“nu_global”=1,“nu_local”=1) ### correct_poi_scale4.stan 29 bet~ inv_gamma(2,1) – shrinking

rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
fit = sampling(CompModel, data=bcca_dat, pars=c(“A”,“B”,“sig_te”,“sig_lb”),chains=1,iter=1000,control = list(adapt_delta = 0.99,max_treedepth = 15))

Googling around std::bad_alloc apparently is a C++ error thrown when things are out of memory.

Your model doesn’t look too big, but could you try running with iter = 1? rstan saves all the samples in memory, so you gotta have enough memory to store all those samples.

Autodiff can also require a big hunk of memory too. If you change iter to 1 and still get bad_allocs, then maybe there is something in the way the autodiff is working that takes a bunch of memory?

You can enclode code and other things in to pairs of triple tick marks “```” to make the formatting nice.

I tried with one same and I am still getting the same error. On my mac I get the following error:

SAMPLING FOR MODEL ‘9e6159bdfba8458840b146ad44f309ae’ NOW (CHAIN 1).
[1] "Error in sampler$call_sampler(args_list[[i]]) : std::bad_alloc"
error occurred during calling the sampler; sampling not done

On my windows machine I get:
SAMPLING FOR MODEL ‘475c94a8b5bb9a4e4d60a7d5ec19320e’ NOW (CHAIN 1).
[1] "Error in sampler$call_sampler(args_list[[i]]) : “
[2] " vector::_M_range_check: __n (which is 662) >= this->size() (which is 654)”
[1] “error occurred during calling the sampler; sampling not done”

Had a look at this.

I think the issue is how the prior is specified on A and B. The current code looks something like this:

parameters {
  cholesky_factor_cov[d,n1] A;
  cholesky_factor_cov[d,n2] B;
  ...
}

model {
  for(i in 1:d) {
    A[i] ~ normal(0, lamba[i] * taua[i]);
    B[i] ~ normal(0, lambb[i] * taua[i]);
  }
  ...
}

I don’t think those priors make sense on A and B. Not sure what it means, but if you change the declaration of A and B to be:

parameters {
  matrix[d,n1] A;
  matrix[d,n2] B;
  ...
}

then the model will run. At the very least it doesn’t seem like just a bad allocation happening.

I’m out of my element here. @bgoodri, does any of this mean anything to you?

vector::_M_range_check: __n (which is 662) >= this->size() (which is 654)

The original R code doesn’t work, but this is telling you that you are trying to access the 662-nd element of something that is size 654.

trying to access the 662-nd element of something that is size 654.

Oh, reading the whole thing. Interesting strategy

2 Likes

I am sorry I sent a code that does not work. This one should run. But I still get the same issue on both my mac and windows machine:
SAMPLING FOR MODEL ‘c16254d5001402d87d4921f5cd8cea37’ NOW (CHAIN 1).
[1] “Error in sampler$call_sampler(args_list[[i]]) : std::bad_alloc”
[1] “error occurred during calling the sampler; sampling not done”

@bbbales2 - The idea is to specify a shrinkage prior on the element of A and B - In essence, I am trying a horseshoes type prior on the entries of A and B. I would like to estimate A and B sparse.
You see below that the second approach works - at runs. But the first approach does not. I get the error message above.

Thank you for your time and effort.
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%

First Approach

#%%%%%%%%%%%%%%%%%%%%%%%%%%%%
library(rstan)

# Simulate count data

XY_Simulate <- function(n, d, n1,n2, A, B, mux, muy,sigte,siglb){
  

  
  Z = matrix(rnorm(n*d),nrow=d)
  tet = mux + A%*%Z + matrix(rnorm(n=n1*n,sd=sigte),ncol=n)
  lamb = muy + B%*%Z + matrix(rnorm(n=n2*n,sd=siglb),ncol=n)
  
  X=apply(exp(tet),c(1,2),rpois, n=1) #rpois(n=1,lambda = 10*exp(tet))
  Y=apply(exp(lamb),c(1,2),rpois, n=1)
  
  return(list("X"=X, "Y"=Y))
}

#Stan code
Model1 <- "
data {
int<lower=1> N; // sample size
int<lower=1> n1;// n1 features in dat1
int<lower=1> n2;// n1 features in dat1
int<lower=1> d; // lower dimens. 
int X[N,n1];    // dat1
int Y[N,n2];    // dat2
real<lower=0> scale_global ; // scale for the half -t prior for tau // ( tau0 = scale_global * sigma )
real<lower=1> nu_global;    // degrees of freedom for the half-t prior for tau
real<lower=1> nu_local;    // degrees of freedom for the half-t priors for lambdas // ( nu_local = 1 corresponds to the horseshoe )
}

parameters {
real<lower=0> sig_te; // sd errors
real<lower=0> sig_lb; // sd errors
matrix[d,N] Z;
matrix[N,n1] teta;
matrix[N,n2] lambd;
cholesky_factor_cov[d,n1] A; 
cholesky_factor_cov[d,n2] B; 

row_vector[n1] mux;
row_vector[n2] muy;

real<lower=0> taua[d];              // global shrinkage parameter
matrix<lower=0>[d,n1] lamba ; // local shrinkage parameters

real<lower=0> taub[d];             // global shrinkage parameter
matrix<lower=0>[d,n2] lambb; // local shrinkage parameters
}

// transformed param. parts
transformed parameters {
matrix[N,n1] mu_te;
matrix[N,n2] mu_lb;

mu_te = Z'*A;
mu_lb = Z'*B;

}

// Model Part
model{

// half-t prior for tau
taua ~ cauchy(0, scale_global);
taub ~ cauchy(0, scale_global);

sig_lb ~ cauchy(0, 2.5); //
sig_te ~ cauchy(0, 2.5); //

mux ~ normal(0, 5);
muy ~ normal(0, 5);



for(i in 1:d) { 
Z[i] ~ normal(0, 1);
lamba[i] ~ cauchy(0, 1);
lambb[i] ~ cauchy(0, 1);
A[i] ~ normal(0, lamba[i] * taua[i]);
B[i] ~ normal(0, lambb[i] * taub[i]);
}


for(i in 1:N)
{
 teta[i] ~ normal(mu_te[i] + mux, sig_te);
    X[i] ~ poisson_log(teta[i]);

 lambd[i] ~ normal(mu_lb[i] + muy, sig_lb);
     Y[i] ~ poisson_log(lambd[i]);
}
}
"

#% compile the stan model 
CompModel1 <- stan_model(model_code=Model1)

#Toy data
N = 30      # sample size
n1= 10    # Number of features in data set 1
n2= 10    # Number of features in data set 2  
d  = 2         # lower dimension - projection dimension

A = matrix(0,ncol=d,nrow=n1)
A[2:3,] = rnorm(n=4,mean=4)

B = matrix(0,ncol=d,nrow=n2)
B[c(2,4),] = rnorm(n=4,mean=4) 

sigte=.1
siglb=.1
mux = 1.0
muy = 1.0

#Simulate data 
set.seed(12134)
out <- XY_Simulate(N, d,n1 ,n2, A, B, mux, muy, sigte,siglb)
X = out$X
Y = out$Y


Data<- list("X"=t(X),"Y"=t(Y),"n1"=n1,"n2"=n2,"N"=N,"d"=d,"scale_global"=1,"nu_global"=1,"nu_local"=1) 

rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
fit = sampling(CompModel1, data=Data, pars=c("A","B","sig_te","sig_lb"),chains=1,iter=100,control = list(adapt_delta = 0.99,max_treedepth = 15))

#%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Second Approach

#%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Model2 <- "

data {
int<lower=1> N; // sample size
int<lower=1> n1;
int<lower=1> n2;
int<lower=1> d;
int X[n1,N];
int Y[n2,N];
real<lower=0> scale_global ; // scale for the half -t prior for tau // ( tau0 = scale_global * sigma )
real<lower=1> nu_global; // degrees of freedom for the half-t prior for tau
real<lower=1> nu_local; // degrees of freedom for the half-t priors for lambdas // ( nu_local = 1 corresponds to the horseshoe )
}

parameters {

real<lower=0> sig_te; // sd errors
real<lower=0> sig_lb; // sd errors
matrix[d,N] Z;

matrix[n1,N] teta;
matrix[n2,N] lambd;

cholesky_factor_cov[n1,d] A; 
cholesky_factor_cov[n2,d] B; 

real mux[n1];
real muy[n2];

real<lower=0> taua[n1]; // global shrinkage parameter
matrix<lower=0>[n1,d] lamba ; // local shrinkage parameters

real<lower=0> taub[n2]; // global shrinkage parameter
matrix<lower=0>[n2,d] lambb ; // local shrinkage parameters
}

// transformed param. parts
transformed parameters {
matrix[n1,N] mu_te;
matrix[n2,N] mu_lb;

mu_te = A*Z;
mu_lb = B*Z;

}

// Model Part
model{

for(m in 1:d) 
Z[m] ~ normal(0, 1);

// half -t prior for tau
taua ~ cauchy(0, scale_global );
taub ~ cauchy(0, scale_global );

mux ~ normal(0, 10);
muy ~ normal(0, 10);

sig_lb ~ cauchy(0,2.5); //sig_lb ~ scaled_inv_chi_square(10, .05);

sig_te ~ cauchy(0,2.5); //sig_te ~ scaled_inv_chi_square(10, .05);


for(i in 1:n1)
{
 lamba[i] ~ cauchy(0,1);
  A[i] ~ normal(0, lamba[i]* taua[i]);
 teta[i] ~ normal(mu_te[i] + mux[i], sig_te);
 X[i] ~ poisson_log(teta[i]);
}

for(i in 1:n2)
{
  B[i] ~ normal(0, lambb[i] * taub[i]);
 lambd[i] ~ normal(mu_lb[i] + muy[i], sig_lb);
 Y[i] ~ poisson_log(lambd[i]);
}
}
"

# compile the model 
CompModel2 <- stan_model(model_code=Model2)

#Toy data
N = 30      # sample size
n1= 10    # Number of features in data set 1
n2= 10    # Number of features in data set 2  
d  = 2         # lower dimension - projection dimension

A = matrix(0,ncol=d,nrow=n1)
A[2:3,] = rnorm(n=4,mean=4)

B = matrix(0,ncol=d,nrow=n2)
B[c(2,4),] = rnorm(n=4,mean=4) 

sigte=.1
siglb=.1

mux = 1.0
muy = 1.0

#Simulate data 
set.seed(12134)
out <- XY_Simulate(N, d,n1 ,n2, A, B,mux, muy, sigte,siglb)
X = out$X
Y = out$Y
scale_global = 1
Data <- list("X"=X,"Y"=Y,"n1"=n1,"n2"=n2,"N"=N,"d"=d,"scale_global"=scale_global,"nu_global"=1,"nu_local"=1) 

rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
fit = sampling(CompModel2, data=Data, pars=c("A","B","sig_te","sig_lb"),chains=2,iter=100,control = list(adapt_delta = 0.99,max_treedepth = 15))

I am sorry I sent a code that does not work.

It’s fine, happens plenty.

My question @bgoodrich was more about what does cholesky_factor_cov[n1,d] A; mean, and what priors in Stan might work with it.

The original way the priors are set up doesn’t seem to be compatible with whatever assumptions go into cholesky_factor_cov[N1, N2], N2 != N1 which seems to be causing some segfault sorta errors.

I had a look in the manual and didn’t offhand see examples of what cholesky_factor_cov[N1, N2] is doing. Presumably it’s some sort of approximation of an actual Cholesky factorization of something? But I don’t know. Seems like a strange thing to me. Always good to be clear on exactly how approximations are working anyway.

Maybe make a new thread and ask about examples of using cholesky_factor_cov[N1, N2]. I’m out of my element here :D.