Creating a vector of parameters with zero length

Hi all,

This is a toy example that illustrates my problem.

Using rstan, I would like to fit a simple linear regression model with (possibly) zero covariates. Here is the generated data:

N <- 21
X <- as.matrix(cbind(rep(x = 1, times = N), seq(from = 1, to = 10, length.out = N)))
a <- 1.5
b <- 1.5
y <- X %*% c(a, b) + rnorm(n = N)

# X_prime <- as.matrix(X[, 2]) # This is not being used
X_prime <- as.matrix(data.frame(row.names = 1:N)) # This is a matrix with 0 columns

data_stan <- list(N = N,
                  n_predictor = ncol(X_prime),
                  X = X_prime,
                  y = c(y))
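As a quick sanity check on the data side, the zero-column matrix can be inspected in plain R before it is passed to Stan. The sketch below is not part of the original post. It compares the `as.matrix(data.frame(...))` construction used above with the more direct `matrix(0, N, 0)`. The names `X_from_df` and `X_direct` are made up for illustration.

```r
N <- 21

# Construction used above: a data frame with N rows and no columns.
X_from_df <- as.matrix(data.frame(row.names = 1:N))

# More direct construction: an N x 0 numeric matrix.
X_direct <- matrix(0, nrow = N, ncol = 0)

# Both should report N rows and 0 columns.
print(dim(X_from_df))
print(dim(X_direct))

# n_predictor as it would be passed to Stan in the data list.
n_predictor <- ncol(X_direct)
print(n_predictor)
```

If both matrices report the same dimensions, the R-side construction is unlikely to be the cause of the error.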

This is the .stan model

data {
  int<lower=0> N;
  int<lower=0> n_predictor;
  matrix[N, n_predictor] X;
  vector[N] y;
}

parameters {
  real a;
  vector[n_predictor] b;
  real<lower=0> sigma;
}

model {
  sigma ~ normal(0, 1);
  y ~ normal(a + X * b, sigma);
}

But when trying to fit the model with no covariates

fit <- stan(file = "model.stan", data = data_stan) 

I got the following error

SAMPLING FOR MODEL 'model' NOW (CHAIN 1).
Chain 1: Unrecoverable error evaluating the log probability at the initial value.
Chain 1: Exception: multiply: m1 must have a positive size, but is 0; dimension size expression = cols()  (in 'model455b31f81ef4_model' at line 16)

[1] "Error in sampler$call_sampler(args_list[[i]]) : "                                                                                            
[2] "  Exception: multiply: m1 must have a positive size, but is 0; dimension size expression = cols()  (in 'model455b31f81ef4_model' at line 16)"
error occurred during calling the sampler; sampling not done

The error message seems to indicate that I should not create a parameter vector with zero length, but this post says otherwise.

Is it possible to have a design matrix with (possibly) zero columns?

[edit: this actually works, but I implied it was a bug when first writing]

Hi, @avramaral and welcome to the Stan forums. Here’s an example of how you can specify zero sizes. Stan’s internal math can handle the zero sizes, as you can see with this.

parameters {
  real<lower=0, upper=1> alpha;
} transformed parameters {
  vector[0] a;
  matrix[2, 0] B;
  vector[2] c = B * a;
  print("c = ", c);
}

and the output is what you’d expect,

Chain 1 c = [0,0] 

Of course, the entries are always zero here, as they’re a sum of zero terms. So this looks like we get the boundary condition right.
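The same boundary condition can be checked in plain R, outside Stan. This is only an illustrative sketch, with variable names that loosely mirror the Stan snippet above: an empty sum is zero, so a 2 x 0 matrix times a 0 x 1 matrix should yield a 2 x 1 matrix of zeros.

```r
# 2 x 0 matrix and 0 x 1 column, analogous to B and a in the Stan snippet.
B <- matrix(0, nrow = 2, ncol = 0)
a <- matrix(0, nrow = 0, ncol = 1)

# Each entry of the product is a sum over zero terms.
c_vec <- B %*% a
print(dim(c_vec))
print(c_vec)

# The empty sum itself.
print(sum(numeric(0)))
```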

So I suspect the problem is with specifying a matrix in RStan. Here’s how you can do it in cmdstanr and in RStan.

cmdstanr does the right thing:

data {
  matrix[2, 0] x;
}
parameters {
  real<lower=0, upper=1> alpha;
} transformed parameters {
  vector[0] a;
  vector[2] c = x * a;
  print("c = ", c);
}

library(cmdstanr)
model <- cmdstan_model('foo.stan')
fit <- model$sample(data = list(x = matrix(0, 2, 0)))

and so does RStan

fit <- stan('foo.stan', data = list(x = matrix(0, 2, 0)))

I don't know what as.matrix does with an empty data frame, so the problem may be there.

@Bob_Carpenter, thank you for your response.

Unfortunately, even your example threw a similar error when I tried to execute it, as you can see below

// foo.stan
data {
  matrix[2, 0] x;
}
parameters {
  real<lower=0, upper=1> alpha;
} transformed parameters {
  vector[0] a;
  vector[2] c = x * a;
  print("c = ", c);
}

library(rstan)
fit <- stan('foo.stan', data = list(x = matrix(0, 2, 0)))

SAMPLING FOR MODEL 'foo' NOW (CHAIN 1).
Chain 1: Unrecoverable error evaluating the log probability at the initial value.
Chain 1: Exception: multiply: m1 must have a positive size, but is 0; dimension size expression = cols()  (in 'model55911c102d1e_foo' at line 8)

[1] "Error in sampler$call_sampler(args_list[[i]]) : "                                                                                         
[2] "  Exception: multiply: m1 must have a positive size, but is 0; dimension size expression = cols()  (in 'model55911c102d1e_foo' at line 8)"
error occurred during calling the sampler; sampling not done

I am not sure what the problem could be here, but this is my sessionInfo():

R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS 12.5

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
[1] rstan_2.21.3         ggplot2_3.3.6        StanHeaders_2.21.0-7

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.8.3       pillar_1.7.0       compiler_4.1.0    
 [4] prettyunits_1.1.1  tools_4.1.0        pkgbuild_1.2.0    
 [7] lifecycle_1.0.1    tibble_3.1.7       gtable_0.3.0      
[10] pkgconfig_2.0.3    rlang_1.0.2        DBI_1.1.2         
[13] cli_3.3.0          parallel_4.1.0     loo_2.4.1         
[16] gridExtra_2.3      withr_2.5.0        dplyr_1.0.9       
[19] generics_0.1.2     vctrs_0.4.1        stats4_4.1.0      
[22] grid_4.1.0         tidyselect_1.1.2   glue_1.6.2        
[25] inline_0.3.19      R6_2.5.1           processx_3.5.3    
[28] fansi_1.0.3        purrr_0.3.4        callr_3.7.0       
[31] magrittr_2.0.3     codetools_0.2-18   matrixStats_0.60.0
[34] scales_1.2.0       ps_1.6.0           ellipsis_0.3.2    
[37] assertthat_0.2.1   colorspace_2.0-3   utf8_1.2.2        
[40] RcppParallel_5.1.5 munsell_0.5.0      crayon_1.5.1 

Do you have any idea where the problem might be, or what I should do to test it further?

I’m using RStan 2.26 (see the RStan Getting Started guide for how to install it). It may be a problem with 2.21.

Generally, I’d recommend switching to cmdstanr if you only need to fit models and don’t need to access the log density or gradient functions directly. It’s more up to date with Stan and easier to install.

Updating rstan (and StanHeaders) to version 2.26.1 solved the problem. I am sorry; I should have checked that first. Thank you for your help.
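For anyone hitting the same error, a version check along these lines may help rule out an outdated install. This is a hedged sketch: `needs_update` is a hypothetical helper, and the 2.26.1 threshold is simply the version that worked in this thread, not a documented minimum.

```r
# Hypothetical helper: TRUE if `installed` is older than `minimum`.
# The default threshold is the version that resolved the issue in this thread.
needs_update <- function(installed, minimum = "2.26.1") {
  package_version(installed) < package_version(minimum)
}

# Versions reported in the sessionInfo() output above.
print(needs_update("2.21.3"))    # rstan
print(needs_update("2.21.0-7"))  # StanHeaders

# Check the locally installed rstan, if any.
if (requireNamespace("rstan", quietly = TRUE)) {
  print(needs_update(as.character(utils::packageVersion("rstan"))))
}
```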