Generated Quantities for 0 dim vectors/matrices

ph-rast · October 21, 2019, 11:00pm

Dear Stan users, I’m running into a problem with gqs() when generating from a Stan model in which some parameters may or may not be present.

I fit a model in which a False/True flag controls the size of a parameter: matrix[flag ? nt : 0, flag ? nt : 0 ] phi;. If flag=FALSE, the parameter phi will not be estimated. Correctly, in the fitted Stan model it shows up as ..$ phi : num[0 , 0 ].
In the as.matrix(stan_fit), however, phi will not show up.
As a result, gqs() will complain: Exception: Variable phi missing ...

Any idea on how to get around this problem?

Cheers,
Philippe

ph-rast · October 22, 2019, 5:43pm

Here’s a minimal example that illustrates my problem (I'm on the rstan development branch):

library(rstan)
m <- stan_model(model_code ='
data {
  int<lower = 0, upper = 1> flag;
}
parameters { 
  real y; 
  vector[flag ? 1 : 0] phi;
}
model {
  if ( flag == 1 ) {
    phi ~ normal(10, 0.1);
    y ~ normal(phi, 1);
  } else if (flag == 0) {
    y ~ normal(0, 1);
  }
}')
f <- sampling(m, data =  list( flag =  1 ))

## Inference for Stan model: c8cae5506057d06542f71d41e99c8846.
## 4 chains, each with iter=2000; warmup=1000; thin=1; 
## post-warmup draws per chain=1000, total post-warmup draws=4000.
## 
##         mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
## y      10.01    0.02 0.99  8.12  9.34  9.99 10.65 11.99  3757    1
## phi[1] 10.00    0.00 0.10  9.81  9.94 10.00 10.07 10.20  3024    1
## lp__   -0.96    0.02 0.96 -3.54 -1.33 -0.65 -0.27 -0.02  1578    1
## 
## Samples were drawn using NUTS(diag_e) at Tue Oct 22 10:11:13 2019.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

And with flag = 0

f <- sampling(m, data =  list( flag =  0 ))

## Inference for Stan model: c8cae5506057d06542f71d41e99c8846.
## 4 chains, each with iter=2000; warmup=1000; thin=1; 
## post-warmup draws per chain=1000, total post-warmup draws=4000.
## 
##       mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
## y     0.02    0.03 1.03 -1.99 -0.68  0.02  0.72  1.98  1175    1
## lp__ -0.53    0.02 0.75 -2.58 -0.68 -0.24 -0.06  0.00  1445    1
## 
## Samples were drawn using NUTS(diag_e) at Tue Oct 22 10:11:18 2019.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

So far, so good: when flag = 0, phi is a zero-length vector and is not included in the printed output, nor in as.matrix(f).
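A quick way to confirm which parameters actually reach the draws matrix is to inspect its column names. The sketch below is only illustrative: has_param() is a hypothetical helper, not part of rstan, and draws0 is a hand-made stand-in for as.matrix(f) from the flag = 0 fit, assumed to have the same columns as the printed summary above (y and lp__).

```r
# Hypothetical helper (not an rstan function): TRUE if the draws matrix has
# a column named `param` or an indexed column such as `param[1]`.
has_param <- function(draws, param) {
  any(grepl(paste0("^", param, "(\\[|$)"), colnames(draws)))
}

# Stand-in for as.matrix(f) with flag = 0; in a real session use
# draws0 <- as.matrix(f) instead of this placeholder matrix.
draws0 <- matrix(0, nrow = 2, ncol = 2,
                 dimnames = list(NULL, c("y", "lp__")))

has_param(draws0, "phi")  # no phi column when phi has length zero
has_param(draws0, "y")
```

With a real fit, colnames(as.matrix(f)) gives the same information directly.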

GQS part:

First, the model is compiled with vector[flag ? 1 : 0] phi; commented out, and flag is always 0:

mc <-'
data {
  int<lower = 0, upper = 1> flag;
}
parameters {
  real y;
  //  vector[flag ? 1 : 0] phi;
}
generated quantities {
    real y_rep;
    y_rep = normal_rng(y, 1);
}'
m2 <- stan_model(model_code = mc)
f2 <- rstan::gqs(m2, draws = as.matrix(f), data =  list( flag =  0))

f2

## Inference for Stan model: 17e2d78e13de006222877a7663a21f94.
## 1 chains, each with iter=4000; warmup=0; thin=1; 
## post-warmup draws per chain=4000, total post-warmup draws=4000.
## 
##       mean se_mean   sd  2.5%   25%  50%  75% 97.5% n_eff Rhat
## y_rep 0.02    0.04 1.46 -2.83 -0.96 0.02 1.03  2.87  1641    1
## 
## Samples were drawn using  at Tue Oct 22 10:11:25 2019.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

This works because phi was commented out by hand.

Now vector[flag ? 1 : 0] phi; is left in the model, and flag is still 0:

mc <-'
data {
  int<lower = 0, upper = 1> flag;
}
parameters {
  real y;
  vector[flag ? 1 : 0] phi;
}
generated quantities {
    real y_rep;
    y_rep = normal_rng(y, 1);
}'
m2 <- stan_model(model_code = mc)

## recompiling to avoid crashing R session

Problem:

Now let’s call gqs(), with flag = 0

f2 <- rstan::gqs(m2, draws = as.matrix(f), data =  list( flag =  0))

This returns Exception: Variable phi missing (in 'model49802b31ca9a_ea9a19bd60c8bdb46970b36a506c1c97' at line 7)

The summary indicates that nothing gets evaluated.

maxbiostat · October 22, 2019, 8:08pm

I’ve changed the tag to more accurately reflect the content of the question. Please let me know if you disagree. I’ll also tag @bgoodri to make sure he sees this [he might be busy, though].

bgoodri · October 24, 2019, 1:12am

Yeah, I don’t know how to handle that case at the moment.

martinmodrak · October 24, 2019, 10:27am

So this looks like a bug in rstan; could you please file an issue for it (provided it has not already been reported)? Thanks, and sorry we can’t help you more. A workaround might be to instead have a matrix of size 1 when not estimating the parameter and give it a normal(0, 1) distribution, so that it is as easy as possible for the sampler.
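For the minimal example earlier in the thread, that workaround might look roughly like the sketch below. It is an untested illustration, not a confirmed fix: the names mc_fixed, m_fixed and f_fixed are made up here, and the idea is simply that phi always has length 1, so as.matrix() should always contain a phi[1] column for gqs() to find.

```r
library(rstan)

# Sketch of the size-1 workaround (assumption: not verified against gqs()).
# phi is always declared with length 1; when flag == 0 it gets a
# placeholder normal(0, 1) prior and does not enter the model for y.
mc_fixed <- '
data {
  int<lower = 0, upper = 1> flag;
}
parameters {
  real y;
  vector[1] phi;
}
model {
  if (flag == 1) {
    phi ~ normal(10, 0.1);
    y ~ normal(phi[1], 1);
  } else {
    phi ~ normal(0, 1);  // placeholder prior, phi is unused here
    y ~ normal(0, 1);
  }
}'
m_fixed <- stan_model(model_code = mc_fixed)
f_fixed <- sampling(m_fixed, data = list(flag = 0))
```

The model passed to gqs() would then need the same vector[1] phi; declaration in its parameters block. The cost is one extra dummy parameter in the draws when flag = 0.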

mitzimorris · November 4, 2019, 5:35am

what happens if you run sampling with flag=0 and also specify initial parameter value for y?
does this produce a similar error?

bgoodri · November 4, 2019, 5:53am

I think it is possible to specify an initial value as an empty vector / matrix / array, but the interface to standalone_gqs takes a big matrix of draws which doesn’t have a column when the parameter vector / matrix / array is empty.

mitzimorris · November 4, 2019, 12:37pm

I took a closer look at the code and added my comments to the issue - https://github.com/stan-dev/rstan/issues/708

we could implement a workaround, which would set the value of missing parameters to 0 if the sample doesn’t contain a column for that param. this puts the burden on the user to provide a valid sample, that is, a sample that was generated from a model which corresponds to whatever’s going on in the generated quantities block.

mitzimorris · November 4, 2019, 2:31pm

had coffee, see the problem and solution -

ph-rast · November 4, 2019, 6:04pm

I guess this request is obsolete? I can still try

mitzimorris · November 4, 2019, 7:22pm

request obsolete - many thanks
