Save a summary of a vector without saving the vector?

What I’m trying to do: I’m producing a few large vectors (call them Mu1, Mu2 … MuN) so I can take advantage of vectorized sampling. I don’t really want to save these vectors, but (for reasons we can get into if necessary) I would like to know the variance of the vectors (e.g. var(Mu1) for each draw).

Currently, I define the vectors in the transformed parameters block and I compute the variance in the generated quantities block, but this leads to saving the vectors on each draw.

I could declare the vectors within the model block, and then they would not be saved to disk. I believe (realizing as I write that I haven’t tested this assumption) that variables defined in the model block are out of scope in the generated quantities block, so I could not compute the variances.

To answer the obvious question why I don’t want to save the vectors: It isn’t that big of a deal to save them. The drawbacks are that they are filling up my hard drive, they make loading the fit from disk take a long time, and they make the cmdstan ‘diagnose’ utility take a long time to run. Once my model is 100% working great, that’s not such a big deal, but it’s a bit inconvenient as I work to develop the model.

Yeah, this is a tricky situation. If you’re more worried about memory than speed, you can avoid saving the big vectors by creating them in the transformed parameters block (inside a nested local scope, so they aren’t saved) and then creating them again in the model block. Something like this:

transformed parameters {
  // only the variance is saved, but the big vector still has to be built to compute it
  // (named mu_var rather than var, since var is a reserved word in some Stan versions)
  real mu_var;
  { // start local scope: anything declared in here is not saved
    vector[N] big_vector = make_big_vector(...); // defined in functions block
    mu_var = variance(big_vector);
  }
}
model {
  // make the big vector again so you can use it
  // it doesn't exist anymore because it was created in a local scope within transformed parameters
  vector[N] big_vector = make_big_vector(...);
}

So this requires creating the big vector(s) twice on every evaluation (once in transformed parameters and once in the model block), but only the variance should end up being saved.
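
For something you can compile, here is a minimal complete sketch of the same pattern. Everything specific in it is made up for illustration: the linear form inside make_big_vector, the data x and y, the parameters alpha, beta and sigma, and the priors are all placeholders, not your model. Only the nested-block trick is the point.

```stan
functions {
  // Hypothetical helper: stands in for however the large vector is really built.
  vector make_big_vector(real alpha, real beta, vector x) {
    return alpha + beta * x;
  }
}
data {
  int<lower=2> N;
  vector[N] x;   // placeholder predictor
  vector[N] y;   // placeholder outcome
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
transformed parameters {
  // Declared at block level, so this single scalar is saved on every draw.
  real mu_variance;
  {
    // Declared inside a nested block, so mu is local and is not written out.
    vector[N] mu = make_big_vector(alpha, beta, x);
    mu_variance = variance(mu);
  }
}
model {
  // Rebuilt here because the copy above went out of scope.
  vector[N] mu = make_big_vector(alpha, beta, x);
  alpha ~ normal(0, 1);
  beta ~ normal(0, 1);
  sigma ~ exponential(1);
  y ~ normal(mu, sigma);
}
```

With this layout the output should contain alpha, beta, sigma and mu_variance, but no mu columns. The same nested block could probably be moved to generated quantities instead, which runs once per draw rather than on every gradient evaluation; the vector would still have to be rebuilt there.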

Someone else might have a better solution, but that’s the first thing that occurred to me that might work.
