Turn off caching in rvar

I’m using {posterior} 1.6.0 and I’m seeing the cache for a column of rvars in a data.frame balloon to something like 400 GB! The actual data is a few MB. Is there any way to turn off caching? I tried setting the cache attribute to an empty environment, but that doesn’t fix it. BTW, I’m using posterior inside a function run by {targets}. I get my data with spread_rvars and then call inner_join on the resulting data.frame, if that explains things.

I check sizes using lobstr::obj_size().
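
For anyone who wants to run the same kind of check, here is a minimal sketch. It assumes the {posterior}, {tibble} and {lobstr} packages are installed; the data is simulated, and the names `x` and `df` are made up for illustration. It compares the size of the whole data frame with the size of the environment stored in the rvar column's cache attribute.

```r
library(posterior)

# Simulated stand-in data: 5 rvars with 100 draws each (not the real data).
x <- rvar(matrix(rnorm(500), nrow = 100, ncol = 5))
df <- tibble::tibble(id = 1:5, x = x)

# Size of the whole data frame, including anything reachable from it.
lobstr::obj_size(df)

# Size of the environment stored in the rvar column's "cache" attribute.
lobstr::obj_size(attr(df$x, "cache"))
```

If the second number is close to the first, most of the reported size is coming from the cache rather than from the draws themselves.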

I took a closer look at the column of rvars. I attached the cache environment.

> vec_proxy[[1]] |> str()
List of 3
 $ index  : int 1
 $ nchains: int 4
 $ draws  : num [1:8000, 1:252] -0.0027 0.01855 -0.00226 -0.01987 0.04545 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:8000] "1" "2" "3" "4" ...
  .. ..$ : NULL
> vec_proxy[[2]] |> str()
List of 3
 $ index  : int 2
 $ nchains: int 4
 $ draws  : num [1:8000, 1:252] -0.0027 0.01855 -0.00226 -0.01987 0.04545 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:8000] "1" "2" "3" "4" ...
  .. ..$ : NULL
> vec_proxy[[3]] |> str()
List of 3
 $ index  : int 3
 $ nchains: int 4
 $ draws  : num [1:8000, 1:252] -0.0027 0.01855 -0.00226 -0.01987 0.04545 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:8000] "1" "2" "3" "4" ...
  .. ..$ : NULL

This data frame has 252 rows, each with an rvar that has 8000 draws. Notice that every entry of vec_proxy carries the entire 8000 x 252 draws matrix, so the full data is repeated 252 times.
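
As a rough back-of-the-envelope check, the sketch below works out how much space the draws shown in the str() output would take if each of the 252 proxy entries were stored as its own full copy. It assumes 8 bytes per double and ignores dimnames and other overhead, so it only covers the draws matrix itself and is not meant to reproduce the total reported earlier.

```r
n_draws <- 8000          # rows of the draws matrix
n_rvars <- 252           # columns of the draws matrix, and number of proxy entries
bytes_per_double <- 8    # assumption: standard double-precision storage

one_copy <- n_draws * n_rvars * bytes_per_double  # bytes in one draws matrix
all_copies <- one_copy * n_rvars                  # bytes if every entry holds its own copy

one_copy / 1e6    # size of one copy, in MB
all_copies / 1e9  # size of 252 copies, in GB
```

One copy comes to about 16 MB and 252 copies to about 4 GB, which shows how quickly per-element duplication adds up even for modest data.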

Wow, I’ve never seen it that extreme. Maybe rvar expert @mjskay will know what’s going on.

See here


Awesome, thanks! That clears it up then. In case someone hits this problem using the {targets} package, this is what I did:

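rvar_safe_qs2_format <- tar_format(
  read = \(path) qs2::qs_read(path),
  # Objects are passed to and from workers unchanged.
  marshal = \(object) object,
  unmarshal = \(object) object,
  write = function(object, path) {
    # Reset the cache of every rvar column before saving.
    # Note: invalidate_rvar_cache() is an internal {posterior} function (:::).
    if (tibble::is_tibble(object)) {
      object <- dplyr::mutate(
        object,
        dplyr::across(where(posterior::is_rvar), posterior:::invalidate_rvar_cache)
      )
    }
    qs2::qs_save(object, path)
  }
)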

tar_option_set(format = rvar_safe_qs2_format)