I fitted a behavioral data set with 150 subjects and 700 trials each using a hierarchical RL model in Stan (code). The transformed parameters block gives me a log_lik matrix (subjects x trials), which I then try to use with loo.
rl_fit <- stan(file = my_model)  # other stan() arguments are not shown in the post
loo(rl_fit,
    pars = "log_lik",
    save_psis = FALSE,
    cores = 4,
    moment_match = FALSE,
    k_threshold = 0.7)
I then get this:
Error: cannot allocate vector of size 3.8 Gb
In addition: Warning message:
Some Pareto k diagnostic values are too high. See help(‘pareto-k-diagnostic’) for details.
Any idea what's going on and how I can fix it?
I’m far from the most reliable help you might find on this forum, but since your question has gone unanswered for some time: the error just says that R could not allocate the memory loo asked for, here a single 3.8 Gb vector. That is usually not a problem, so it might signal that something is not right.
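To get a feel for why memory can run out here, this is a rough back-of-the-envelope sketch of how large a single draws x observations log-likelihood matrix would be. The number of posterior draws (4000, the rstan default of 4 chains x 1000 post-warmup iterations) is an assumption, since the post does not say how many draws were kept.

```r
# Rough size of one log-likelihood matrix: draws x (subjects * trials) doubles.
n_subjects <- 150
n_trials   <- 700
n_draws    <- 4000   # ASSUMPTION: rstan defaults; the post does not state this

n_obs <- n_subjects * n_trials        # one column per trial per subject
bytes_per_double <- 8

size_gib <- n_draws * n_obs * bytes_per_double / 1024^3
round(size_gib, 2)                    # size of a single copy, in GiB
```

loo typically needs more than one object of this size at a time (for example the log-likelihood matrix plus the importance weights), so a trial-level matrix of this shape can easily exceed the RAM of an ordinary desktop.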
I had a brief look at your code, and I couldn’t quite figure out how the log_lik matrix coming out of your code is appropriate to use for the loo function. Are you confident that it’s the right quantity? Usually, it’s calculated in the generated quantities block, as the log-probability of the observations conditional on the parameters.
Also, what are you trying to use loo for? Estimating the predictive precision of your model for a new subject, or something else?
Thank you so much for this. I am trying to use loo for model comparison (at this point only in simulations).
So on the more technical side, the loo documentation notes that the first argument can be “a log-likelihood array, matrix, or function”, so I figured this should be OK… (but obviously I’m doing something wrong).
I am estimating log_lik for N subjects with T trials, and I was wondering whether it might be appropriate/helpful to aggregate over trials (and then have an N-vector with the sum of log_lik for each subject). Do you have any idea whether this should make a difference for PSIS-LOO estimation?
Yes, I can definitely calculate log_lik in the generated quantities block. But will it make a difference to the actual estimates (other than speeding up the code, which is also very useful)?
I can’t really follow what your model is doing, but as long as you are getting reasonable fits to simulated data and understand it yourself, I guess all is well. And apart from the computational gain, I don’t think there is a specific reason to have it in the generated quantities block. Also, a lot of (most?) models aren’t set up with the log-likelihood of the data as a parameter.
Anyway, as it stands now, I think you are doing “leave-one-trial-out”, which estimates the predictive utility of your model for a new trial, given all the other observations for a subject. That may or may not be what you want. Summing the log-likelihoods over trials would give you leave-one-subject-out, but there are often issues with loo for hierarchical models at the subject level.
If you haven’t looked at it already, there is a lot of useful information in the Cross-validation FAQ.
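The difference between the two versions can be sketched in a few lines of R. This is only an illustration on simulated numbers: the array `ll` stands in for what `rstan::extract(rl_fit, "log_lik")$log_lik` would return for a subjects x trials matrix (dimensions draws x subjects x trials), and the sizes used here are made up to keep the example small.

```r
# Simulated stand-in for the extracted log_lik array (draws x subjects x trials).
# ASSUMPTION: the sizes and values are invented for illustration only.
set.seed(1)
n_draws  <- 200
n_subj   <- 5
n_trials <- 20
ll <- array(-abs(rnorm(n_draws * n_subj * n_trials)),
            dim = c(n_draws, n_subj, n_trials))

# Leave-one-trial-out: one column per (subject, trial) pair.
ll_trial <- matrix(ll, nrow = n_draws)

# Leave-one-subject-out: sum the log-likelihood over trials within each subject,
# which leaves one column per subject.
ll_subject <- rowSums(ll, dims = 2)

dim(ll_trial)    # draws x (subjects * trials)
dim(ll_subject)  # draws x subjects
```

Either matrix can then be passed to `loo::loo()`. The subject-level matrix has far fewer columns (150 instead of 105,000 for the data set in this thread), so it is much lighter on memory, but it answers a different question, and the Pareto k diagnostics still need to be checked.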
Thank you for this. I will try to dig deeper to understand whether it makes sense to do subject-wise loo, which will probably be easier to implement…