I’m using priorsense v1.0.3 and I’m getting a warning that is preventing me from getting a pareto-k for my power-scaling. I’m not sure how to find the problematic inputs the warning refers to.
> powerscale(fit, alpha = 0.9, component = "prior", variable = "log_lambda_gp_alpha")
# A draws_df: 300 iterations, 4 chains, and 1 variables
log_lambda_gp_alpha
1 0.83
2 1.88
3 1.26
4 1.24
5 1.78
6 2.00
7 1.50
8 1.24
9 1.29
10 1.42
# ... with 1190 more draws
# ... hidden reserved variables {'.log_weight', '.chain', '.iteration', '.draw'}
power-scaling
alpha: 0.9
scaled component: prior
pareto-k: NA
pareto-k threshold: NA
resampled: FALSE
transform: identity
Warning message:
Input contains infinite or NA values, is constant, or has constant tail. Fitting of generalized Pareto distribution not performed.
This is probably also related to the error I get with powerscale_sequence (or maybe not!):
> powerscale_sequence(fit, component = "prior", variable = "log_lambda_gp_alpha")
Error in ord$x[tail_ids] : only 0's may be mixed with negative subscripts
In addition: Warning message:
Input contains infinite or NA values, is constant, or has constant tail. Fitting of generalized Pareto distribution not performed.
This looks like it results from posterior::pareto_smooth detecting issues with the weights vector, which prevents the Pareto smoothing. Does power-scaling with an alpha value closer to 1, such as 0.99, work?
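For example, something along these lines (just a rough sketch, assuming the same fit object and variable name as in your call above) would let you inspect the log weights that the Pareto fit is complaining about, and then retry with a larger alpha:

ps <- powerscale(fit, alpha = 0.9, component = "prior",
                 variable = "log_lambda_gp_alpha")

# the hidden .log_weight column holds the importance sampling log weights;
# NA / infinite values or a (near-)constant tail will skip the Pareto fit
lw <- ps$.log_weight
summary(lw)
any(!is.finite(lw))           # TRUE would explain the warning
length(unique(round(lw, 8)))  # very few unique values means the weights are (almost) constant

# retry with an alpha closer to 1, which gives milder weights
ps99 <- powerscale(fit, alpha = 0.99, component = "prior",
                   variable = "log_lambda_gp_alpha")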
Also, if you can share the model / posterior draws, I can take a closer look and investigate. I’ve opened a GitHub issue, as NA values should be handled in powerscale_sequence without raising an error.
Also, @karimn, if it is not possible to share the model, could you share just the lprior vector? I think that would be enough for me to understand what is going on.
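If it helps, something like this (a minimal sketch, assuming the fit can be converted with posterior::as_draws_df and the variable is named lprior) would be enough to share:

library(posterior)

draws <- as_draws_df(fit)
lprior <- extract_variable(draws, variable = "lprior")  # plain numeric vector, length ndraws
saveRDS(lprior, "lprior_draws.rds")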
Apologies, @n-kall, I wasn’t ghosting you; I’ve just been making some changes to my code and haven’t yet been able to try out an alpha of 0.99.
I need to check whether I can share the code and posteriors, since this is sensitive information, but I can share the lprior variable (scalar, not vector, right?) after I rerun my model. I also noticed that in the motorcycle example log_lik is a scalar, not a vector as in the example code on the priorsense website. Does any of this matter?
No worries. I just realised that I would only need the lprior draws, which might be easier to share. By vector I just mean that it is a numeric vector of length ndraws (1200 in your case) in R. This is then used for the importance sampling weight calculation, which is likely what is causing the issue. Regarding log_lik: if it is the pointwise log likelihood (defined as a vector in Stan), as used by loo, it can be summed per draw to get the joint log likelihood that priorsense uses. Otherwise, it can be the scalar (real) joint log likelihood directly.
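As a sketch of the log_lik part (assuming the pointwise log likelihood is stored as log_lik[1], ..., log_lik[N] in the draws):

library(posterior)

draws <- as_draws_df(fit)

# sum the pointwise log likelihood over data points, per draw,
# to get the joint log likelihood that priorsense uses
ll <- as_draws_matrix(subset_draws(draws, variable = "log_lik"))
joint_log_lik <- rowSums(ll)   # numeric vector of length ndraws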
I realized I was missing a bunch of priors in lprior, and after that and some other fixes to my model I’m no longer getting any errors. One last question: in the motorcycle example, the model has prior_alpha and likelihood_alpha data inputs that multiply the priors and likelihood; is that for doing the power-scaling manually if we get a high pareto-k?
Great that it is now working! I still don’t think there should be such an error, though, so if you have time to reproduce the lprior that caused it, I would still be interested in checking it out to fix the underlying issue.
Yes, you are right: manual power-scaling with refits can be done with those additional data variables in the Stan code.
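A refit loop could look something like this (just a sketch, assuming a compiled cmdstanr model mod whose Stan code scales the target by prior_alpha and likelihood_alpha, as in the motorcycle example, and a data list stan_data):

library(cmdstanr)

# scale only the prior over a grid of alpha values, refitting each time
alphas <- c(0.8, 0.9, 1, 1.1, 1.25)
fits <- lapply(alphas, function(a) {
  mod$sample(
    data = c(stan_data, list(prior_alpha = a, likelihood_alpha = 1)),
    refresh = 0
  )
})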