Hello, I am having trouble generating posterior predictions with posterior_survfit(). I pass it a new data frame, but it ignores that data frame and instead uses the values from the dataset I used to fit the model. The predictors in the model are New.Treatment (categorical, 6 treatments), Openness (a continuous light index: min = 2.22, mean = 6.903221, max = 10.54), subplot_by_site (categorical, 720 sites), and New.Species.name (categorical, 165 species). My new data frame has 94 rows, but posterior_survfit() returns 3017800 rows. Help, please!
head(nd)
  New.Treatment Openness
1            BE        5
2            BE        6
3            BE        7
4            BE        8
5            BE        9
6            BE       10
fit <- stan_surv(
  formula = Surv(days, Status_surv) ~ New.Treatment * Openness + (1 | subplot_by_site) + (1 | New.Species.name),
  data = dataset,
  basehaz = "weibull",
  chains = 4,
  iter = 2000,
  cores = 4
)
Post <- posterior_survfit(fit, type = "surv", newdata = nd5)
head(Post)
  id cond_time    time median  ci_lb  ci_ub
1  1        NA 62.0000 0.9626 0.9623 1.0000
2  1        NA 69.1313 0.9603 0.9600 0.9997
3  1        NA 76.2626 0.9581 0.9579 0.9696
4  1        NA 83.3939 0.9561 0.9557 0.9665
5  1        NA 90.5253 0.9541 0.9537 0.9545
6  1        NA 97.6566 0.9522 0.9517 0.9526
# Here is a sample dataset that reproduces my problem
library(rstanarm)
data_NHN <- expand.grid(New.Treatment = c("A", "B", "C"), Openness = seq(2, 11, by = 0.15))
data_NHN$subplot_by_site <- c(rep("P1", 63), rep("P2", 60), rep("P3", 60))
data_NHN$Status_surv <- sample(0:1, 183, replace = TRUE)
data_NHN$New.Species.name <- c(rep("sp1", 10), rep("sp2", 40), rep("sp1", 80), rep("sp2", 20), rep("sp1", 33))
data_NHN$days <- sample(10, size = nrow(data_NHN), replace = TRUE)
nd_t <- expand.grid(New.Treatment = c("A", "B", "C"), Openness = seq(2, 11, by = 1))
mod <- stan_surv(
  formula = Surv(days, Status_surv) ~ New.Treatment + Openness + (1 | subplot_by_site) + (1 | New.Species.name),
  data = data_NHN,
  basehaz = "weibull",
  chains = 4,
  iter = 30,
  cores = 4
)
summary(mod)
pos <- posterior_survfit(mod, type = "surv", newdataEvent = nd_t, times = 0)
head(pos)
# I am interested in predicting values for specific Openness values (nd_t = 30 rows), but instead I am getting values for each point in time (pos = 18300 rows).
- Operating System: Mac OS Catalina 10.15.6
- R version: 4.0
- rstan version: 2.21.2
- rstanarm Version: rstanarm_2.21.2
Any suggestions on why it is not working? Also, it is not clear to me how to produce a plot of the effect of one variable in the interaction as the other changes, together with the associated uncertainty (i.e., a marginal effects plot). TIA.