Hi,
I would like to simulate a response by using rstan based on a model.
The model I would like to be based on is the following one
Generalized least squares fit by maximum likelihood
Model: y ~ x + I(x^2)
Data: model_data1
AIC BIC logLik
2301.816 2315.627 -1145.908
Variance function:
Structure: Exponential of variance covariate
Formula: ~x
Parameter estimates:
expon
-0.3136419
Coefficients:
Value Std.Error t-value p-value
(Intercept) 50679.87 754.6125 67.16012 0
x -15846.97 1361.0995 -11.64277 0
I(x^2) 2545.30 423.7010 6.00730 0
Correlation:
(Intr) Silicn
x -0.702
I(x^2) 0.561 -0.967
Standardized residuals:
Min Q1 Med Q3 Max
-2.05303619 -0.72835023 -0.01836311 0.83581099 2.25393542
Residual standard error: 4339.779
Degrees of freedom: 117 total; 114 residual
To do that I’m using the following model,
sim_data_TA <-
'
data {
int N;
real alpha[N];
real beta[N];
real beta_sq[N];
real x[N];
real<lower=0> sigma[N];
}
parameters {}
model {}
generated quantities {
real y[N];
for (n in 1:N) {
y[n] = normal_rng(
alpha[n] +
beta[n] * x[n] +
beta_sq[n] * x[n],
sigma[n]);
}
}
'
compiled_TA <- stan_model(model_code = sim_data_TA)
N = 4
sim_TA <- sampling(
compiled_TA, algorithm = 'Fixed_param',
data = list(N = N,
alpha = rep(50.680, each = N),
beta = rep(-15.847, each = N),
beta_sq = rep(2.545, each = N),
sigma = rep(c(10, 9, 7, 5)),
x = rep(c(0, .5, 1.5, 3))
))
Then, I’m taking a big enough sample (e.g. first 100) and I plot them expecting to see a quadratic relationship between y ~ x.
fake_data_TA <- rstan::extract(sim_TA, pars = 'y') %>% as.data.frame %>% .[1:100, ]
fake_data_TA2 <- data.frame(
y = melt(fake_data_TA)[, 'value'],
x = rep(c(0, .5, 1.5, 3), each = 100))
# x11()
fake_data_TA2 %>%
ggplot(aes(y = y, x = x)) +
geom_point() +
geom_smooth(method = lm, formula = y ~ x + I(x^2))
However, by plotting the simulated response I don’t see any quadratic relationship. Now, I am not sure what is going wrong.
In sort, how can I simulate quadratic relationship using rstan?