Simulate data - quadratic relationship

Hi,

I would like to simulate a response by using rstan based on a model.

The model I would like to be based on is the following one

Generalized least squares fit by maximum likelihood
  Model: y ~ x + I(x^2)
  Data: model_data1 
       AIC      BIC    logLik
  2301.816 2315.627 -1145.908

Variance function:
 Structure: Exponential of variance covariate
 Formula: ~x
 Parameter estimates:
        expon 
-0.3136419 

Coefficients:
                  Value Std.Error   t-value p-value
(Intercept)    50679.87  754.6125  67.16012       0
x             -15846.97 1361.0995 -11.64277       0
I(x^2)          2545.30  423.7010   6.00730       0

 Correlation: 
              (Intr) Silicn
x            -0.702       
I(x^2)        0.561 -0.967

Standardized residuals:
        Min          Q1         Med          Q3         Max 
-2.05303619 -0.72835023 -0.01836311  0.83581099  2.25393542 

Residual standard error: 4339.779 
Degrees of freedom: 117 total; 114 residual

To do that I’m using the following model,

sim_data_TA <- 
'
data {
int N;
real alpha[N];
real beta[N];
real beta_sq[N];
real x[N];
real<lower=0> sigma[N];
}

parameters {}
model {}

generated quantities {
real y[N];
for (n in 1:N) {
y[n] = normal_rng(
  alpha[n]  + 
  beta[n] * x[n] + 
  beta_sq[n] * x[n], 
  sigma[n]);
}
}
'

compiled_TA <- stan_model(model_code = sim_data_TA)

  
N = 4
sim_TA <- sampling(
  compiled_TA, algorithm = 'Fixed_param',
  data = list(N = N,
                alpha = rep(50.680, each = N), 
                beta = rep(-15.847, each = N), 
                beta_sq = rep(2.545, each = N),
                sigma = rep(c(10, 9, 7, 5)), 
                
                x = rep(c(0, .5, 1.5, 3))
  ))

Then, I’m taking a big enough sample (e.g. first 100) and I plot them expecting to see a quadratic relationship between y ~ x.

fake_data_TA <- rstan::extract(sim_TA, pars = 'y') %>% as.data.frame %>% .[1:100, ]

fake_data_TA2 <- data.frame(
  y = melt(fake_data_TA)[, 'value'], 
  x = rep(c(0, .5, 1.5, 3), each = 100))

# x11()
fake_data_TA2 %>% 
  ggplot(aes(y = y, x = x)) + 
  geom_point() + 
  geom_smooth(method = lm, formula = y ~ x + I(x^2))

However, by plotting the simulated response I don’t see any quadratic relationship. Now, I am not sure what is going wrong.

In sort, how can I simulate quadratic relationship using rstan?

Hi @Elef.

A couple of points: your posted Stan code does not include a squared term for x (i.e. you need beta_sq * square(x[n]) in the model formula) and your parameters are varying by data case, whereas the model you are trying to simulate has, I believe, constant parameters (i.e. you want normal_rng(alpha + beta * x[n] + beta_sq * square(x[n]), sigma) as the linear predictor.

2 Likes