# Simulate data - quadratic relationship

Hi,

I would like to simulate a response by using rstan based on a model.

The model I would like to be based on is the following one

``````Generalized least squares fit by maximum likelihood
Model: y ~ x + I(x^2)
Data: model_data1
AIC      BIC    logLik
2301.816 2315.627 -1145.908

Variance function:
Structure: Exponential of variance covariate
Formula: ~x
Parameter estimates:
expon
-0.3136419

Coefficients:
Value Std.Error   t-value p-value
(Intercept)    50679.87  754.6125  67.16012       0
x             -15846.97 1361.0995 -11.64277       0
I(x^2)          2545.30  423.7010   6.00730       0

Correlation:
(Intr) Silicn
x            -0.702
I(x^2)        0.561 -0.967

Standardized residuals:
Min          Q1         Med          Q3         Max
-2.05303619 -0.72835023 -0.01836311  0.83581099  2.25393542

Residual standard error: 4339.779
Degrees of freedom: 117 total; 114 residual
``````

To do that I’m using the following model,

``````sim_data_TA <-
'
data {
int N;
real alpha[N];
real beta[N];
real beta_sq[N];
real x[N];
real<lower=0> sigma[N];
}

parameters {}
model {}

generated quantities {
real y[N];
for (n in 1:N) {
y[n] = normal_rng(
alpha[n]  +
beta[n] * x[n] +
beta_sq[n] * x[n],
sigma[n]);
}
}
'

compiled_TA <- stan_model(model_code = sim_data_TA)

N = 4
sim_TA <- sampling(
compiled_TA, algorithm = 'Fixed_param',
data = list(N = N,
alpha = rep(50.680, each = N),
beta = rep(-15.847, each = N),
beta_sq = rep(2.545, each = N),
sigma = rep(c(10, 9, 7, 5)),

x = rep(c(0, .5, 1.5, 3))
))
``````

Then, I’m taking a big enough sample (e.g. first 100) and I plot them expecting to see a quadratic relationship between y ~ x.

``````fake_data_TA <- rstan::extract(sim_TA, pars = 'y') %>% as.data.frame %>% .[1:100, ]

fake_data_TA2 <- data.frame(
y = melt(fake_data_TA)[, 'value'],
x = rep(c(0, .5, 1.5, 3), each = 100))

# x11()
fake_data_TA2 %>%
ggplot(aes(y = y, x = x)) +
geom_point() +
geom_smooth(method = lm, formula = y ~ x + I(x^2))
``````

However, by plotting the simulated response I don’t see any quadratic relationship. Now, I am not sure what is going wrong.

In sort, how can I simulate quadratic relationship using rstan?

Hi @Elef.

A couple of points: your posted Stan code does not include a squared term for `x` (i.e. you need `beta_sq * square(x[n])` in the model formula) and your parameters are varying by data case, whereas the model you are trying to simulate has, I believe, constant parameters (i.e. you want `normal_rng(alpha + beta * x[n] + beta_sq * square(x[n]), sigma)` as the linear predictor.

2 Likes