Equal Representation in Model Blocks

SHINICHI · September 25, 2021, 12:30pm

Please share your Stan program and accompanying data if possible.

In the following model, I want to represent the observed value y[t](N×1 vector) in a model block, but “=” is rejected.

The main point of the model can be simplified as
y[t] = D*ε[t].
ε[t] = A *e[t].
e[t] ~ N(0,1)
where D and A are N x N matrices.


MSV_code = """
data {
  int T;   // # Time points
  int N;   // # Number of Assets 
  vector[N] y[T];      // return matrix at time t
}

parameters {
  vector[N] h_std[T];  // log vol at time t
  vector[N] mu;
  vector<lower=0,upper=1>[N] phi;
  vector<lower = 0>[N] eta;

  vector[N*(N-1)/2] h_std_off[T];   // off_diag of Q at time t
  vector[N*(N-1)/2] mu_off;
  vector<lower=0,upper=1>[N*(N-1)/2] phi_off;
  vector<lower = 0>[N*(N-1)/2] eta_off;

  vector[N] h_std_d[T];  // diag of Q at time t
  vector[N] mu_d;
  vector<lower=0,upper=1>[N] phi_d;
  vector<lower = 0>[N] eta_d;
  
  vector[N] e[T];  // e at time t
}

transformed parameters {
  vector[N] h[T];
  vector[N*(N-1)/2] h_off[T];
  vector[N] h_d[T];
  vector[N] epsilon[T];

    for (t in 1:T){
    h[t] = h_std[t] .* eta ;
    }
    h[1] ./= sqrt(1 - phi .* phi);  // rescale h[1] as prior distribution
    h[1] += mu;
    
    for (t in 2:T){
    h[t] += mu +  phi .* (h[t - 1] - mu);
    }

    for (t in 1:T){
    h_off[t] = mu_off + h_std_off[t] .* eta_off;
    }
    h_off[1] ./= sqrt(1 - phi_off .* phi_off);
    for (t in 2:T){
    h_off[t] += phi_off .* (h_off[t - 1] -  mu_off);
    }
    
    for (t in 1:T){
    h_d[t] = h_std_d[t] .* eta_d ;
    }
    h_d[1] ./= sqrt(1 - phi_d .* phi_d);  // rescale h[1] as prior distribution
    h_d[1] += mu_d;
    
    for (t in 2:T){
    h_d[t] += mu_d +  phi_d .* (h_d[t - 1] - mu_d);
    }
    
  for (t in 1:T){
          epsilon[t][1] = e[t][1] * exp(h_d[t][1]/2);
      for (i in 2:N){
          epsilon[t][i] = sum(h_off[t][(i*i-3*i+4)/2:(i*i-i)/2] .* exp(h_d[t][1:i-1]/2) .* e[t][1:i-1]) + e[t][i] * exp(h_d[t][i]/2);
      }
    }

}

model {
  mu ~ normal(-10, 10);
  mu_off ~ normal(0, 10);
  mu_d ~ normal(0, 10);

  phi ~ uniform(0, 1);
  phi_off ~ uniform(0, 1);
  phi_d ~ uniform(0, 1);

  eta ~ inv_gamma(4, 1);
  eta_off ~ inv_gamma(4, 1);
  eta_d ~ inv_gamma(4, 1);
   
  for (t in 1:T){
    h_std[t] ~ std_normal();
    h_std_off[t] ~ std_normal();
    h_std_d[t] ~ std_normal();
    e[t] ~ std_normal();
    }

for (t in 1:T){
    y[t] = epsilon[t] .* exp(h[t]/2);
    }

}

"""

sm = pystan.StanModel(model_code=MSV_code)

I would like to use “=” to express y[t] without complications, but is there a different place to use it?

Thank you

martinmodrak · September 30, 2021, 12:27pm

Hi,
I think there is a misunderstanding about what Stan does. A Stan program is a function that computes the log density of the data, given the parameters. The sampling statements like mu ~ normal(-10,10) are just a shorthand for a target += normal_lupdf(mu | -10, 10) statement - they are just incrementing the target log density. Unlike JAGS, Stan is not doing any “magic” behind the scenes and doesn’t really “understand” the program in any way (this is what makes Stan fast and flexible).

This means that statements like “some_data = expression” are meaningless for Stan as they cannot be directly understood as an increment of the target log-density - the density is only implied. Luckily, models can usually be transformed to represent the densitites explicitly.

So when you have

y[t] = DA *e[t] \\ e[t] \sim N(0,1)

You can use the properties of normal distribution (e.g. this question on Stack exchange ), to determine the (multivariate normal) distribution of y[t] directly. You can then happily use the multi_normal distribution for y[t]

Does that make sense?

SHINICHI · October 6, 2021, 1:24pm

martinmodrak:

Hi,
I think there is a misunderstanding about what Stan does. A Stan program is a function that computes the log density of the data, given the parameters. The sampling statements like mu ~ normal(-10,10) are just a shorthand for a target += normal_lupdf(mu | -10, 10) statement - they are just incrementing the target log density. Unlike JAGS, Stan is not doing any “magic” behind the scenes and doesn’t really “understand” the program in any way (this is what makes Stan fast and flexible).

This means that statements like “some_data = expression” are meaningless for Stan as they cannot be directly understood as an increment of the target log-density - the density is only implied. Luckily, models can usually be transformed to represent the densitites explicitly.

So when you have

y[t] = DA *e[t] \ e[t] \sim N(0,1)

y[t]=DA∗e[t]e[t]∼N(0,1) y[t] = DA *e[t] \ e[t] \sim N(0,1)

You can use the properties of normal distribution (e.g. this question on Stack exchange ), to determine the (multivariate normal) distribution of y[t] y[t]y[t] directly. You can then happily use the multi_normal distribution for y[t]

Does that make sense?

Hello.
I’m sorry for the late reply.

I understand what you are saying very well. It certainly seems to me that I can rewrite it to conduct sampling with a linear transformation.

Thank you very much.

Topic		Replies	Views
Problem with block transformed data Modeling	1	907	December 26, 2017
How to declare a variable in model block Modeling	24	4537	August 12, 2018
Setting response not as a distribution but rather as an equality Modeling	2	251	October 5, 2023
Problem Simulating Data with Generated Quantities (Dimension mismatch in assignment; type = real; right-hand side type = real[ ]) RStan rstan , techniques , specification	3	860	April 29, 2021
Inspecting data fit of stan model/ problems with generated quantities code block Modeling rstan	4	500	October 27, 2022

Equal Representation in Model Blocks

Related topics