Question about custom probability density function get "Error : Initialization failed."

lixiaohui · October 9, 2023, 9:08am

There are 5 stocks, and based on historical data, the winning ratio of these 5 stocks ( the stock with the highest daily return is considered the winner) is 1:(1 - \beta):(1 - 2\beta):2\beta:\beta. The past 250 days of data shows a performance of (84,64,48,39,15). Based on this data, the task is to estimate posterior \beta, which prior distribution is uniform distribution between 0 and 0.5.

The specific approach to solve this problem is to define a custom probability density function, i.e.
X = (X_1,X_2,X_3,X_4,X_5) ~ \sim (\frac{1}{3})^{X_{1}}(\frac{1-\beta}{3})^{X_2}(\frac{1-2\beta}{3})^{X_2}(\frac{2\beta}{3})^{X_4}(\frac{\beta}{3})^{X_5}

Furthermore, in the coding, I have considered two methods:

Method 1: Treating the 250 days of data as 250 individual samples, the data is transformed as follows:Get the reasonable parameter estimation result I expected.

(1,0,0,0,0) , stock1,84 days
…
(0,1,0,0,0) ,stock2,64 days
…
(0,0,1,0,0), stock3, 48 days
…
(0,0,0,1,0) ,stock4, 39 days
…
(0,0,0,0,1), stock5, 15 days

The stand code is:

functions {
  real multinomialCustom_lpdf(real[] aX,real abeta){
   return(
     aX[1]*log(1.0/3) + 
     aX[2]*log((1-abeta)/3.0) +
     aX[3]*log((1-2*abeta)/3.0) +
     aX[4]*log(2*abeta/3.0) + 
     aX[5]*log(abeta/3.0)
     );
  }
}
data {
  int<lower=1> N;
  int<lower = 1> K;
  real<lower = 0>y[N,K];
}

parameters {
  real<lower=0,upper = 0.5> abeta;
}
model {
  for(i in 1:N){
    y[i] ~ multinomialCustom(abeta);
  }
  //y ~ multinomialCustom(abeta);
}

The corresponding R code is:

multi_custom <- function(y) {
  y_mat <- NULL
  k = length(y)
  N = sum(y)
  for(i in 1:k){
    a <- mat.or.vec(y[i],k)
    a[,i] <- 1
    y_mat <- rbind(y_mat,a)
  }
  y_mat <- y_mat[sample(1:N,size = N,replace = FALSE),]
  return(y_mat)
}
beta_fit1 <- stan(
  file = "multinomial.stan",
  data = list(N = 250,K = 5,y = multi_custom(y = c(84,64,48,39,15))  
),    # named list of data
  chains = 4,             # number of Markov chains
  warmup = 5000,          # number of warmup iterations per chain
  iter = 20000,            # total number of iterations per chain
  cores = 4,              # number of cores (could use one per chain)
  refresh = 1,          # no progress shown
 init_r = 0.25 )

Method 2: Considering the 250 days of data as only one sample,

(x_1,x_2,x_3,x_4,x_5) = (84,64,48,39,15)

The stan code:

functions {
  real multinomialCustom_lpdf(real[] aX,real abeta){
   return(
    aX[1]*log(1/3) + 
    aX[2]*log((1-abeta)/3) +
    aX[3]*log((1-2*abeta)/3) +
    aX[4]*log(2*abeta/3) + 
    aX[5]*log(abeta/3)
     );
  }
}
data {
  int<lower = 1> K;
  real<lower = 0> y[K];
}
parameters {
  real<lower = 0,upper = 0.5> abeta;
}
model {
  // prior
  abeta ~ uniform(0, 0.5);
  // likehood
  target += multinomialCustom_lpdf(y|abeta);
  target += log(2.0);
}

beta_fit2 <- stan(
  file = "multinomial1.stan",
  data = list(K = 5,y = c(84,64,48,39,15)),   
  chains = 4,             # number of Markov chains
  warmup = 5000,          # number of warmup iterations per chain
  iter = 20000,            # total number of iterations per chain
  cores = 4,              # number of cores (could use one per chain)
  refresh = 1          # no progress shown
)

However, I am confused because Method 2 is not successful and results in an error of

Chain 4:   Log probability evaluates to log(0), i.e. negative infinity.
Chain 4:   Stan can't start sampling from this initial value.
Chain 4: Initialization between (-2, 2) failed after 100 attempts. 
Chain 4:  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error : Initialization failed."
error occurred during calling the sampler; sampling not done
here are whatever error messages were returned

Therefore, I would greatly appreciate any help or guidance.

Corey.Plate · October 9, 2023, 10:07pm

The problem seems like it could be here:

functions {
  real multinomialCustom_lpdf(real[] aX,real abeta){
   return(
     aX[1]*log(1.0/3) + 
     aX[2]*log((1-abeta)/3.0) +
     aX[3]*log((1-2*abeta)/3.0) +
     aX[4]*log(2*abeta/3.0) + 
     aX[5]*log(abeta/3.0)
     );
  }
}

parameters {
  real<lower=0,upper = 0.5> abeta;
}

abeta is bounded on the interval (0,0.5) in parameters, which is inclusive of 0 and 0.5. The above log likelihood will be unable to return defined values from aX[3]*log((1-2*abeta)/3.0) when abeta is 0.5, and from aX[5]*log(abeta/3.0) when abeta is 0. Both evaluate to log(0).

lixiaohui · October 10, 2023, 1:28am

Thanks，What Can I correct the code？

Corey.Plate · October 10, 2023, 1:57am

I would give the following a shot:

parameters {
  real<lower=0.01,upper = 0.49> abeta;
}

lixiaohui · October 10, 2023, 2:19am

It seems not work.

jsocolar · October 10, 2023, 2:21am

The sampler will never take the literal values 0 or 0.5 under this constraint (nor would it ever take any particular value if it had infinite precision, which is why it is irrelevant to worry about whether or not the constraint technically admits the literal boundary values).

The problem instead is some issue of numerical overflow or underflow in the logarithms. Here is the code rewritten for better numerical stability. Note that quite a few constant terms could still be dropped without affecting parameter estimation.

functions {
  real multinomialCustom_lpdf(real[] aX,real abeta){
   return(
    aX[1]*(log(1) - log(3)) + 
    aX[2]*(log1m(abeta) - log(3)) +
    aX[3]*(log1m(2*abeta) - log(3)) +
    aX[4]*(log(2) + log(abeta) - log(3)) + 
    aX[5]*(log(abeta) - log(3))
     );
  }
}
data {
  int<lower = 1> K;
  real<lower = 0> y[K];
}
parameters {
  real<lower = 0,upper = 0.5> abeta;
}
model {
  // prior
  abeta ~ uniform(0, 0.5);
  // likehood
  target += multinomialCustom_lpdf(y|abeta);
  target += log(2.0);
}

lixiaohui · October 10, 2023, 3:27am

Yeah，it works.Thanks。

Topic		Replies	Views
Custom weibull function Modeling rstan , fitting-issues	1	313	July 5, 2021
Coding a custom density with discontinuities Modeling	9	863	April 14, 2019
Initialization failure in rstan Modeling	1	538	May 15, 2020
Initialization Failed Modeling	1	327	May 10, 2021
Beta regression,Rejecting initial value,Error evaluating the log probability at the initial value Modeling fitting-issues	7	1434	April 16, 2019

Question about custom probability density function get "Error : Initialization failed."

Related topics