There are 5 stocks, and based on historical data, the winning ratio of these 5 stocks ( the stock with the highest daily return is considered the winner) is **1:(1 - \beta):(1 - 2\beta):2\beta:\beta**. The past 250 days of data shows a performance of **(84,64,48,39,15)**. Based on this data, the task is to estimate posterior \beta, which prior distribution is uniform distribution between 0 and 0.5.

The specific approach to solve this problem is to define a custom probability density function, i.e.

X = (X_1,X_2,X_3,X_4,X_5) ~ \sim (\frac{1}{3})^{X_{1}}(\frac{1-\beta}{3})^{X_2}(\frac{1-2\beta}{3})^{X_2}(\frac{2\beta}{3})^{X_4}(\frac{\beta}{3})^{X_5}

Furthermore, in the coding, I have considered two methods:

*Method 1*: Treating the 250 days of data as 250 individual samples, the data is transformed as follows:**Get the reasonable parameter estimation result I expected**.

(1,0,0,0,0) , *stock1,84 days*

…

(0,1,0,0,0) ,*stock2,64 days*

…

(0,0,1,0,0), *stock3, 48 days*

…

(0,0,0,1,0) ,*stock4, 39 days*

…

(0,0,0,0,1), *stock5, 15 days*

The stand code is:

```
functions {
real multinomialCustom_lpdf(real[] aX,real abeta){
return(
aX[1]*log(1.0/3) +
aX[2]*log((1-abeta)/3.0) +
aX[3]*log((1-2*abeta)/3.0) +
aX[4]*log(2*abeta/3.0) +
aX[5]*log(abeta/3.0)
);
}
}
data {
int<lower=1> N;
int<lower = 1> K;
real<lower = 0>y[N,K];
}
parameters {
real<lower=0,upper = 0.5> abeta;
}
model {
for(i in 1:N){
y[i] ~ multinomialCustom(abeta);
}
//y ~ multinomialCustom(abeta);
}
```

The corresponding R code is:

```
multi_custom <- function(y) {
y_mat <- NULL
k = length(y)
N = sum(y)
for(i in 1:k){
a <- mat.or.vec(y[i],k)
a[,i] <- 1
y_mat <- rbind(y_mat,a)
}
y_mat <- y_mat[sample(1:N,size = N,replace = FALSE),]
return(y_mat)
}
beta_fit1 <- stan(
file = "multinomial.stan",
data = list(N = 250,K = 5,y = multi_custom(y = c(84,64,48,39,15))
), # named list of data
chains = 4, # number of Markov chains
warmup = 5000, # number of warmup iterations per chain
iter = 20000, # total number of iterations per chain
cores = 4, # number of cores (could use one per chain)
refresh = 1, # no progress shown
init_r = 0.25 )
```

Method 2: Considering the 250 days of data as **only one sample**,

(x_1,x_2,x_3,x_4,x_5) = (84,64,48,39,15)

The stan code:

```
functions {
real multinomialCustom_lpdf(real[] aX,real abeta){
return(
aX[1]*log(1/3) +
aX[2]*log((1-abeta)/3) +
aX[3]*log((1-2*abeta)/3) +
aX[4]*log(2*abeta/3) +
aX[5]*log(abeta/3)
);
}
}
data {
int<lower = 1> K;
real<lower = 0> y[K];
}
parameters {
real<lower = 0,upper = 0.5> abeta;
}
model {
// prior
abeta ~ uniform(0, 0.5);
// likehood
target += multinomialCustom_lpdf(y|abeta);
target += log(2.0);
}
```

```
beta_fit2 <- stan(
file = "multinomial1.stan",
data = list(K = 5,y = c(84,64,48,39,15)),
chains = 4, # number of Markov chains
warmup = 5000, # number of warmup iterations per chain
iter = 20000, # total number of iterations per chain
cores = 4, # number of cores (could use one per chain)
refresh = 1 # no progress shown
)
```

However, I am confused because Method 2 is not successful and results in an error of

```
Chain 4: Log probability evaluates to log(0), i.e. negative infinity.
Chain 4: Stan can't start sampling from this initial value.
Chain 4: Initialization between (-2, 2) failed after 100 attempts.
Chain 4: Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error : Initialization failed."
error occurred during calling the sampler; sampling not done
here are whatever error messages were returned
```

Therefore, I would greatly appreciate any help or guidance.