# Initialisation error

I am new to Stan and am trying to model a simple ancestry problem, but I keep encountering an initialisation error. My model and code are provided below.

``````
data {
int<lower = 1> K;
int<lower = 1> N;
matrix[K,N] beta;
vector[N] x;
}

parameters {
vector[K] f;
}

transformed parameters {
vector[N] alpha;
for(i in 1:N){
alpha[i]=0;
for(j in 1:K){
alpha[i] = alpha[i] + f[j]*beta[j,i];
}
}
}
model {
f~dirichlet(rep_vector(1, K));
x~dirichlet(alpha);
}

``````

The model above runs fine with no errors. However, the R code below produces this error:
“Error in sampler$call_sampler(args_list[[i]]) : Initialization failed.”
“error occurred during calling the sampler; sampling not done”

``````
# First, prepare the data for Stan

beta = matrix(
c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0.5, 0.5, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0.3, 0.4, 0.3),nrow=3,ncol=10,byrow=T)+0.05
beta=beta/rowSums(beta)

# This is the actual frequencies of the populations
ftrue=c(0.5,0.5,0)

# This function produces a sample child, based on the parent pop profiles
# and the true frequencies of the populations, sampled from multinomial
# distribution.
make_population_data<-function(beta,ftrue,N=100){
xi=rmultinom(1,N,ftrue)[,1]
x=rowSums(do.call("cbind",sapply(1:length(xi),function(i){
rmultinom(xi[i],1,beta[i,])
})))
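# K = number of reference populations (rows), N = number of categories (columns);
# dim(beta) would return both dimensions at once, so use nrow/ncol instead
list(K=nrow(beta),N=ncol(beta),beta=beta,x=x)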
}

# Here we assign the output of the above function to the pop data we
# will be feeding into our stan model.
population_data=make_population_data(beta,ftrue)

# Next, we need to call stan function to draw posterior
# samples

#my_file <- file.path("C:", "Users", "Joach", "Desktop", "my_file.csv")
#"C:\Users\Team Knowhow\Documents\YEAR 4\Project\STAN models\01-script.R"
# use forward slashes in file path otherwise leads to an error
library(rstan)
fit1 <- stan(
file = "C:/Users/Team Knowhow/Documents/YEAR 4/Project/STAN models/01-model.stan",  # Stan program
data = population_data,  # named list of data
chains = 4,              # number of Markov chains
warmup = 1000,           # number of warmup iterations per chain
iter = 2000,             # total number of iterations per chain
cores = 1,               # number of cores (could use one per chain)
refresh = 0              # no progress shown
)
``````

For a Dirichlet prior, `f` should be declared as a simplex:

``````
simplex[K] f;
``````
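
A rough illustration of why the declaration matters, assuming Stan's default initialisation, which draws each unconstrained value uniformly from (-2, 2). With `vector[K] f`, those draws are used for `f` directly, so they essentially never form a valid simplex and the Dirichlet density rejects them. The helper `is_simplex` and its tolerance are hypothetical names chosen for this sketch, not part of rstan.

```r
# Sketch only: mimics default initial values for an unconstrained vector[K] f.
set.seed(1)
K <- 3

# Hypothetical helper: TRUE if v is non-negative and sums to 1 (within tol).
is_simplex <- function(v, tol = 1e-8) {
  all(v >= 0) && abs(sum(v) - 1) < tol
}

# 100 candidate initial values, one per column, each entry uniform on (-2, 2).
inits <- matrix(runif(K * 100, min = -2, max = 2), nrow = K)

# Count how many of the candidates would be accepted as a simplex.
n_valid <- sum(apply(inits, 2, is_simplex))
n_valid
```

Declaring `simplex[K] f` avoids this because Stan then maps the unconstrained draws onto the simplex before evaluating the model.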

Unfortunately, after changing the parameter as you suggested, I still receive the same error.

``````
data {
int<lower = 1> K;
int<lower = 1> N;
matrix[K,N] beta;
vector[N] x;
}

// The parameters accepted by the model. Our model
// accepts one parameter 'f', a vector representing
// the mixture probabilities of each reference population.
parameters {
simplex[K] f;
}

transformed parameters {
vector[N] alpha;
for(i in 1:N){
alpha[i]=0;
for(j in 1:K){
alpha[i] = alpha[i] + f[j]*beta[j,i];
}
}
}

// The model to be estimated. We give 'f' (the mixture
// probabilities) a Dirichlet prior with parameter (1,1,1),
// i.e. of length K (= 3 in this case).
// x is Dirichlet distributed with parameter alpha.
model {
f~dirichlet(rep_vector(1, K));
x~dirichlet(alpha);
}
``````
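
The nested loop in `transformed parameters` computes the matrix-vector product of the transpose of `beta` with `f`. A small R check of that equivalence, using made-up values for `beta` and `f` (assumptions for illustration, not the data from this thread):

```r
# Illustrative values only: K = 2 populations, N = 3 categories.
beta <- matrix(c(0.7, 0.2, 0.1,
                 0.1, 0.3, 0.6), nrow = 2, byrow = TRUE)  # each row sums to 1
f <- c(0.25, 0.75)                                        # a simplex

K <- nrow(beta)
N <- ncol(beta)

# Same double loop as in the Stan model.
alpha_loop <- numeric(N)
for (i in 1:N) {
  for (j in 1:K) {
    alpha_loop[i] <- alpha_loop[i] + f[j] * beta[j, i]
  }
}

# Equivalent matrix form: alpha = t(beta) %*% f.
alpha_mat <- as.vector(t(beta) %*% f)

all.equal(alpha_loop, alpha_mat)  # the two forms agree
sum(alpha_mat)                    # total of the mixed profile
```

Because each row of `beta` sums to 1 and `f` is a simplex, `alpha` also sums to 1. In Stan, the loop could likely be replaced by the single statement `alpha = beta' * f;`.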

What is the error shown if you remove the `refresh = 0` from your `stan` call?

Chain 1: Rejecting initial value:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: dirichlet_lpdf: probabilities is not a valid simplex. sum(probabilities) = 100, but should be 1 (in ‘string’, line 43, column 2 to column 21)
.
.
.
Chain 1: Initialization between (-2, 2) failed after 100 attempts.
Chain 1: Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
“Error in sampler$call_sampler(args_list[[i]]) : Initialization failed.”
 “error occurred during calling the sampler; sampling not done”

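That’s referring to the input data `x`. To use `x ~ dirichlet(alpha)`, `x` must be a simplex, i.e., a vector of non-negative values that sums to 1, so you need to double check that it is one. The message reports `sum(probabilities) = 100`, which matches the 100 multinomial draws your R function puts into `x`.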
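
A minimal sketch of that check in R, assuming `x` holds multinomial counts like those produced by `make_population_data` (the counts below are made up for illustration). Dividing by the total gives proportions that sum to 1. This is only one option: zero entries in the proportions may still cause trouble for a Dirichlet density, and since the data are counts, a multinomial likelihood on the raw counts might be a more natural fit.

```r
# Made-up counts for illustration: 100 draws spread over 10 categories.
x_counts <- c(48, 14, 12, 3, 2, 4, 3, 5, 6, 3)

sum(x_counts)                       # total count, not 1, so not a simplex

# Convert counts to proportions so the vector sums to 1.
x_prop <- x_counts / sum(x_counts)

# Checks worth running before passing the data to Stan.
all(x_prop >= 0)                    # no negative entries
isTRUE(all.equal(sum(x_prop), 1))   # sums to 1 up to floating-point error
any(x_prop == 0)                    # zeros are worth flagging for a Dirichlet
```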