Hello stan forum,
I’ve manage to make my own loglikelihood function within stan, it works (at least it can be sampled from), but it is super slow (Gradient evaluation takes 0.122 seconds) - managed to bring it down from 15 seconds so its an improvement !
The model which I have/ trying to implement has the following log-likelihood :
-0.5\cdot (2a+N)\cdot log(2b+y^T(I_{N}+D \Sigma D^T)^{-1}y)
- Where a and b are set as very small values (10^{-5})
- y is the observed data
- I_{N} is the identity matrix of N points
- \Sigma is a matrix defined as \Sigma=g(D^{T}D)^{-1}, where g is set at a high number 10^5
- D is an [N\times 2\cdot K] matrix where every odd column is given as cos(f_ {j}\cdot N_{i})\cdot exp(-\alpha_j\cdot t_i) and every even column is given as sin(f_ {j}\cdot t_{i})\cdot exp(-\alpha_j\cdot t_i) with t being time, f being a frequency parameter and \alpha being a decay rate parameter. The goal is to estimate f and \alpha from the log-likelihood. As of right now j is only equal to 3 - but should in the future be able to handle a lot more.
Does anyone have an idea of how to make the code faster ?:)
The stan code for the estimation of f is shown below
functions {
real freq_dist_log(vector y,vector time, vector f,vector alpha, int k,int N,
real delta, real noise_prior_alpha, real noise_prior_beta){
// set up local values
matrix[N,2*k] D_damp; // D-matrix with dampning
matrix[2*k,2*k] D_damp_ch; // cholesky D-matrix with dampning
matrix[2*k,N] Sigma; // Sigma cov-matrix
matrix[N,N] I_mat; // identity matrix
matrix[N,N] mat1; // matrix made for easier code reading
matrix[N,N] L;
vector[N] A;
real prob; // log-liklihood
for (j in 1:k){
for (i in 1:N){
D[i,j*2-1]=cos(f[j]*time[i])*exp(-alpha[j]*time[i]);
D[i,j*2]=sin(f[j]*time[i])*exp(-alpha[j]*time[i]);
}
}
// compute Sigma from cholesky decompostion of D
D_ch=cholesky_decompose(crossprod(D));
Sigma=delta*D_ch\(D');
// set up likelihood function from Sigma
I_mat=diag_matrix(rep_vector(1,N)); // identity matrix
mat1=(I_mat+crossprod(Sigma)); // corresponds to (I_N+D*Sigma*D^T)
L=cholesky_decompose(mat1); // Cholesky decomp of mat1
A=((L))\y; // solve for A
// log-likelihood
prob=-0.5*(2*noise_prior_alpha+N)*log(2*noise_prior_beta+A'*A); // return the loglik
return prob;
}
}
data {
int<lower=0> N; // number of points
vector[N] y; // observations
vector[N] time; // time observations
int <lower=0> k; // no of components
// priors
real<lower=0> noise_prior_alpha; //
real<lower=0> noise_prior_beta; //
vector[k] alpha_prior;
real<lower=0> delta;
}
parameters {
positive_ordered [k] f ;
vector <lower=0> [k] alpha;
}
model {
// priors
f~uniform(0,pi());
y~freq_dist(time,f,alpha,k,N,delta,noise_prior_alpha,noise_prior_beta);
}
The R code to generate the data is given below, note that the stan model to call is just called stan_model.
# data and model params
N=500 # points
dt=0.1 # dwell time
t_max=(N-1)
# time points
j_idx=seq(0,t_max,length.out = N)
time=j_idx
# amplitudes & phase angles
beta1=c(1/sqrt(2),0.5,3)
beta2=c(1/sqrt(2),2,10)
A_1=sqrt(beta1[1]^2+beta2[1]^2) # real amp
A_2=sqrt(beta1[2]^2+beta2[2]^2) # real amp
A_3=sqrt(beta1[3]^2+beta2[3]^2) # real amp
A_vect=c(A_1,A_2,A_3)
phi_1=atan2(-beta2[1],beta1[1]) # phase angle
phi_2=atan2(-beta2[2],beta1[2]) # phase angle
phi_3=atan2(-beta2[3],beta1[3]) # phase angle
# frequcenies and decay rates
omega_vect=c(0.2,0.8,2)
alpha_vect=c(0.001,0.03,0.06)
# simulate data from parameters
k=length(omega_vect) # number of components
y_l=list()
for (i in 1:k){
y_l[i]=list(exp(-alpha_vect[i]*time)*A_vect[i]*cos(omega_vect[i]*time-phi_vect[i]))
}
y_data.frame=do.call("cbind",y_l)
y_sum=rowSums(y_data.frame)
# add noise to generated data
noise=0.01
noise_mean=0
y_noise=y_sum+rnorm(length(y_sum),noise_mean,sqrt(noise))
# priors of hyper params
exp_prior=rep(2,k) # log-decay prior from exp
noise_prior.alpha=10^-5 # shape noise from inv.gam
noise_prior.beta=10^-5 # scale noise from inv.gam
# initial values of all parameters
init_fun=function(){
list(
f= c(0.199,0.800,0.999 ) ,
alpha=c(rep(0.001,k)
)
}
# collect to a list of data input
dat=list(N = length(y_noise),
time=c(t(time)),
k=k,
delta=c(t(10^5)),
alpha_prior=c(t(exp_prior)),
noise_prior_alpha=c(t(noise_prior.alpha)),
noise_prior_beta=c(t(noise_prior.beta)),
y = c(t((y_noise))))
# Run the stan model
fit1=stan(file = 'stan_model.stan',init =init_fun, algorithm = "NUTS", chains = 1, data = dat,iter=5000)