# Custom Likelihood for Fitting Markov Model

Hi

I am trying to fit a multistate model using Bayesian methods.

Most of the data are for exactly observed transition times. The likelihoods for various observation schemes are on page 9 of https://cran.r-project.org/web/packages/msm/vignettes/msm-manual.pdf.

I have coded some of these likelihoods into the function MarkovLH.

Am I properly using these likelihoods in the code below with the target += line?

```stan
functions {
  real MarkovLH(int LHtype, real t1, real t2, real t3,
                real lam1, real lam2, real lam3) {
    real L;
    real logL;

    if (LHtype == 1) {
      L = (exp((lam1 + lam2) * t1) * lam1) * (exp(lam3 * t2) * lam3);
    } else if (LHtype == 2) {
      L = exp(lam3 * t1) * lam3;
    }

    logL = log(L);
    return logL;
  }
}
data {
  int<lower=0> N;
  int<lower=0> LHtype[N];  // one observation scheme per subject
  vector[N] grp;
  vector[N] t1;
  vector[N] t2;
  vector[N] t3;
}
parameters {
  real<lower=0> lambda[3];
  real bet[3];
}

model {
  for (i in 1:N) {
    target += MarkovLH(LHtype[i], t1[i], t2[i], t3[i],
                       lambda[1] * exp(bet[1] * grp[i]),
                       lambda[2] * exp(bet[2] * grp[i]),
                       lambda[3] * exp(bet[3] * grp[i]));
  }

  lambda ~ gamma(0.001, 0.001);
  bet ~ normal(0, 100);
}
```

I think that syntax is legal, but you should not compute a log-likelihood by computing the likelihood and then taking its logarithm. Simplify the log-likelihood analytically, and then implement the simplified form, as in

```stan
functions {
  real MarkovLH(int LHtype, real t1, real t2, real t3,
                real lam1, real lam2, real lam3) {
    return LHtype == 1
      ? (lam1 + lam2) * t1 + log(lam1) + lam3 * t2 + log(lam3)
      : lam3 * t1 + log(lam3);
  }
}
```
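As a quick numerical sanity check (a Python sketch with made-up rate values, mirroring the `LHtype == 1` branch), the analytically simplified expression agrees with the likelihood-then-log route for moderate inputs:

```python
import math

def loglik_via_lik(t1, t2, lam1, lam2, lam3):
    # original approach: build the likelihood, then take its log
    L = (math.exp((lam1 + lam2) * t1) * lam1) * (math.exp(lam3 * t2) * lam3)
    return math.log(L)

def loglik_direct(t1, t2, lam1, lam2, lam3):
    # analytically simplified log-likelihood
    return (lam1 + lam2) * t1 + math.log(lam1) + lam3 * t2 + math.log(lam3)

a = loglik_via_lik(1.0, 2.0, 0.5, 0.3, 0.2)
b = loglik_direct(1.0, 2.0, 0.5, 0.3, 0.2)
print(abs(a - b) < 1e-12)  # True: identical up to rounding
```

The two only diverge when the intermediate likelihood leaves the range of a double, which is exactly the failure mode discussed below.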

Also, don't ever use a gamma(0.001, 0.001) prior, and even the normal(0, 100) is dubious.

Thank you.

Does Stan not work properly if run with the logs the way I had it at first? Or is it just inefficient?

Liable to overflow
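To illustrate (a Python sketch with arbitrary rates): once `(lam1 + lam2) * t1` gets large, the exponential exceeds the largest representable double, so the likelihood-then-log route blows up while the analytic log-likelihood stays finite:

```python
import math

lam1, lam2, lam3, t1, t2 = 0.5, 0.3, 0.2, 2500.0, 1.0

# naive: build the likelihood first, then log it
try:
    L = math.exp((lam1 + lam2) * t1) * lam1 * math.exp(lam3 * t2) * lam3
    naive = math.log(L)
except OverflowError:
    naive = float("inf")  # exp(2000.0) exceeds the largest double (~1.8e308)

# simplified: stay on the log scale throughout
direct = (lam1 + lam2) * t1 + math.log(lam1) + lam3 * t2 + math.log(lam3)

print(naive, direct)  # inf vs. an ordinary finite number
```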

Thank you. I got everything to work with the code below, and it agrees with the MLE from the msm package (https://cran.r-project.org/web/packages/msm/index.html). I simplified the pdf.

I am using the priors recommended in (https://www.ncbi.nlm.nih.gov/pubmed/20963750). They recommend normal(0, 10). Can you recommend something less suspect?

```stan
functions {
  real MarkovKnownlpdf(real t, real[] lam, real censind, int state, int trans) {
    real lamstay[3];
    real LL;
    lamstay[1] = -(lam[1] + lam[2]);
    lamstay[2] = -lam[3];
    LL = lamstay[state] * t + log(lam[trans]) * censind;
    return LL;
  }
  real MarkovUnknownlpdf(real t, real[] lam) {
    real LL;
    LL = -(lam[1] + lam[2]) * t
         + log(lam[2] + ((-1 + exp((lam[1] + lam[2] - lam[3]) * t)) * lam[1] * lam[3])
                        / (lam[1] + lam[2] - lam[3]));
    return LL;
  }
}
data {
  int<lower=0> Nknown;
  int<lower=0> Nunknown;
  vector[Nknown] tknown;
  vector[Nunknown] tunknown;
  int<lower=0> stateknown[Nknown];
  int<lower=0> transknown[Nknown];
  vector[Nknown] censindknown;
  vector[Nknown] grpknown;
  vector[Nunknown] grpunknown;
}
parameters {
  real llam[3];
  real bet[3];
}
transformed parameters {
  real lam[3];
  for (k in 1:3) {
    lam[k] = exp(llam[k]);
  }
}
model {
  real tempknown[3];
  real tempunknown[3];
  for (i in 1:Nknown) {
    tempknown[1] = lam[1] * exp(bet[1] * grpknown[i]);
    tempknown[2] = lam[2] * exp(bet[2] * grpknown[i]);
    tempknown[3] = lam[3] * exp(bet[3] * grpknown[i]);
    target += MarkovKnownlpdf(tknown[i], tempknown, censindknown[i],
                              stateknown[i], transknown[i]);
  }
  for (j in 1:Nunknown) {
    tempunknown[1] = lam[1] * exp(bet[1] * grpunknown[j]);
    tempunknown[2] = lam[2] * exp(bet[2] * grpunknown[j]);
    tempunknown[3] = lam[3] * exp(bet[3] * grpunknown[j]);
    target += MarkovUnknownlpdf(tunknown[j], tempunknown);
  }
  llam ~ normal(0, 10);
  bet ~ normal(0, 10);
}
```

The `_lpdf` notation does not change the draws (aside from some numerical instability). You have to change the model to make it more amenable to NUTS estimation.

The optimization agrees with the MLE? Because usually Bayesian posterior means aren't exactly the same as the MLE.

So that doesn't sound like an MLE; penalized MLE, maybe?

We recommend priors that are at least weakly informative in specifying the order of magnitude of the posterior and its tail behavior. Why is `normal(0, 10)` suspect here?

```stan
real LL;
LL = -(lam[1] + lam[2]) * t
     + log(lam[2] + ((-1 + exp((lam[1] + lam[2] - lam[3]) * t)) * lam[1] * lam[3])
                    / (lam[1] + lam[2] - lam[3]));
return LL;
```

can just be a one-liner with the return.

You probably want to be using log-sum-exp if you're combining logs and exponents.
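For reference, the log-sum-exp trick factors the largest exponent out before exponentiating, so nothing overflows. Stan has a built-in `log_sum_exp`; a Python sketch of the idea:

```python
import math

def log_sum_exp(a, b):
    # stable log(exp(a) + exp(b)): shift by the larger exponent first,
    # so the remaining exponentials are <= 1 and cannot overflow
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

# naive math.log(math.exp(1000) + math.exp(999)) would overflow,
# but the shifted form is fine:
stable = log_sum_exp(1000.0, 999.0)
print(stable)  # ≈ 1000.3133, i.e. 1000 + log(1 + e^-1)
```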

```stan
real lam[3];
for (k in 1:3) {
  lam[k] = exp(llam[k]);
}
```

can be coded as

```stan
real lam[3] = exp(llam);
```

But you probably donâ€™t want to be moving off the log scale so soon.

Indentation would help, too.

The variable `lamstay[3]` is declared with three elements, but only two are ever defined, and only one is ever used (the one corresponding to `state`).

A lot of the rest can be done with matrix operations and more vectorization, like all those tempknown definitions, e.g.

```stan
tempknown = lam .* exp(bet * grpknown[i]);
```

(with `lam`, `bet`, and `tempknown` declared as vectors so that the elementwise `.*` applies).

But again, I wouldn't be jumping off the log scale unless necessary here. Some of these things you're adding and then exponentiating and then logging again. For arithmetic stability, quantities need to stay on the log scale as long as possible.

Expressions like `lam[1] + lam[2] - lam[3]` should be computed once and stored in a local variable rather than recomputed several times within the function.
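For instance (a Python sketch of `MarkovUnknownlpdf` with hypothetical inputs), hoisting the shared term into a local gives the same value while evaluating it once:

```python
import math

def markov_unknown_lpdf(t, lam1, lam2, lam3):
    # hoist the repeated subexpression into a local
    d = lam1 + lam2 - lam3
    return -(lam1 + lam2) * t + math.log(
        lam2 + ((math.exp(d * t) - 1.0) * lam1 * lam3) / d
    )

def markov_unknown_lpdf_inline(t, lam1, lam2, lam3):
    # inline form, as in the original Stan code: the same
    # subexpression is evaluated twice
    return -(lam1 + lam2) * t + math.log(
        lam2
        + ((-1 + math.exp((lam1 + lam2 - lam3) * t)) * lam1 * lam3)
        / (lam1 + lam2 - lam3)
    )

x = markov_unknown_lpdf(1.5, 0.5, 0.3, 0.2)
y = markov_unknown_lpdf_inline(1.5, 0.5, 0.3, 0.2)
print(abs(x - y) < 1e-12)  # True: same value, less redundant work
```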