# Model comparison with Maximum-Likelihood fits

I’m interested in using rstan to do maximum-likelihood estimation on some models, and using AIC to compare them to simpler models, such as those fit with R’s `glm`. I have a few interrelated questions. To make things concrete, I’ve fit the following simplified example.

```stan
data {
  int<lower=1> N;
  vector[N] x;
}

parameters {
  real mu; // Mean
}

model {
  x ~ normal(mu, 1);
}

generated quantities {
  real ll;
  ll = normal_lpdf(x | mu, 1);
}
```
```r
library(rstan)

x = rnorm(100, 10, 1)

mu.hat = mean(x)
sum(dnorm(x, mean=mu.hat, sd=1, log=TRUE)) # Explicit log-likelihood
# -137.9223

stanm = stan_model('gauss_mean_known_var.stan') # Compile the Stan model
stan_opt = optimizing(stanm, data=list(N=100, x=x))
stan_opt

# $par
#          mu          ll
#    9.975341 -137.922315
#
# $value
# [1] -46.02846
#
# $return_code
# [1] 0
#
# $theta_tilde
#            mu        ll
# [1,] 9.975341 -137.9223
```
1. In the absence of explicit prior statements (e.g. `mu ~ normal(0, 10)`), does `rstan::optimizing` find the maximum-likelihood solution, or are implicit priors imposed here? According to the docs, “the mode is calculated without the Jacobian adjustment for constrained variables, which shifts the mode due to the change of variables. Thus modes correspond to modes of the model as written”, but I’m not sure if that means the same thing.
2. Is the explicitly-calculated log-likelihood (`ll` in my model) directly comparable to that in other packages, such as base R? It matches exactly in this case, but does it in general?
3. What is the relationship between the `value` output (corresponding to `lp__` when using `sampling`) and the log-likelihood? Is it just log-likelihood plus some constant due to the priors?
4. In this context, would I obtain different values if I replaced the sampling statement `x ~ normal(mu, 1);` with the explicit increment `target += normal_lpdf(x | mu, 1);`?

Thanks.

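1. Yes. With no prior statements in the model block, `optimizing` finds the maximum-likelihood solution.
2. Yes, the log-likelihood is comparable, but see 4.
3. The log probability is the log-likelihood plus the log priors. A `~` statement also drops constant terms that do not depend on the parameters, which is why `value` (-46.03) differs from `ll` (-137.92) here even though the model has no priors.
4. Yes. For fit values comparable to (most) other software, you need to use `target +=`, which keeps the constant terms that `~` drops.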
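
The gap between `value` and `ll` can be checked in base R without running Stan. What follows is a minimal sketch, assuming the same known-variance normal model as in the question. The seed and the names `ll_full`, `ll_dropped`, `const`, `aic_known_sd`, `sd_ml`, and `ll_glm` are illustrative and not part of the original post. It also assumes the comparison model is a Gaussian `glm` with an intercept only, which estimates the variance as well as the mean.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
N <- 100
x <- rnorm(N, mean = 10, sd = 1)
mu_hat <- mean(x)  # ML estimate of mu when sd is fixed at 1

# Full log-likelihood, constants included. This is what normal_lpdf
# returns in generated quantities.
ll_full <- sum(dnorm(x, mean = mu_hat, sd = 1, log = TRUE))

# Same log density with the constant -N/2 * log(2 * pi) dropped,
# as the `~` statement does for this model.
ll_dropped <- -0.5 * sum((x - mu_hat)^2)

# The gap is a constant that does not depend on mu.
const <- 0.5 * N * log(2 * pi)
stopifnot(isTRUE(all.equal(ll_full, ll_dropped - const)))

# AIC for the known-variance model: one free parameter (mu).
aic_known_sd <- -2 * ll_full + 2 * 1

# glm's gaussian family also estimates the variance (by ML, as RSS / N)
# and counts it as a parameter, so its AIC uses two parameters.
fit <- glm(x ~ 1, family = gaussian())
sd_ml <- sqrt(mean((x - mu_hat)^2))
ll_glm <- sum(dnorm(x, mean = mu_hat, sd = sd_ml, log = TRUE))
stopifnot(isTRUE(all.equal(as.numeric(logLik(fit)), ll_glm)))
stopifnot(isTRUE(all.equal(AIC(fit), -2 * ll_glm + 2 * 2)))
```

For N = 100 the constant is about 91.89, and -137.92 + 91.89 = -46.03, which matches the `value` reported in the question. The AIC comparison is only meaningful when both log-likelihoods include their constants and the parameter counts reflect what each model actually estimates.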