The problem in the hessian matrix in optimizing function

bistoon · November 2, 2017, 1:21pm

Dear all,

I think the optimizing function in Stan is faster and better than the optim function in R, but the Hessian matrices they return (and therefore the standard deviations (sd) of the MLEs) are different.
If X ~ N(mu, sigma), the MLEs and their sds are available in closed form. Running the attached file shows that optim and sampling (in Stan) give good sds, but the sd from optimizing is unacceptable (only for the sigma parameter).
Thank you for checking the file and pointing out my mistake.
Best,
-Bistoon

Andre_Pfeuffer · November 2, 2017, 2:30pm

set.seed(100)

n = 2000

x = rnorm(n, 0, 2)
loglike = function(x, par) -sum(log(dnorm(x,par[1],par[2])))

fit1 = optim(par=c(mean(x), sd(x)), loglike, x=x , hessian = TRUE)
fit1
$par
[1] 0.02095186 2.01174852

$value
[1] 4235.386

$counts
function gradient 
      35       NA 

$convergence
[1] 0

$message
NULL

$hessian
         [,1]     [,2]
[1,] 494.1771   0.0000
[2,]   0.0000 987.6151

fit2 = optimizing(mystanmodel, hessian = TRUE, data=list(n=n, x=x), init =  list(mu = mean(x), sigma = sd(x)))
Initial log joint probability = -2397.51
Optimization terminated normally: 
  Convergence detected: relative gradient magnitude is below tolerance
fit2
$par
        mu      sigma 
0.02095186 2.01124552 

$value
[1] -2397.508

$return_code
[1] 0

$hessian
                 mu         sigma
mu    -4.944243e+02  2.247603e-11
sigma  2.247603e-11 -4.000003e+03

I’m not a Stan developer. I don’t see much difference between fit1 and fit2 in the parameter estimates for the mean and sd; the Hessians differ only in the sigma/sigma entry. Could it be that optimizing computes the Hessian in the unconstrained parameter space?

See: "hessian: ‘TRUE’ or ‘FALSE’ (the default): flag indicating whether to
calculate the Hessian (via numeric differentiation of the
gradient function in the unconstrained parameter space)."
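
One way to check this guess without Stan is to rerun optim on the unconstrained parameterization (mu, log(sigma)) and look at the Hessian it reports. The sketch below uses base R only; the function name negloglik_uncon is made up for illustration, and the data are regenerated with the same seed as above. If the guess is right, the [2,2] entry should come out near 2n = 4000 (positive here, because optim minimizes the negative log-likelihood).

```r
set.seed(100)
n <- 2000
x <- rnorm(n, 0, 2)

# Negative log-likelihood in terms of (mu, log_sigma), so both
# parameters are unconstrained (hypothetical helper, not from rstan)
negloglik_uncon <- function(par, x) {
  -sum(dnorm(x, mean = par[1], sd = exp(par[2]), log = TRUE))
}

fit_uncon <- optim(par = c(mean(x), log(sd(x))), fn = negloglik_uncon,
                   x = x, method = "BFGS", hessian = TRUE)

# Back-transform the estimate and inspect the Hessian on the log(sigma) scale
sigma_hat <- exp(fit_uncon$par[2])
print(sigma_hat)
print(fit_uncon$hessian)
```

The Hessian here is still obtained by finite differences, so only the leading digits should be trusted.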

bistoon · November 2, 2017, 4:43pm

sd_in_optimizing_3.r (1.5 KB)

in optim: fit1$hessian[2,2] = 987.6151
in optimizing: fit2$hessian[2,2] = -4000

I changed fit1 as follows and attached new file here:

# optim function in R (stats)

loglike = function(x, par) -sum(log(dnorm(x,par[1],par[2])))
lower = c(-Inf, 0)
upper = c(Inf, Inf)
fit1 = optim(par=c(mean(x), sd(x)), loglike, x=x, hessian = TRUE, method = "L-BFGS-B", lower=lower, upper=upper)

Now I think fit1 and fit2 use a similar parameter space and optimization method. optim seems to give a good result (for the sd), and I want to get the same result from optimizing, but I have not managed to yet.

Bob_Carpenter · November 6, 2017, 12:13am

The Hessian is computed on the unconstrained scale, so for sigma that’d be log(sigma).

The accuracy with finite differences of gradients isn’t great, but it should be in the ballpark to several decimal places.
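
As a rough sketch of how the unconstrained Hessian can be mapped back to standard errors for mu and sigma, the delta method can be applied by hand. The numbers below are copied from the optimizing output earlier in the thread; the variable names are illustrative, and the code assumes sigma is the only constrained parameter, with the usual log transform.

```r
# Values copied from the optimizing() output earlier in the thread
n         <- 2000
sigma_hat <- 2.01124552
H_uncon   <- matrix(c(-4.944243e+02,  2.247603e-11,
                       2.247603e-11, -4.000003e+03),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("mu", "log_sigma"),
                                    c("mu", "log_sigma")))

# Jacobian of (mu, log_sigma) -> (mu, sigma): d sigma / d log_sigma = sigma
J <- diag(c(1, sigma_hat))

# Covariance on the unconstrained scale, then delta method to the natural scale
cov_uncon <- solve(-H_uncon)
cov_nat   <- J %*% cov_uncon %*% t(J)
se_nat    <- sqrt(diag(cov_nat))
names(se_nat) <- c("mu", "sigma")

# Closed-form standard errors of the normal MLE, for comparison
se_exact <- c(mu = sigma_hat / sqrt(n), sigma = sigma_hat / sqrt(2 * n))

print(rbind(delta_method = se_nat, closed_form = se_exact))
```

The two rows should agree closely, which suggests the reported Hessian is consistent once the log(sigma) scale is taken into account.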

Bob_Carpenter · November 6, 2017, 12:15am

Here’s the doc for optimizing option hessian:

hessian: ‘TRUE’ or ‘FALSE’ (the default): flag indicating whether to
          calculate the Hessian (via numeric differentiation of the
          gradient function in the unconstrained parameter space).

Andre_Pfeuffer · November 6, 2017, 3:56am

Bistoon has a point; the Hessian does not look right.

in optim: fit1$hessian[2,2] = 987.6151
in optimizing: fit2$hessian[2,2] = -4000

It is exactly entry [2,2] of the matrix, which is also the last element.
Bob, could it be some for(0 … (n-1)) versus for(1 … n) copying thing?

bgoodri · November 6, 2017, 5:26am

No, it is an unconstrained parameterization thing.
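
A short derivation may make this concrete. It assumes the Stan model (not shown in the thread) has no priors, so the objective is the normal log-likelihood up to a constant; the symbols η = log σ and S are introduced here for illustration only.

```latex
\ell(\mu,\sigma) = -n\log\sigma - \frac{S}{2\sigma^{2}} + \text{const},
\qquad S = \sum_{i=1}^{n}(x_i-\mu)^{2}

% Natural scale: second derivative in sigma, using \hat\sigma^{2} = S/n at the MLE
\frac{\partial^{2}\ell}{\partial\sigma^{2}}
  = \frac{n}{\sigma^{2}} - \frac{3S}{\sigma^{4}}
  \;\Big|_{\text{MLE}} = -\frac{2n}{\hat\sigma^{2}}
  \approx -\frac{4000}{4.045} \approx -989

% Unconstrained scale: substitute \eta = \log\sigma
\ell(\mu,\eta) = -n\eta - \tfrac{1}{2}e^{-2\eta}S + \text{const},
\qquad
\frac{\partial^{2}\ell}{\partial\eta^{2}} = -2e^{-2\eta}S
  \;\Big|_{\text{MLE}} = -2n = -4000
```

So the -4000 from optimizing and the roughly 988 from optim (which has the opposite sign because it minimizes the negative log-likelihood) describe the same curvature on two different scales.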

bistoon · November 6, 2017, 6:29am

I found my answer in this topic (Optimizing function: incorrect estimation of the standard deviation of MLE). This code is useful (thanks, Dr. Goodrich):

fit2 = optimizing(Stan_Normal,hessian=TRUE,data=list(n=n,x=x), init=list(mu=mean(x),sigma=sd(x)),draws=20000)
optimizing_mle = fit2$par;
optimizing_sd_mle = sqrt(diag(cov(fit2$theta_tilde)))
optimizing_sd_mle

-Bistoon
