Gaussian MLE with constrained region


I am having some trouble fitting a simple Gaussian model

X \sim \mathcal{N}(\mu,\,\sigma^{2})

with known mean \mu but unknown \sigma but subject to the constraint that

P(x_1 \leq X \leq x_2) = p

for some known values of x_1, x_2 and p.

In reality I am also estimating the parameters \mu, x_1, x_2 and p from data inside the model but that isn’t important.

A minimal example of my workings so far for finding \sigma subject to the above constraints would be:

data {
  real p;
  real mu;
  real x1;
  real x2;

parameters {
  real<lower=0> sigma;

transformed parameters {
  real<lower = 0, upper = 1> q = normal_cdf(x2| mu, sigma) - normal_cdf(x1| mu, sigma);
  real squared_diff = square(q-p);

model {
  sigma ~ exponential(1/mu);
  target += -squared_diff;

with some example input data:

model_dat <- list(p = 0.03, mu = 6.3, x1 = -0.5, x2 = 0.5)

The model finishes successfully but does not find a valid solution.

I know a solution (\sigma = 11.41777) exists for these inputs because it can be found via optimisation.

ssd <- function(s, p = 0.03, mu = 6.3, x1 = -0.5, x2 = 0.5){
  s <- abs(s)
  q <- pnorm(x2, mu, s) - pnorm(x1, mu, s)

fit <- stats::optimize(ssd, interval = c(0, 20))
sigma <- fit$minimum

It feels like I should be able to find \sigma subject to my constraint via maximum likelihood inside Stan but I can’t seem to pin it down.

Can somebody please point me in the right direction?

Many thanks in advance,

1 Like

I haven’t troubleshot this locally, but one possibility is this: sigma gets initialized on the unconstrained scale between -2 and 2. If unconstrained sigma initializes at -2, then the constrained sigma initializes at e^{-2} = 0.135. If this happens, it will be sufficient to cause normal_cdf(-0.5 | 6.3, sigma) to underflow, and I’m not sure how optimization would behave. See 18.1 Normal distribution | Stan Functions Reference