# Maximum of n i.i.d. exponentials

I’ve recently started playing around with Stan and I had this idea of a model that models the maximum of a number of i.i.d. variables. I generated some data where each entry is the max of 100 i.i.d. exponential variables. I then set up the model below. It compiles and the sampling runs ok, but n is stuck at 1.0. Can anyone explain why this doesn’t work?

``````data {
int<lower=0> N;
real<lower=0> y[N];
}
parameters {
real<lower=0> mu;
real<lower=1> n;
}

model {
target += log(n) + (n-1)*exponential_lcdf(y | mu) + exponential_lpdf(y | mu);
}``````

Conceptually, `n` is an integer, so I don’t think Stan is going to be able to get its posterior distribution. For large n, I think `y` has a Gumbel distribution, but by assuming `n` is large, you make it impossible to estimate when `n` is finite.

This is true, but I don’t see why it would trip up Stan. Maximizing the same log likelihood function constructed with SciPy I can recover the parameters, yet Stan’s optimizer and sampler agree that n=1.

I just simulated data in R

``````mu <- 0.5
I <- 75
n <- 100
y = rep(NA, I)
for (i in 1:I) y[i] <- max(rexp(n, mu))
``````

and then wrote the objective function out,

``````f <- function(y, n, mu) {
log(n) +
(n - 1) * sum(pexp(y, mu, log.p=TRUE)) +
sum(dexp(y, mu, log=TRUE))
}
``````

and then trying it on some values, I get that 2 is the best integer value

``````\$ for (n in 1:20) print(sprintf("%3d %6.2f", n, f(y, n, mu)), quote=FALSE)
[1]   1 -436.34
[1]   2 -436.33
[1]   3 -436.60
[1]   4 -436.99
[1]   5 -437.44
[1]   6 -437.94
[1]   7 -438.46
[1]   8 -439.01
[1]   9 -439.57
[1]  10 -440.14
[1]  11 -440.72
[1]  12 -441.31
[1]  13 -441.91
[1]  14 -442.51
[1]  15 -443.12
[1]  16 -443.73
[1]  17 -444.35
[1]  18 -444.97
[1]  19 -445.59
[1]  20 -446.22
``````

@Bob_Carpenter Thanks for doing this and helping a novice. When I saw you write out the objective function I realized my embarrassing mistake. The objective function should naturally be

``````target += N*log(n) + (n-1)*exponential_lcdf(y | mu) + exponential_lpdf(y | mu);
``````

I forgot to multiply `log(n)` with the number of data points.