# How to fit a simple poisson process

I’m trying to use stan to fit a simple poisson process model, and, maybe, to simulate data as well. Starting with the fitting.

In base R, I’d simply write something like:

`````` Y<- (cumsum(rpois(102,lambda=5)))
x<-cumsum(c(0,rep(1,100)))
T <- length(Y)
plot(stepfun(x,Y),xlim = c(0,100), do.points = F)
``````

This would return a plot like, with the event at time t on the x axis, and the number of counts Y at each t:

It’s not clear to me how to write the cumulative_sum in stan in such a way that we could either simulate or estimate our lambda parameter. A simple attempt to fit might look like this, provided the data above (this fails):

``````data {
int<lower=0> T; // number of events
int Y[T]; // the observed counts at event time T
}
parameters {
real<lower=-10,upper=10> lambda;
}
model {
// Priors
lambda ~ normal(5, 0.5); // our lambda estimate

for (t in 1:T) {
Y ~ cumulative_sum(poisson(lambda));
}
}
``````

This doesn’t work, however. The cumulative_sum and poisson variables don’t play nice together. I’ve tried setting it up with `target +=` and `poisson_lpmf` instead of `poisson`, but wasn’t able to suss out the correct syntax.

Any advice appreciated. Similar to this question which wasn’t followed-up by the original poster.

The `poisson` function does not generate Poisson distributed random variables. Rather it is sort of an alias for the log-PMF of the Poisson distribution. Also, you need to restrict `lambda` to be positive and probably should not restrict `lambda` to be less than 10. You could do something like

``````data {
int<lower = 0> T;
int<lower = 0> Y[T];
}
parameters {
real<lower = 0> lambda;
}
model {
vector[T] lambda_vec;
lambda_vec = lambda;
for (t in 2:T) lambda_vec[t] = lambda_vec[t - 1] + lambda;
target += normal_lpdf(lambda | 5, 0.5);
target += poisson_lpmf(Y | lambda_vec);
}
``````
2 Likes

Thanks, @bgoodri, that worked really well. Is also really helpful in understanding the lpmf functions a bit better, need to spend some time on that.

After I posted, I also tried fitting the model this way, which seemed to return nearly the same estimates. Not sure if it has any side effects by constituting lambda how it does.

``````    data {
int<lower=0> T; // number of events
int Y[T]; // the observed counts at event time T
int ages; // number of ages
}
parameters {
real <lower=0>lambda; // will estimate lambda
}
model {
// Priors
lambda ~ normal(5, 0.5); // our lambda prior

for (t in 1:T) {
Y ~ poisson(lambda * t);
}
}
``````

That way is fine, but slower because it has to call `poisson_lpmf` T times instead of once, and hence has to do T memory (de)allocations instead of one.

1 Like

``````for (t in 1:T) lambda_vec[t] = t * lambda;
``````

which would reduce the amount of autodiff I think.

1 Like

@bgoodri

Experimenting more I realized neither model is really doing a very good job of estimating lambda.

For the data simulated with a lambda of 3, with your stan code, I’m getting:

``````                mean se_mean   sd     2.5%     25%     50%     75%   97.5%   n_eff Rhat
lambda          2.86    0.00  0.02    2.82    2.85    2.86    2.88    2.91   934    1
``````

For mine, I’m getting a sd of 0.00, same 50%. This is with a cumulative sum of 285 at t=100. It’s not unusual for rpois draws to be between 280 and 320 cumulatively by t=100, but obviously the CI is fairly small here.

I realized my model doesn’t make sense because it doesn’t assume the sums are correlated but instead just takes the poisson for `t*lambda` at any given t. That’s not really a poisson process so much as taking the poisson for `lambda * t`.

In my case, the model is badly overfit. With yours, it seem slightly overfit. I don’t understand well enough what the `poisson_lpfm` function is doing but it seems like what I really need to do is to somehow indicate more strongly to stan that the sums are cumulative poisson draws.

Something like:

``````    data {
int<lower = 0> T;
int<lower = 0> Y[T];
}
parameters {
real<lower = 0> lambda;
}
model {
target += normal_lpdf(lambda | 5, 0.5);
Y ~ poisson(lambda);
for (t in 2:T) {
Y[t] ~ poisson(lambda) + Y[t - 1]; // some way of indicating every t is an additional poisson added to the previous draws
}
}
``````

For larger initial lambda, the confidence intervals are still very small, about +/- 0.03 at two standard deviations, even when the estimates are off by 10% or so.

For example, if I don’t sum the `rpois` function… and just simulate 100 poisson draws, then use stan to estimate, the confidence intervals look fine:

``````              mean se_mean   sd    2.5%     25%     50%     75%   97.5% n_eff Rhat
lambda        2.76    0.01 0.17    2.44    2.65    2.76    2.87    3.11   870 1.01
``````
1 Like

I ended up here, which seems to do the trick. I’d like to just have a way of indicating `cum_sum`… but the diff also works. Thank you, poisson memoryless property…

## Data generation:

``````  Y<- cumsum(rpois(100,lambda=3))
T<- length(Y)
``````

## Stan code

``````data {
int<lower=0> T; // number of events
int Y[T]; // the observed counts at event time T
}
parameters {
real<lower=0> lambda;
}
model {
// diff container
int Yt[T];
// Prior
lambda ~ gamma(5, 1);

Yt = Y;
for (t in 2:T) Yt[t] = Y[t] - Y[t - 1]; // in the absence of cumsum calculate the diff
target += poisson_lpmf(Yt | lambda);
}
``````
1 Like