Inhomogeneous Poisson Point Process

tleahy · September 13, 2017, 11:15am

What I have: points (t) in space and time where something happens (event) denoted 1 with covariate values (X, spatial covariates). I have also sampled the space and I have locations where there are no events denoted 0.

Intensity for the process: lambda(t) = exp(X(t)^{T} beta), where beta is the vector of parameters.

I want to fit an inhomogeneous Poisson Point Process model with the intensity function as above. I have a vector of 1’s and 0’s along with the corresponding covariate values.
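For concreteness, counts are not strictly needed here: for a Poisson point process observed over a region (or space-time window) A, the log-likelihood is the sum of log lambda at the event locations minus the integral of lambda over A. Below is a minimal Python sketch of that quantity, assuming the randomly sampled "0" locations are uniform over A and can be used as Monte Carlo quadrature points for the integral. The function name, the `area` argument, and the toy numbers are illustrative and not from the thread.

```python
import math

def ipp_loglik(beta, X_events, X_background, area):
    """Approximate log-likelihood of an inhomogeneous Poisson process
    with intensity lambda(s) = exp(x(s)^T beta).

    X_events:     covariate rows at the observed events (the 1's)
    X_background: covariate rows at uniformly sampled locations (the 0's),
                  used only to approximate the integral of lambda (assumption)
    area:         size of the study region
    """
    def log_lambda(x):
        return sum(b * xi for b, xi in zip(beta, x))

    # Sum of log-intensities at the events.
    event_term = sum(log_lambda(x) for x in X_events)
    # Monte Carlo estimate of the mean intensity over the region.
    mean_intensity = sum(math.exp(log_lambda(x)) for x in X_background) / len(X_background)
    return event_term - area * mean_intensity

# Toy check: with an intercept-only model the integral estimate is exact.
ll = ipp_loglik([0.5], [[1.0], [1.0], [1.0]], [[1.0], [1.0]], area=2.0)
```

The accuracy of the integral term depends on how many background points are used and how they were sampled, so this is a starting point rather than a recommended estimator.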

Is it possible to model this type of model in Stan? I have no idea how to go about it, since I don’t have “counts” per se.

Thank you in advance.

sakrejda · September 13, 2017, 1:02pm

Can you be clearer about what data you have? Is this absence-only data?

tleahy · September 13, 2017, 1:05pm

It is presence-only data; however, I can also artificially generate absence data by randomly sampling the space and finding the covariates at those locations.

sakrejda · September 13, 2017, 1:32pm

This type of data, at least in ecology, is so biased that you might as well show a 2d point plot and save everyone some time. Now that I got that off my chest:

In the context of a Poisson process, presence-only data contributes Pr(X > 0) for each observation. You can code it in Stan as log1m_exp(poisson_lpmf(0 | lambda)), which is the log of 1 - Pr(X = 0), and add the ihpp part as usual.
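As a small numerical illustration of that presence term: for X ~ Poisson(lambda), Pr(X > 0) = 1 - exp(-lambda), and a sampler needs its logarithm. The Python sketch below (the function name is made up for this example) computes it with `expm1`, so that very small intensities do not round to log(0).

```python
import math

def log_prob_presence(lam):
    """log Pr(X > 0) for X ~ Poisson(lam), i.e. log(1 - exp(-lam)).

    Uses expm1 so that small intensities do not round 1 - exp(-lam) to zero.
    """
    return math.log(-math.expm1(-lam))

# Each presence-only observation would add this term to the log-likelihood.
term = log_prob_presence(1.0)
```

For small lambda the term behaves like log(lambda), and for large lambda it approaches 0.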

It’s usually a terrible idea unless you have strong priors, or strong outside data, or a strong mechanism to build on. Stan is fantastic for building or evaluating a footgun so good luck!

tleahy · September 13, 2017, 1:36pm

For reference:

Thank you for the advice much appreciated.

sakrejda · September 13, 2017, 1:54pm

That’s a very kind response to advice you didn’t ask for! With ocean-sampling data you sometimes can construct a reasonable proxy for effort as a function of space based on external data (catch per unit effort and logs of effort by spatial location) and you could reasonably use that to resample for artificial absences.
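A rough sketch of what effort-based resampling could look like, in Python: pseudo-absence locations are drawn with probability proportional to an effort proxy instead of uniformly. The grid cells, effort values, and function name are hypothetical; building a defensible effort surface from catch-per-unit-effort data and logs is the hard part and is not shown.

```python
import random

def sample_pseudo_absences(cells, effort, n, seed=None):
    """Draw n pseudo-absence cells (with replacement) with probability
    proportional to the effort proxy for each cell."""
    rng = random.Random(seed)
    return rng.choices(cells, weights=effort, k=n)

cells = [(0, 0), (0, 1), (1, 0), (1, 1)]   # hypothetical grid cells
effort = [5.0, 0.0, 1.0, 4.0]              # hypothetical effort proxy per cell
absences = sample_pseudo_absences(cells, effort, n=100, seed=1)
```

Cells with zero recorded effort are never drawn, which is the point: an absence is only informative where someone actually looked.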

I’m still doubting whether you could do better than a plot in terms of extracting covariates or anything like that :)

tleahy · September 13, 2017, 1:57pm

ok thank you

anon75146577 · September 13, 2017, 5:25pm

I do not agree with this statement.

Now that I’ve got that off my chest: Specifically for this type of data, I personally wouldn’t use Stan. (It’s not flexible enough to efficiently fit the types of models you should use for this sort of data). I would use this software and recommend reading this paper, which talks explicitly about ways of taking the sampling design into account. (The software implements that model and more. See the doc).

tleahy · September 13, 2017, 5:42pm

Thanks Daniel, very useful input. On a quick scan of the paper it looks like something in the right direction.

sakrejda · September 13, 2017, 6:30pm

Can you share a paper that you think represents a particularly useful application?

anon75146577 · September 13, 2017, 6:39pm

I did.

sakrejda · September 13, 2017, 6:41pm

Ok

sakrejda · September 13, 2017, 6:46pm

Transects are hardly “presence-only” data. They give you a direct measurement of how much effort was expended. I think we’re talking past each other here. I’m familiar with the literature where people try to use positive observations only when nobody bothered to record where sampling effort was expended or how heterogeneous it was. Transects I’m fine with, ditto with designs that record how much effort was expended in other ways.

anon75146577 · September 13, 2017, 6:48pm

Even people who work with MaxEnt typically do more than that. They just use the distribution of “pseudo-absences” as an “alternative” to explicitly considering sample design.

sakrejda · September 13, 2017, 7:16pm

It would be fascinating to see data on what people do. Both at ESA and informally I saw a lot of people treating presence-only-data methods as some sort of magic much like the early response to state-space models with unobserved states.

Sure, and in my experience they generally treat the generation of pseudo-absences as a nuisance and don’t give it much attention. The results are probably pretty insensitive to it if you’re looking at abundance estimates or something like that, but often what ecologists want are effects of spatio-temporally varying covariates.

Bob_Carpenter · September 17, 2017, 5:19pm

What would you need in Stan to make you change your mind? Is it low-level control over sparsity or high-level interfaces to map/data stuff or both or something else? I can bug you about this in person tomorrow!

anon75146577 · September 17, 2017, 8:03pm

Quite a lot of things! This is one of my goal models.

Things that aren’t currently in Stan:

sparse precision parameterisation of the normal (or FFT normal)
1D/2D numerical integration (for evaluating the likelihood)
non-diagonal adaptation (decorrelating based on the prior + diagonal adaptation won’t be good enough)
ideally a version of RMHMC that doesn’t compute functions of dense matrices, which doesn’t scale [realistic versions of both of these probably need a lot more user input into the allowable adaptations of the mass matrix]
to do this on large problems, the linear algebra will probably need to be parallelisable

Even with all of that, the INLABru package is specifically designed for these models, so it should be much easier to use than trying to make Stan do it (especially things like the integrations).
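To illustrate why numerical integration appears on that list: the point-process likelihood subtracts the integral of the intensity over the study region, and that integral rarely has a closed form once covariates or a random field enter. The Python sketch below uses a plain midpoint rule on a rectangle with a hypothetical log-linear intensity. It is not how any particular package does it, and the names and coefficients are made up for the example.

```python
import math

def integrate_intensity_2d(log_lambda, x_range, y_range, n=200):
    """Midpoint-rule approximation of the integral of exp(log_lambda(x, y))
    over the rectangle x_range x y_range, using an n-by-n grid."""
    (x0, x1), (y0, y1) = x_range, y_range
    hx, hy = (x1 - x0) / n, (y1 - y0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * hx
        for j in range(n):
            y = y0 + (j + 0.5) * hy
            total += math.exp(log_lambda(x, y))
    return total * hx * hy

# Hypothetical intensity whose single covariate is the x coordinate,
# chosen so the integral has a closed form to compare against.
b0, b1 = 0.2, 1.5
approx = integrate_intensity_2d(lambda x, y: b0 + b1 * x, (0.0, 1.0), (0.0, 1.0))
exact = math.exp(b0) * (math.exp(b1) - 1.0) / b1
```

In a sampler this integral has to be re-evaluated, and differentiated, at every parameter value visited, which is part of why built-in support matters.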

bbbales2 · September 17, 2017, 9:57pm

Even with all of that, the INLABru package is specifically designed for these models

I’m surprised that your list doesn’t have more of a Laplace approximation feel to it? Especially if something called INLABru does so well with these models?

Is the Laplace approximation hidden in one of these things, or are you talking about solving these problems another way?

Bob_Carpenter · September 17, 2017, 10:00pm

Those are the components INLA is using that we’d need in order to have a chance at fitting these with ( R )HMC.

NONMEM is also using block diagonal mass matrices—don’t know how well it’s working for them, but it’s an alternative to dense or generically sparse.

We have non-diagonal adaptation, but it’s both hard to fit and then even more brittle to varying posterior curvature.

[edit: Really, Discourse? I meant “( R )HMC” without the spaces, but no idea how to turn off the markdown expander that turns “(R)” into ® without changing fonts.]

anon75146577 · September 17, 2017, 10:02pm

How or where or why to implement Laplace approximations in Stan is a different question.

We’ve had some back and forth about it, but I don’t think it’s landed anywhere yet (except for an acknowledgement that they work well for some models and poorly for others, so they are not like the MCMC component of Stan but more like the ADVI component [although we know a lot more about their limitations]).

The roadmap to this (wherever it leads) needs to go through sparse Cholesky factorisations and exact marginalisation of the latent Gaussian component in lmer-type functions. Then it’s to get GMO to work properly. Then it’s to see how Laplace approximations play with Stan.
