I don’t have much experience with Stan, so this question might be trivial instead of rather complicated, but I unfortunately couldn’t figure out what to do.
The model I want to specify has the following ingredients:
- Several years (y = 1, 2, \ldots, Y) of daily (d = 1, 2, \ldots, 365) data. (That is, for simplicity, we assume each year has 365 days.)
- The outcome at day t (t = 1, 2, \ldots, 365\cdot Y) is Poisson distributed. Let’s call y\left[t\right] the year and d\left[t\right] the day of t (i.e., t = 365 \cdot y\left[t\right] + d\left[t\right]-365).
- The parameter of the Poisson distribution at day t is the following: \alpha + \beta t + \sum_{y=1}^Y I_{y\left[t\right]} \cdot \gamma_{y\left[t\right]} \cdot f_{\mu_{y\left[t\right]}, \sigma_{y\left[t\right]}^2}\left(t\right), where f_{\mu, \sigma^2} is the normal density with expected value \mu and variance \sigma^2.
- Parameter I_y has an iid Bernoulli distribution with parameter (or should I call it hyperparameter?) p in each year, and \gamma_y is iid uniformly distributed from \gamma_{min} to \gamma_{max} (also (hyper)parameters) in each year.
In other words: we have a linear trend throughout the entire period, and in each year, there is possibly – but not necessarily – an additional peak (with the constant probability p, and independent in each year), the shape of which is the normal density centered at \mu which is uniform from 1 to 365 in that year (there are no parameters governing \mu), and a constant \sigma^2 that is a parameter to be estimated. (So f here doesn’t mean a normal random variable, but rather the density.) With \alpha>0 and \beta>0 the positivity of the Poisson’s parameter is ensured.
One could generate such data with the following code:
library(ggplot2)
simdat <- function(alpha = 2, beta = 0.1, gammamin = 10000, gammamax = 50000,
p = 0.4, sigm = 20) {
SimData <- expand.grid(d = 1:365, y = 1:20)
SimData$t <- 365*SimData$y + SimData$d - 365
gamma <- runif(20, gammamin, gammamax)
i <- rbinom (20, 1, p)
mu <- runif(20, 1, 365) + (1:20)*365 - 365
SimData$lambda <- alpha + beta*SimData$t +
rowSums(sapply(1:20, function(y) i[y]*gamma[y]*dnorm(1:(20*365), mu[y], sigm)))
SimData$x <- rpois(20*365, SimData$lambda)
SimData
}
SimData <- simdat()
ggplot(SimData, aes(x = t, y = x)) + geom_line()
ggplot(SimData, aes(x = d, y = x)) + geom_line() + facet_wrap(~y)
The parameters are: \alpha, \beta, p, \gamma_{min}, \gamma_{max} and \sigma^2. These should be estimated from the empirical data. (We could assume any meaningful default prior for them.)
Is it possible with Stan…?
Thank you in advance!