# Using a Poisson likelihood with spike counts and unequal bin sizes

I’m comparing the firing rates of a population of neurons (i.e., cells) in the brain under two conditions (A and B). For each neuron, we have four data points:

```
- sample_time_A
- spike_count_A
- sample_time_B
- spike_count_B
```

For each neuron in each condition, we counted the number of events (`spike_count`) occurring within a fixed interval (`sample_time`, in seconds). We are interested in determining whether the rate (events per second) differs between the two conditions across the population. The distributions of the rates (`spike_count_A/sample_time_A` in blue, `spike_count_B/sample_time_B` in a second color) are shown below:

*(figure omitted: overlaid histograms of `rate_A` and `rate_B`)*

Now, this seems like a very straightforward paired t-test or Wilcoxon signed-rank test to see if the means of `A` and `B` are different. I can just compute `rate_A` and `rate_B` (i.e., `spike_count/sample_time`) and use a paired t-test or Wilcoxon signed-rank test. Both return significant p-values (p ≤ 0.01), as in the sketch below.
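
For reference, a minimal sketch of those two frequentist tests in Python with SciPy; the data arrays here are hypothetical stand-ins for the real recordings.

```python
import numpy as np
from scipy import stats

# Hypothetical stand-ins for the real data: one entry per neuron.
sample_time_A = np.array([10.0, 12.0, 9.5, 11.0])
spike_count_A = np.array([42, 55, 31, 47])
sample_time_B = np.array([10.0, 11.5, 10.0, 12.0])
spike_count_B = np.array([58, 70, 45, 66])

# Convert counts to rates (events per second).
rate_A = spike_count_A / sample_time_A
rate_B = spike_count_B / sample_time_B

# Paired t-test on the per-neuron rates.
t_stat, t_p = stats.ttest_rel(rate_A, rate_B)

# Wilcoxon signed-rank test on the paired differences.
w_stat, w_p = stats.wilcoxon(rate_A, rate_B)

print(f"paired t-test: p = {t_p:.4f}")
print(f"Wilcoxon signed-rank: p = {w_p:.4f}")
```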

However, I thought it would be fun to test my rudimentary knowledge of Bayesian statistics. Since `rate_A` and `rate_B` can be modeled with gamma distributions and the observed variable (`spike_count`) can be modeled as Poisson, why not try it out? However, I am getting a result that tells me the difference in rate between the two conditions is not significant (i.e., the 90% credible interval contains 0), as shown by the posterior for the difference in the means of `rate_A` and `rate_B`.

I calculated the posterior for the mean of `A` by dividing the posterior of `sr_alpha_A` by the posterior of `sr_beta_A` (since the expected value of a gamma distribution is alpha divided by beta). For `B`, the calculation was `(sr_alpha_A + sr_alpha_delta_B) / (sr_beta_A + sr_beta_delta_B)`. The posterior for the difference in the means of `A` and `B` is the difference between those two posteriors, computed draw by draw (see the sketch below).
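
Done on the posterior samples, that calculation looks roughly like this; `draws` is a hypothetical dictionary of posterior draws standing in for whatever the Stan interface returns.

```python
import numpy as np

# Hypothetical posterior draws, one value per MCMC sample.
# In practice these would be extracted from the fitted Stan model.
draws = {
    "sr_alpha_A":       np.random.gamma(2.0, 1.0, size=4000),
    "sr_beta_A":        np.random.gamma(2.0, 1.0, size=4000),
    "sr_alpha_delta_B": np.random.normal(0, 0.01, size=4000),
    "sr_beta_delta_B":  np.random.normal(0, 0.01, size=4000),
}

# Posterior of the population mean rate under each condition
# (mean of a gamma distribution is alpha / beta).
mean_A = draws["sr_alpha_A"] / draws["sr_beta_A"]
mean_B = (draws["sr_alpha_A"] + draws["sr_alpha_delta_B"]) / \
         (draws["sr_beta_A"] + draws["sr_beta_delta_B"])

# Posterior of the difference, and its 90% credible interval.
mean_diff = mean_B - mean_A
lo, hi = np.percentile(mean_diff, [5, 95])
print(f"90% credible interval for the difference: [{lo:.3f}, {hi:.3f}]")
```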

The lack of significance is fine with me. If it’s not significant, it’s important to report that result in the paper. However, I thought the paired t-test had pretty weak assumptions about normality, and isn’t the Wilcoxon supposed to be non-parametric? So shouldn’t the Bayesian and frequentist approaches agree? This makes me wonder if I set up my model wrong. Here it is:

```
data {
  int<lower=1> n_cells;
  real<lower=0> sample_time_A[n_cells];
  real<lower=0> sample_time_B[n_cells];
  int<lower=0> spike_count_A[n_cells];
  int<lower=0> spike_count_B[n_cells];
}

parameters {
  // Population-level gamma parameters for condition A,
  // plus additive offsets for condition B.
  real<lower=0> sr_alpha_A;
  real<lower=0> sr_beta_A;
  real sr_alpha_delta_B;
  real sr_beta_delta_B;

  // Per-cell firing rates in each condition.
  real<lower=0> sr_cell_A[n_cells];
  real<lower=0> sr_cell_B[n_cells];
}

model {
  sr_alpha_A ~ gamma(0.5, 0.1);
  sr_beta_A ~ gamma(0.1, 0.1);
  sr_alpha_delta_B ~ normal(0, 0.01);
  sr_beta_delta_B ~ normal(0, 0.01);

  sr_cell_A ~ gamma(sr_alpha_A, sr_beta_A);
  sr_cell_B ~ gamma(sr_alpha_A + sr_alpha_delta_B, sr_beta_A + sr_beta_delta_B);

  // Counts are Poisson with expected count = rate * exposure time.
  for (i in 1:n_cells) {
    spike_count_A[i] ~ poisson(sr_cell_A[i] * sample_time_A[i]);
    spike_count_B[i] ~ poisson(sr_cell_B[i] * sample_time_B[i]);
  }
}
```

I would really appreciate any pointers!

So, I just realized that the Bayesian model is not properly accounting for the fact that the observations are paired. Any advice on how I should do this? Most of the examples I’ve seen are for normal distributions, so it’s a bit unclear how I might do this when dealing with gamma distributions.

Problem solved. It was a matter of modeling the pairing between samples explicitly. New model below:

```
data {
  int<lower=1> n_cells;
  real<lower=0> sample_time_A[n_cells];
  real<lower=0> sample_time_B[n_cells];
  int<lower=0> spike_count_A[n_cells];
  int<lower=0> spike_count_B[n_cells];
}

parameters {
  // Population-level gamma parameters and per-cell rates in condition A.
  real<lower=0> rate_alpha;
  real<lower=0> rate_beta;
  real<lower=0> rate_cell[n_cells];

  // Per-cell multiplicative change from A to B, with a population mean and sd.
  real<lower=0> rate_ratio_mean;
  real<lower=0> rate_ratio_sd;
  real<lower=0> rate_ratio_cell[n_cells];
}

transformed parameters {
  real rate_A;
  real rate_B;
  real rate_change;
  real rate_cell_A[n_cells];
  real rate_cell_B[n_cells];
  real rate_cell_change[n_cells];

  rate_A = rate_alpha / rate_beta;
  rate_B = rate_A * rate_ratio_mean;
  rate_change = rate_B - rate_A;

  rate_cell_A = rate_cell;
  for (i in 1:n_cells) {
    rate_cell_B[i] = rate_cell[i] * rate_ratio_cell[i];
    rate_cell_change[i] = rate_cell_B[i] - rate_cell_A[i];
  }
}

model {
  rate_alpha ~ gamma(0.5, 0.1);
  rate_beta ~ gamma(0.1, 0.1);

  rate_ratio_mean ~ normal(1, 1)T[0, ];
  rate_ratio_sd ~ normal(0, 1)T[0, ];

  rate_cell ~ gamma(rate_alpha, rate_beta);
  rate_ratio_cell ~ normal(rate_ratio_mean, rate_ratio_sd);

  // Each cell's B rate is its own A rate times its own ratio,
  // so the pairing is built into the likelihood.
  for (i in 1:n_cells) {
    spike_count_A[i] ~ poisson(rate_cell[i] * sample_time_A[i]);
    spike_count_B[i] ~ poisson((rate_cell[i] * rate_ratio_cell[i]) * sample_time_B[i]);
  }
}
```
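
For anyone following along, a minimal sketch of fitting this model and summarizing `rate_change` with CmdStanPy; the file name and data values are hypothetical.

```python
import numpy as np
from cmdstanpy import CmdStanModel

# Hypothetical data; in practice these come from the recordings.
data = {
    "n_cells": 4,
    "sample_time_A": [10.0, 12.0, 9.5, 11.0],
    "sample_time_B": [10.0, 11.5, 10.0, 12.0],
    "spike_count_A": [42, 55, 31, 47],
    "spike_count_B": [58, 70, 45, 66],
}

model = CmdStanModel(stan_file="paired_rates.stan")
fit = model.sample(data=data, chains=4)

# Posterior draws of the population-level rate change (B minus A).
rate_change = fit.stan_variable("rate_change")
lo, hi = np.percentile(rate_change, [5, 95])
print(f"90% credible interval for rate_change: [{lo:.3f}, {hi:.3f}]")
print(f"P(rate_change > 0) = {np.mean(rate_change > 0):.3f}")
```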

Thanks for sharing! What were the final results?

And I don’t understand the `T[0, ]` here:

```
rate_ratio_mean ~ normal(1, 1)T[0, ];
```

do you mind explaining it?

It truncates the distribution: `normal(mean, sd)T[lower, upper]`. If `lower` or `upper` is left blank, the distribution is unbounded in that direction. So `normal(0, 1)T[0, ]` creates a half-normal distribution with a standard deviation of 1. You can create a half-Cauchy using this approach as well. A quick numerical check of what truncation does is below.
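
As a side note, here is a numerical illustration of that truncation in Python (rather than Stan), using `scipy.stats.truncnorm`: a standard normal truncated to [0, ∞) retains half its probability mass, so its density is exactly twice the normal density on the positive half-line.

```python
import numpy as np
from scipy.stats import norm, truncnorm

# Standard normal truncated to [0, inf): a half-normal with sd = 1.
# truncnorm takes its bounds in standard-deviation units relative to loc.
half_normal = truncnorm(a=0.0, b=np.inf, loc=0.0, scale=1.0)

x = np.linspace(0.0, 3.0, 7)
# The truncated density is the normal density rescaled by the
# retained probability mass (1/2 here), i.e., doubled for x >= 0.
print(np.allclose(half_normal.pdf(x), 2.0 * norm.pdf(x)))  # True
```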


That’s what I thought, but isn’t that redundant with the way you define this parameter?

```
real<lower=0> rate_ratio_mean;
```

I don’t know enough about Stan to know whether it was necessary to do both. Sounds like it’s not? I thought `T[0, ]` was the correct way to define half-priors. However, I found that I must use `real<lower=0>` to avoid initialization errors: with `T[0, ]` alone, I get initialization errors (since Stan tries negative initial values). So that’s why I have both in my code.

Is the correct way to implement a half-prior to use `real<lower=0>`?

I think you can just use

```
real<lower=0> rate_ratio_mean;
```

The `<lower=0>` constraint already keeps the sampler on the positive half-line, and because the truncation point is a constant, the `T[0, ]` truncation only changes the log density by a constant, which does not affect sampling.

Thanks!