Continuous Bernoulli

@StaffanBetner I was playing around with the continuous Bernoulli, and when p is a vector I found I needed to divide the normalizing constant by the sample size.

The Stan code is below. Note that since Stan 2.29 you can declare the same function name with different signatures: one takes a vector of p values and the other a real p. Does this seem right?

 real continuous_bernoulli_lpdf(vector y, vector lambda) {
   int N = num_elements(y);
   real lp = N * log2();
   real lp_c = 0;
   int counter = 0;

   for (n in 1:N) {
     lp += y[n] * log(lambda[n]) + (1 - y[n]) * log1m(lambda[n]);
     // the log normalizing term is undefined at lambda = 0.5 (its limit there is 0)
     if (lambda[n] != 0.5) {
       counter += 1;
       lp_c += log(atanh(1 - 2 * lambda[n]) / (1 - 2 * lambda[n]));
     }
   }
   return lp + lp_c / counter;
 }
  
 real continuous_bernoulli_lpdf(vector y, real lambda) {
   int N = num_elements(y);
   real lp = N * log2() + sum(y * log(lambda) + (1 - y) * log1m(lambda));

   if (lambda != 0.5) {
     lp += log(atanh(1 - 2 * lambda) / (1 - 2 * lambda));
   }

   return lp;
 }

Hi @spinkey - did you ever figure out which of these is better?

Hi - check out this blog post. I added the distribution to brms:

Hi @saudiwin, neat post! I have a residual confusion, though: you mention that in your ordered beta paper you found the fractional logit gave wildly varying performance and you didn't recommend it. In the blog post, you suggest the distribution was fixed by normalizing it. But including or excluding the normalization term should have no effect on the Stan model. Is the implication that the normalized model (i.e. the continuous Bernoulli) should still perform badly in simulation (or at least in the simulation you carried out)?

This is not normalization in the sense of the denominator in Bayes' formula and Stan sampling, but normalization of the fractional logit density so that it integrates to 1. You can fit the "fractional logit" model in Stan, and I did in my simulation, but you can't simulate from it because it has no CDF: it doesn't integrate to 1. If you plug the density into Wolfram Alpha and integrate it, it pops back out the formula the authors use as a "normalizing constant", i.e. the factor that makes the function integrate to 1.
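The integrate-to-1 claim is easy to check numerically. A quick sketch, assuming NumPy and SciPy are available (function names here are just for illustration):

```python
import numpy as np
from scipy.integrate import quad

def log_norm_const(lam):
    # log C(lambda) = log2 + log(atanh(1 - 2*lambda) / (1 - 2*lambda)),
    # matching the Stan code above; valid for lambda != 0.5
    return np.log(2.0) + np.log(np.arctanh(1 - 2 * lam) / (1 - 2 * lam))

def cb_pdf(y, lam):
    # continuous Bernoulli density: C(lambda) * lambda^y * (1 - lambda)^(1 - y)
    return np.exp(log_norm_const(lam)) * lam**y * (1 - lam)**(1 - y)

for lam in (0.1, 0.3, 0.9):
    total, _ = quad(cb_pdf, 0.0, 1.0, args=(lam,))
    print(lam, total)  # each integral should come out ~1.0
```

Without the `log_norm_const` factor the same integrals come out far from 1, which is exactly the fractional-logit-versus-continuous-Bernoulli difference.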

You can simulate from the continuous Bernoulli, and I could have (and should have) included it in the simulation I ran, but I didn't know it existed at the time; the first paper only came out in 2019.
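Because the continuous Bernoulli has a closed-form CDF (given in the 2019 paper), simulating from it by inverse-CDF sampling is straightforward. A sketch assuming NumPy; `cb_rng` is an illustrative name, not an established API:

```python
import numpy as np

def cb_rng(lam, size, seed=None):
    """Draw from the continuous Bernoulli by inverting its CDF,
    F(y) = (lam^y * (1 - lam)^(1 - y) + lam - 1) / (2*lam - 1) for lam != 0.5."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=size)
    if abs(lam - 0.5) < 1e-12:
        return u  # at lam = 0.5 the distribution is uniform on [0, 1]
    # solving F(y) = u for y gives:
    return np.log((u * (2 * lam - 1) + 1 - lam) / (1 - lam)) / np.log(lam / (1 - lam))

draws = cb_rng(0.3, 100_000, seed=1)
print(draws.min(), draws.max(), draws.mean())
```

The sample mean should land near the theoretical mean lam/(2*lam - 1) + 1/(2*atanh(1 - 2*lam)), which is about 0.43 at lam = 0.3.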

At first blush the continuous Bernoulli looks like a simpler parameterization than the beta, but I have some doubts, as I find the discontinuity rather strange. The beta distribution also has lots of established properties, while the continuous Bernoulli has very few. Still, it is clear that the continuous Bernoulli does "work", and it has fewer parameters than the beta.
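On the discontinuity: it appears to be removable, since atanh(1 - 2*lam) / (1 - 2*lam) tends to 1 as lam approaches 0.5, so the normalizing constant tends smoothly to 2 (the uniform case). A quick numeric check, assuming NumPy:

```python
import numpy as np

def norm_const(lam):
    # C(lambda) = 2 * atanh(1 - 2*lambda) / (1 - 2*lambda), lambda != 0.5
    t = 1 - 2 * lam
    return 2 * np.arctanh(t) / t

# approaching lambda = 0.5 from both sides: C tends to 2
for lam in (0.4, 0.49, 0.499, 0.501, 0.51, 0.6):
    print(lam, norm_const(lam))
```

So the special-casing at lambda = 0.5 in the Stan code above is a numerical guard against 0/0, not a genuine jump in the density.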

Sorry if I'm being dense. Can we not drop the normalization term, as we do with Stan functions ending in the _lupdf suffix?

I mean yes for computational convenience, but you still need it at the end of sampling. Been a while since I looked at the specifics.

This is different, though: it's not about computational challenges but rather whether and how to normalise to make a proper PDF. Computationally it's straightforward to do.

From the Stan manual:

The built in distribution functions in Stan are all available in normalized and unnormalized form. The normalized forms include all of the terms in the log density, and the unnormalized forms drop terms which are not directly or indirectly a function of the model parameters.

So you can't do that with the continuous Bernoulli, because there is only one parameter and the normalizing constant is a function of that parameter. There's no way to drop it, sample, then renormalize, as far as I can tell.

You can fit the conventional fractional logit (without the normalizing constant) in Stan so long as you have priors on the coefficients, but you can't convert it back to the continuous Bernoulli, per the proofs in the linked paper (which seem correct).

Oh I see! Sorry; was being dense! When I saw “normalizing constant” I assumed that it was constant in the parameters.

No, it's not, which makes "normalizing constant" a misnomer in the cited paper; "normalizing function" would be more accurate. It's only constant for a given value of the parameter, which of course isn't constant in any meaningful sense.
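The "normalizing function" point can be seen numerically: the term the paper calls a constant varies with lambda, so it cannot be dropped the way Stan's _lupdf forms drop true constants. A small sketch, assuming NumPy:

```python
import numpy as np

def log_norm(lam):
    # log of the "normalizing constant": it depends on lambda, so it changes
    # the shape of the log density and cannot be dropped during sampling
    t = 1 - 2 * lam
    return np.log(2 * np.arctanh(t) / t)

for lam in (0.05, 0.2, 0.45):
    print(lam, log_norm(lam))  # clearly different values across lambda
```

A proportionality constant in the Bayes-denominator sense would print the same number for every lambda; here it doesn't, which is the whole issue.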