Beta prior for group effects?

I’m trying to write a model that could be a fairly straightforward logistic regression with a grouping variable, but I’m considering using a Beta prior for the distribution of group effects, and am wondering if that seems like a really bad idea to anyone here.

Here’s what the ordinary logistic regression would look like:

data {
  int<lower=0> N;
  // number of groups
  int<lower=0> G;

  vector[N] x;
  // group index, must run from 1 to G
  array[N] int<lower=1, upper=G> group;
  array[N] int<lower=0, upper=1> y;
}
parameters {
  real alpha;
  real beta;
 
  vector[G] group_effect;
  real<lower=0>  group_sd;
}
model {
  vector[N] ghat; 

  group_sd ~ student_t(3, 0, 2.5);
  group_effect ~ normal(0, group_sd);

  // look up each observation's group effect
  for (n in 1:N) {
    ghat[n] = group_effect[group[n]];
  }

  y ~ bernoulli_logit(alpha + beta * x  + ghat);
}

But, an open question for this problem is whether the distribution of groups is unimodal, or could perhaps be bimodal (i.e. doers and non-doers). A beta distribution with a small \phi parameter could approximate that, so I’m considering this alternative model.

parameters {
  real alpha;
  real beta;
 
  // group_effect, now with beta prior, needs bounds
  vector<lower=0, upper=1>[G] group_effect;
  real<lower=0>  phi;
}
model {
  vector[N] ghat; 

  // prior for phi should have heavier right tail
  phi ~ student_t(3, 0, 20);
  group_effect ~ beta(0.5 * phi, 0.5 * phi);

  for(n in 1:N){
    // logit transform to add into final formula
    ghat[n] = logit(group_effect[group[n]]);
  }

  y ~ bernoulli_logit(alpha + beta * x + ghat);
}

Now, when phi is relatively large, the distribution of groups ought to be unimodal around alpha + beta * x, but once phi drops below 2 (so that both Beta shape parameters fall below 1), the distribution of group_effect becomes U-shaped, with mass piling up near 0 and 1.
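As a quick sanity check on that shape claim, here is a minimal Python sketch (standard library only) that evaluates the Beta(0.5 * phi, 0.5 * phi) density on the 0-1 scale of group_effect. The helper names beta_pdf and is_u_shaped, the edge value 0.01, and the list of phi values are made up for illustration and are not part of the model above. It only compares the density near the edges with the density at the midpoint, and it looks at the probability scale only, not at logit(group_effect).

```python
import math

def beta_pdf(p, a, b):
    """Density of Beta(a, b) at p in (0, 1), computed via log-gamma for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))

def is_u_shaped(phi, edge=0.01):
    """Hypothetical helper: True if Beta(phi/2, phi/2) is denser near the edge than at 0.5."""
    a = 0.5 * phi
    return beta_pdf(edge, a, a) > beta_pdf(0.5, a, a)

# Print the shape check for a few illustrative values of phi.
for phi in (0.5, 1.0, 2.0, 4.0, 20.0):
    print(phi, is_u_shaped(phi))
```

With the symmetric parameterisation used here, the density is U-shaped for phi below 2, flat at phi = 2, and single-peaked at 0.5 above that.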

This seems like what I want, and testing the model on simulated data hasn’t raised any problems, but if there are any concerns about the principle of what I’m doing here, I’d definitely like to know!
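For anyone who wants to try something similar, here is a rough Python sketch of how data could be simulated from the Beta version of the model. This is not the simulation actually used for the tests mentioned above; the function name simulate and every default value (number of groups, alpha, beta, phi, seed) are arbitrary assumptions, and x is simply drawn from a standard normal.

```python
import math
import random

def simulate(n_groups=20, n_per_group=50, alpha=-0.5, beta=1.0, phi=0.8, seed=1):
    """Return a list of (x, group, y) rows; all defaults are arbitrary assumptions."""
    rng = random.Random(seed)

    # group_effect ~ Beta(phi/2, phi/2), then moved to the log-odds scale
    effects = []
    for _ in range(n_groups):
        p = rng.betavariate(0.5 * phi, 0.5 * phi)
        p = min(max(p, 1e-12), 1 - 1e-12)  # clamp to avoid logit(0) or logit(1)
        effects.append(math.log(p / (1 - p)))

    rows = []
    for g in range(n_groups):
        for _ in range(n_per_group):
            x = rng.gauss(0.0, 1.0)
            eta = alpha + beta * x + effects[g]
            prob = 1.0 / (1.0 + math.exp(-eta))  # inverse logit
            y = 1 if rng.random() < prob else 0
            rows.append((x, g + 1, y))  # 1-based group index, as Stan expects
    return rows

rows = simulate()
print(len(rows), rows[0])
```

The rows can then be split into the x, group and y inputs of the data block, with N = len(rows) and G = n_groups.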

I’d advise against doing it this way, even if it seems to work. You lose the interpretability of the parameters, and the sampler may get stuck in cases that just haven’t shown up yet. Just my 2 cents.

Instead, I’d opt for a skew_normal for group_effect, or some variant of student_t, either a skewed student_t or the exponentially modified normal distribution; see the Stan manual for details.