I’ve been investigating the use of link functions for improving the accuracy of posterior estimation in rare-event problems. I have already successfully generated results with Stan for a Poisson regression problem using an inverse gamma link function.

I am now tackling the problem of using an inverse beta link function in a logistic regression problem. Here is my code:

```
functions {
  // declarations only; definitions are supplied as external C++
  real invBetaSpec(real x);
  real normal_lcdf3(real x);
}
data {
  int N;          // num trials
  int M;          // num covariates
  int y[N];       // observations
  matrix[M, N] X; // covariates
}
parameters {
  row_vector[M] beta;
  real beta0;
}
transformed parameters {
  row_vector[N] alpha;
  vector[N] not_Logistic_alpha;
  alpha = beta * X;
  alpha += beta0;
  for (n in 1:N) {
    not_Logistic_alpha[n] = invBetaSpec(exp(normal_lcdf3(alpha[n])));
  }
}
model {
  beta ~ std_normal();
  beta0 ~ std_normal();
  y ~ bernoulli(not_Logistic_alpha);
}
```
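To make the link concrete, here is a toy Python version of the transformed-parameter line (my own sketch, not the model code: I'm treating `invBetaSpec` as the quantile function of a Beta(a, b) with fixed shapes and using a = b = 2 as a stand-in, since the real parameters aren't shown; `normal_lcdf3` is taken to be the standard normal log-CDF, so `exp(normal_lcdf3(alpha))` is just Φ(alpha)):

```python
import math

# Toy stand-in for invBetaSpec: the quantile (inverse CDF) of a
# Beta(2, 2), whose CDF has the closed form F(x) = 3x^2 - 2x^3.
# The real function uses different (unknown to this sketch) parameters.

def beta22_cdf(x):
    return 3.0 * x**2 - 2.0 * x**3

def beta22_quantile(p, tol=1e-12):
    # Bisection works because the CDF is monotone on [0, 1].
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta22_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def normal_cdf(x):
    # exp(normal_lcdf(x)) is just Phi(x)
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def link(alpha):
    return beta22_quantile(normal_cdf(alpha))

for a in (-8.0, -2.0, 0.0, 2.0, 8.0):
    print(a, link(a))
```

For `alpha` beyond roughly ±8 the composition is pinned to (effectively) 0 or 1 in double precision, so the extreme tails of the linear predictor map to essentially degenerate Bernoulli probabilities.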

The problem I get is that at least one of the chains usually gets stuck with a small step size during the warmup adaptation period, especially for large N (> 100). This results in sampling times of several hours, followed by lots of `max_treedepth` warnings and poor mixing. On lucky occasions, and with small N, each chain sometimes adapts to a suitably large step size and sampling finishes within a minute or so without any warnings. Reducing `adapt_delta` helps in some cases, but the problem still occurs for large N.

To help speed things up I have already written the C++ functions `invBetaSpec` and `normal_lcdf3`, which define the inverse beta and normal log-CDF functions for a specific set of parameters. In both cases the gradient is also defined (for `invBetaSpec` this is via a lookup table, as an analytic gradient doesn’t exist). However, none of this seemed to improve the situation.
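For what it's worth, the lookup-table gradient follows the usual tabulate-and-interpolate pattern (a hypothetical sketch, not the actual C++: the grid resolution and the toy derivative values below are my own stand-ins):

```python
import bisect

# Toy stand-in for a tabulated gradient: a grid of p values and
# precomputed derivative values (here an arbitrary smooth function).
P_GRID = [i / 100.0 for i in range(1, 100)]         # p in [0.01, 0.99]
DQ_TABLE = [1.0 / (p * (1.0 - p)) for p in P_GRID]  # toy derivative values

def grad_lookup(p):
    # Clamp to the tabulated range, then linearly interpolate
    # between the two bracketing grid points.
    p = min(max(p, P_GRID[0]), P_GRID[-1])
    i = min(bisect.bisect_right(P_GRID, p), len(P_GRID) - 1)
    p0, p1 = P_GRID[i - 1], P_GRID[i]
    w = (p - p0) / (p1 - p0)
    return (1.0 - w) * DQ_TABLE[i - 1] + w * DQ_TABLE[i]
```

One thing worth noting about this pattern is the clamping at the edges: outside the tabulated range the gradient is held constant, which flattens the geometry the sampler sees in exactly the tail region where the link already saturates.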

The equivalent model using `bernoulli_logit()` works without problems on the same data, even for large N. I thought the issue might be with the topology of the problem when using `bernoulli()`, so I wrote code for the equivalent model without the link function:

```
data {
  int N;          // num trials for each group
  int M;          // num covariates
  int y[N];       // observations
  matrix[M, N] X; // covariates
}
parameters {
  row_vector[M] beta;
  real beta0;
}
transformed parameters {
  row_vector[N] alpha;
  vector[N] logistic_alpha;
  alpha = beta * X;
  alpha += beta0;
  for (n in 1:N) {
    logistic_alpha[n] = 1 / (1 + exp(-alpha[n]));
  }
}
model {
  beta ~ std_normal();
  beta0 ~ std_normal();
  y ~ bernoulli(logistic_alpha);
}
```

This model runs without any problems. So it seems that the link function alone is somehow responsible for the issue in step size selection during adaptation. Any ideas why this might be and how I might fix it?
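One concrete difference between the two likelihood forms may be relevant (a toy numerical sketch of my own, independent of the Stan code): when the probability is formed explicitly, the tails round to exactly 0 or 1 in double precision, whereas working on the logit scale, as `bernoulli_logit()` does, keeps the log-probability finite:

```python
import math

# Toy comparison of the two ways to get a Bernoulli log-probability
# for a large linear predictor alpha.

alpha = 40.0

# Natural scale: the probability rounds to exactly 1.0 in double
# precision, so an observation y = 0 would contribute log(1 - p) = -inf.
p = 1.0 / (1.0 + math.exp(-alpha))
assert p == 1.0

# Logit scale: log(1 - p) is computed directly as -log1p(exp(alpha)),
# which stays finite (about -40 here) with a usable gradient.
log1m_p = -math.log1p(math.exp(alpha))
print(log1m_p)
```

If the inverse beta link saturates to exactly 0 or 1 for moderate values of the linear predictor, that could produce the same kind of degenerate tail behaviour even though the logistic version avoids it.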