Convergence problem with simple triangle distribution

I have a dataset of points (x, y) distributed within a disk, and I would like to estimate the radius (and eventually the center) of the disk using a Stan model. Here is an example with some simulated data in R:

set.seed(42)

n_obs <- 1000
true_radius <- 2

# Generate random angles and distances from the center
angles <- runif(n_obs, 0, 2 * pi)
distances <- runif(n_obs, 0, true_radius)

# Calculate x and y coordinates
x <- distances * cos(angles)
y <- distances * sin(angles)

# Combine x and y coordinates into a data frame
data <- data.frame(x = x, y = y)

The histogram of the distances of the points from the center looks like a right triangle: the density is 0 at 0 and rises linearly to about 1 at r (for points spread uniformly over a disk of radius r, the distance density is 2*d/r^2, which equals 1 at d = r when r = 2).
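
For reference, a quick sketch of how that shape can be checked against the simulated data (the overlaid curve is the triangular density 2*d/r^2, drawn with the true radius):

dists <- sqrt(data$x^2 + data$y^2)
hist(dists, freq = FALSE, main = "Distances from the center")
curve(2 * x / true_radius^2, from = 0, to = true_radius, add = TRUE)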

So I defined a custom likelihood in Stan and ran this model:

functions {
  // Log density of a triangular distribution on [a, b] with its mode at b.
  real triangular_lpdf(real x, real a, real b) {
    if (x < a || x > b)
      return negative_infinity();
    else
      return log(2 * (x - a) / square(b - a));
  }
}

data {
  int<lower=0> n_obs; 
  vector[n_obs] x; 
  vector[n_obs] y; 
}

parameters {
  real<lower=0> r; 
}

model {
  // Priors
  r ~ normal(2, 1);

  // Likelihood
  for (i in 1:n_obs) {
    real dist = sqrt(square(x[i]) + square(y[i]));
    target += triangular_lpdf(dist | 0, r);
  }
}

Although the model runs relatively fast and gives a good estimate of r, it results in hundreds of divergent transitions (out of 2000). I have tried adjusting the sampler settings (adapt_delta and stepsize), but these just seem to slow the model down without removing the divergences.
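
For concreteness, a minimal sketch of one way to fit this (assuming cmdstanr; the file name disk.stan and the adapt_delta value are just placeholders):

library(cmdstanr)

mod <- cmdstan_model("disk.stan")  # the model shown above, saved to a file
fit <- mod$sample(
  data = list(n_obs = n_obs, x = data$x, y = data$y),
  seed = 42,
  chains = 4,
  adapt_delta = 0.99  # raising this mostly just slows sampling here
)
fit$diagnostic_summary()  # reports the divergent transitions per chain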

Trace plots for r don't look good, but the estimates are mostly right, and the pairs plot shows that r is almost perfectly anti-correlated with lp__.
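
Those diagnostics can be drawn along these lines (again assuming the cmdstanr fit object from the sketch above, plus the bayesplot package):

library(bayesplot)

mcmc_trace(fit$draws("r"))             # trace plot for r
mcmc_pairs(fit$draws(c("r", "lp__")))  # joint plot of r against lp__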

This seems like a simple problem, and I feel like I am missing something basic. Any thoughts?

This problem reminds me of my attempts to estimate the scale parameter of the Govindarajulu distribution (or, for that matter, any bathtub-shaped distribution where you are not sure what the upper boundary should be).

A couple of comments. Your simulated distances are uniform rather than triangular, so the data do not help as much as they could. Radii smaller than the largest observed distance are impossible under your likelihood: at least one point would then fall outside [0, r], and the log density is negative infinity. Above that cutoff the log likelihood is sum(log(2*d_i)) - 2*n*log(r), which decreases monotonically in r, so the posterior piles up right against the hard edge at max(d_i); that edge is what produces the divergences (and the near-perfect anti-correlation between r and lp__).
It feels like cheating to bound r below by the maximum of the data, but if you do that, the divergences go away:

functions {
  real triangular_lpdf(real x, real a, real b) {
    if (x < a || x > b)
      return negative_infinity();
    else
      return log(2 * (x - a) / square(b - a));
  }
}

data {
  int<lower=0> n_obs;
  vector[n_obs] x;
  vector[n_obs] y;
}

transformed data {
  // Precompute the distances once; the largest one bounds r from below.
  vector[n_obs] dist = sqrt(square(x) + square(y));
  real max_dist = max(dist);
}

parameters {
  real<lower=max_dist> r; // r cannot be smaller than the largest observed distance
}

model {
  // Priors
  r ~ normal(2, 1);

  // Likelihood
  for (i in 1:n_obs) {
    target += triangular_lpdf(dist[i] | 0, r);
  }
}

I am curious to hear about the clever tricks people use to estimate the upper boundary in bathtub-shaped distributions.