Inference of a lower bound and strictly positive noise

I’m working with data that represents the time it takes to complete a task, t_i. Task i has a covariate x_i > 0, and a justifiable way to represent its completion time is t_i = \alpha + \beta x_i + \varepsilon, where \alpha > 0, \beta \ge 0, and \varepsilon > 0 is a noise term. In a sense, the quantity \alpha + \beta x_i represents the minimum time it takes to complete task i, and \varepsilon is additional strictly positive noise (assuming something like \varepsilon \mid \nu, \sigma \sim \mathrm{LogNormal}(\nu, \sigma)) that can only increase the time it takes to complete a task. My goal is to perform inference on (\alpha, \beta, \varepsilon), but I’m having difficulty figuring out how to properly specify this model in Stan.

I’ve tried some crude approaches such as:

data {
  int<lower=1> N;
  vector<lower=1>[N] X;
  vector<lower=0>[N] T;
}
parameters {
  real<lower=0> alpha;
  real nu;
  real<lower=0> sigma;
}
transformed parameters {
  real<lower=0, upper=min((T - alpha) ./ X)> beta;
  vector<lower=0>[N] errors;

  for (n in 1:N) {
    errors[n] = T[n] - (alpha + beta*X[n]);
  }
}
model {
  alpha ~ normal(0, 2);
  beta ~ normal(0, 2);
  sigma ~ exponential(1);
  nu ~ normal(0, 1);

  errors ~ lognormal(nu, sigma);
}

But I end up with beta being NaN in some cases, and other attempts have resulted in pretty extreme divergences when I can get the model to sample at all. I’ve tied myself into some mental knots trying to figure this out, so I’d appreciate a pointer in the right direction or an example of how one could fit a model of this sort.


Hey, amas0, could you share the full code, including where T, N, X, and mu are read in or defined?

Edited the original post to include the data { } block; also mu was a typo and should have been nu, so I fixed that as well.

This might work:

parameters {
  real<lower=0, upper=min(T)> alpha;
  real<lower=0, upper=min((T - alpha) ./ X)> beta;
  real nu;
  real<lower=0> sigma;
}
transformed parameters {
  vector<lower=0>[N] errors = T - (alpha + beta*X);
}
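
For completeness, here is roughly what the full program could look like with those bounds folded back into the data and model blocks from the original post (just a sketch, keeping the priors from the first attempt):

data {
  int<lower=1> N;
  vector<lower=1>[N] X;
  vector<lower=0>[N] T;
}
parameters {
  // Bounding alpha by min(T) and beta by min((T - alpha) ./ X) keeps
  // alpha + beta * X[n] below every observed T[n].
  real<lower=0, upper=min(T)> alpha;
  real<lower=0, upper=min((T - alpha) ./ X)> beta;
  real nu;
  real<lower=0> sigma;
}
transformed parameters {
  // With both bounds in place, the residuals are guaranteed non-negative.
  vector<lower=0>[N] errors = T - (alpha + beta * X);
}
model {
  alpha ~ normal(0, 2);
  beta ~ normal(0, 2);
  sigma ~ exponential(1);
  nu ~ normal(0, 1);

  errors ~ lognormal(nu, sigma);
}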

This got me there, thanks! I assumed that only bounding beta, since it depends on alpha, would be sufficient, but that was my mistake.

Appreciate the quick response.

Just a note: assuming lognormal noise, I think this model should be estimable directly in brms via the shifted_lognormal family with distributional regression on the shift parameter. It might be instructive to look at the stancode generated there.


Why take

t_i = \alpha + \beta \cdot x_i + \epsilon, \quad \epsilon \sim \textrm{lognormal}(\nu, \sigma)

rather than the more traditional log-linear regression,

t_i \sim \textrm{lognormal}(\alpha + \beta \cdot x_i, \sigma)?
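
For reference, that log-linear alternative might look something like this in Stan (a minimal sketch; the priors here are just placeholders, not a recommendation):

data {
  int<lower=1> N;
  vector[N] X;
  vector<lower=0>[N] T;
}
parameters {
  real alpha;           // intercept on the log scale
  real beta;            // slope on the log scale
  real<lower=0> sigma;
}
model {
  alpha ~ normal(0, 2);
  beta ~ normal(0, 2);
  sigma ~ exponential(1);
  T ~ lognormal(alpha + beta * X, sigma);
}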

I would be surprised if BRMS used the OP’s formulation rather than standard loglinear regression.

The Wikipedia article isn’t very helpful here, but this is what I’m talking about: Log-linear model - Wikipedia

The previous post on this topic was never answered. :-(

@Bob_Carpenter these are different models, right? For positive \alpha + \beta x_i, the OP’s model places no probability density between zero and \alpha + \beta x_i. AFAIK it’s for this reason that brms contains the shifted_lognormal family.
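
Spelled out, the OP’s formulation implies a shifted lognormal density whose support starts at the linear predictor:

p(t_i \mid \alpha, \beta, \nu, \sigma) = \frac{1}{(t_i - \alpha - \beta x_i)\,\sigma\sqrt{2\pi}} \exp\left(-\frac{(\log(t_i - \alpha - \beta x_i) - \nu)^2}{2\sigma^2}\right), \quad t_i > \alpha + \beta x_i,

whereas the standard log-linear model has support on all of (0, \infty).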

@Bob_Carpenter we did initially evaluate the t_i \sim \mathrm{lognormal}(\alpha + \beta\cdot x_i, \sigma) model. It worked fairly well, but the point that @jsocolar noted was more or less why we wanted to try out the shifted model. In particular, we found the model put some mass in regions that didn’t track with domain knowledge.

Also, from an inference perspective, we found the \alpha + \beta\cdot x_i quantity to be more interpretable in the shifted model.

That being said, when we fit both models to our actual data, there was enough uncertainty that the differences between them weren’t really significant.