Fitting model with decreasing cumulative costs

Hi!

Suppose I lend people money and they repay in smaller amounts over month. I expect all clear at due date. I would like to know how much of the money I get after X% of the time and how certain I can be.

library(brms)
library(tidyverse)
library(lubridate)
library(tsibble)
library(ggplot2)

df_raw <- tribble(
  ~Person, ~Date, ~Amount, ~Comment,
  "A", "2000-01-01", 1000, "Borrow",
  "A", "2000-02-01", -200, "",
  "A", "2000-03-01", -10, "",
  "A", "2000-05-01", -300, "",
  "A", "2000-06-01", -100, "Due date",
  "B", "2000-03-01", 2000, "Borrow",
  "B", "2000-04-01", -420, "",
  "B", "2000-05-01", -20, "",
  "B", "2000-06-01", -500, "Due date",
  "B", "2000-10-01", -620, "",
  "C", "2000-03-01", 3000, "Borrow",
  "C", "2000-04-01", -500, "",
  "C", "2000-05-01", -30, "",
  "C", "2000-06-01", -900, "",
  "C", "2000-07-01", -300, "Due date",
  "C", "2001-10-01", -300, ""
)

df <- df_raw |>
  mutate(Date = yearmonth(Date)) |>
  group_by(Person) |>
  mutate(
    loan_start = first(Date[Comment == "Borrow"]),
    borrowed_amount = first(Amount[Comment == "Borrow"]),
    amount_open = cumsum(Amount), 
    relative_amount = amount_open / borrowed_amount,
    due_date = first(Date[Comment == "Due date"]),
    loan_duration = as.numeric(due_date - loan_start),
    time_point = as.numeric(Date - loan_start) / loan_duration
  ) |>  
  ungroup() |>
  select(-Comment)

ggplot(df, aes(x=time_point, y=relative_amount, color = Person, group = Person)) +
  geom_line() +
  geom_point()

The relation must not be linear. The relative amount should be 1 at time 0 because I divided by a reference amount. I would like to learn from all persons.

df1 <- df |>
  filter(between(time_point,0,1))

fit1 <- brm(
  formula = bf(relative_amount ~ time_point + (1 | p | Person)),
  data = df1,
  family = gaussian(),
  seed = 12345 
)
summary(fit1)

Is this the right approach? How to specify the priors? Thanks,

Side comment: examples requiring excessive dependencies (here, the tidyverse) are less useful for others who I may want to run your code. You could have set this up easily with plain R.

1 Like

It seems like you’re throwing away information by normalizing all loan amounts to start at 1.0. If the payback rate varies based on the starting amount, the “relative amount” model wouldn’t capture that.

1 Like

Thank you for your help.

Suppose I don’t divide by the borrowed amount. Then

ggplot(df1, aes(x=time_point, y=amount_open, color = Person, group = Person)) +
  geom_line() +
  geom_point()

I guess something like

fit2 <- brm(
  formula = bf(amount_open ~ 1 + s(time_point) + (1 + time_point | Person)),
  data = df1,
  family = gaussian(),
  seed = 12345 
)

conditional_effects(
  fit2, 
  conditions = distinct(df1, Person), 
  re_formula = NULL
)

For example, how would I make a prediction if I lend 1300 today. What range do I expect to get back after 75 % of the time?