Intercept only model

  • Operating System: Windows
  • rstanarm Version: 2.19.2

I have a relatively simple problem - an assumed mean and variance for a normally distributed variable, followed by a few data observations (just outcomes, no predictors). I’d like to use the data to update my estimates of the mean and standard deviation of that mean using rstanarm.

library(rstanarm)

d ← data.frame(x = c(425, 425, 450, 425, 350, 450, 300, 425, 425, 300, 425, 425, 300, 425, 375, 425, 375, 425, 425, 275, 330, 425, 375, 425, 425))

prior_mu ← 350
prior_sd ← 60

m ← stan_glm(x ~ 1,
data = d,
prior_intercept = normal(prior_mu, prior_sd))
m

This yields a MAD intercept of 393 and MAD sigma of 53.

If I’ve understood correctly, my posterior mean and standard deviation would be 393 and 53 respectively.

  1. Have I correctly implemented the bayesian updating, and interpreted the outputs?
  2. How can I extract the MAD of sigma - coef() returns only the intercept and I don’t see a sigma value stored anywhere in object m?
  3. Can the same process be used if I only have a single new data point? When I try I get a an error:

Error: Constant variable(s) found: x

Thanks for your help.

2 Likes

Yes.

In general, if you want to deal with the output rather than the summary, you should do

params <- as.data.frame(m) # or as.matrix()

Now, the draws of sigma (and any other parameters) are in a column of params and you can do whatever you want with them, including

mad(params$sigma)

In principle, that should work but perhaps the code is not general enough to handle that case. You can append your new data point to your old data points and run stan_glm from there.

1 Like

Thank you. Regarding question, in this case I don’t have any prior data points - just prior summaries (mu, sigma). Is this possible? Take just a N(mu, sigma) prior and update it for a single new data point?

We would have to implement some logic for the one data point case. My guess is that brms would work with one data point.

brms worked, kind of. The model compiles and runs, but with a variety of sampling problems.

library(brms)
prior2 ← c(set_prior(“normal(350, 60)”, class = “b”, coef = “intercept”),
set_prior(“cauchy(0, 1)”, class = “sigma”))

d2 ← data.frame(Y=425)
fit2 ← brm(Y~0 + intercept, data=d2, prior = prior2, warmup=1000, iter=2000, sample_prior = TRUE, control = list(adapt_delta = 0.97))

Warning messages:
1: There were 194 divergent transitions after warmup. Increasing adapt_delta above 0.97 may help. See
Runtime warnings and convergence problems
2: Examine the pairs() plot to diagnose sampling problems

3: The largest R-hat is 1.15, indicating chains have not mixed.
Running the chains for more iterations may help. See
Runtime warnings and convergence problems
4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
Runtime warnings and convergence problems
5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
Runtime warnings and convergence problems

1 Like

Yeah, my guess is that a normal likelihood is not really appropriate if you do not condition on any predictors.

I’m open to suggestions for a better choice of likelihood.

In case of future interest concerning this issue, it isn’t limited to the situation of only a single datapoint. The same error occurs if you have multiple datapoints that are equal, like:

d <- data.frame(x = c(425, 425, 425))

1 Like