Lognormal model with only summary statistics



I’m trying to fit a lognormal model (code below). This works just fine if I have a vector y. E.g. in R,

y <- exp(rnorm(1000))
stan.data <- list(y = y, n = length(y), alpha = 1, beta = 1)
stan.fit <- sampling(stanmodels::lognormal, data = stan.data)

where stanmodels::lognormal is defined as:

data {
  real<lower=0> alpha;
  real<lower=0> beta;
  int<lower=0> n;
  real<lower=0> y[n];

parameters {
  real mu;
  real<lower=0> sigmasq;

transformed parameters  {
  real<lower=0> sigma;
  sigma = sqrt(sigmasq);

model {
  sigmasq ~ inv_gamma(alpha, beta);
  mu ~ normal(0, 1000);
  y ~ lognormal(mu, sigma);

However, in my actual problem, I don’t have y. Instead I have mean(y), sd(y), and length(y). Is it possible to still use Stan to achieve the same fit as if I had y? I’m a novice to Stan, and I’ve looked through many examples and the official documentation, but can’t seem to find anything that mentions this problem.

Thanks in advance!



It is basically a missing data problem with constraints given by the sample mean and the sample standard deviation.


Thanks for the reply! Should I simulate the data using these constraints? Then use my Stan code on the simulated data? Or is there a way to modify the Stan code to do this for me?


I would declare the values in the parameters block to be a simplex of size n. Then use your sample mean and sample standard deviation to transform the simplex into something that can be described by a lognormal density. Make sure to adjust by the log of the absolute value of the determinant of the Jacobian of the transformation though.


If you can get the mean and sd of log y rather than of y, this would be easy. Otherwise, the non-linearity makes the constraint in the missing data problem tricky to code whether you try to impute log y or y from sd(y) and mean(y) and n, where y > 0.

Also, those wide priors are not helping matters.