Help with nonlinear model


#1

Hi, I am working on a project, and I wanted to see if I could recover the parameters of a simulated data. I would like to recover pref and chi from the simulated data below. However, I get completely unreasonable estimates, and I get many warnings about max tree depth being too low and that the adapt_delta parameter should be higher. I increased the max tree depth and the adapt_delta control parameters, but nothing seemed to work. I would really appreciate any suggestions! (Oddly, this same model does give me very accurate parameter estimates in jags).

#stan - precip model

#precip model #NO LOG TRANSFORM
zref = 150
#chi = 0.000250
chi = 0.000250

#station elevation
zsta=rlnorm(1000, meanlog = 6, sd=.5)

x = zsta - zref

#reference value for precipitation
pref = 2

err = rnorm(1000, 0, .01)
y = pref*((1+chi*(x)) / (1 - chi*(x))) + err
N = length(y)
hist(y)

precipdata = as.data.frame(cbind(y,x))

#stan model
"
data {
int <lower=1> N;
vector [N] y;
vector [N] x;
}
parameters {
real <lower=0> sigma;
real chi;
real <lower=0> pref;
}
model {
vector [N] denom;
for (i in 1:N) denom[i] = 1 / (1 - chix[i]);
y ~ normal(pref * (1 + chi
x) .* denom, sigma);
sigma ~ normal(0, 10);
chi ~ normal(0, .0001);
pref ~ uniform(0, 10);
}
"

fileName <- "stanprecip.txt"
stan_code2 <- readChar(fileName, file.info(fileName)$size)
cat(stan_code2)stanprecip.txt (391 Bytes)

data for Stan

precipList <- list(N = N, y = precipdata$y, x = precipdata$x)

sample from posterior

precipstan <- stan(model_code = stan_code2, data = precipList,
chains = 3, iter = 30000, warmup = 500, thin = 10)


#2

Your three parameters have orders of magnitude different scales from each other. This puts more strain on the warmup phase in Stan. One solution could be to rescale two of the parameters. For instance, work with chi10000 = chi * 10000 as a parameter. Another solution is to run with more warmup iteration or give initial values so that Stan can find the different scales quickly. (You probably don’t need 30000 iterations or thinning; iter=2000, warmup=1000 is a good default).

Furthermore, your prior for sigma is very vague (true value = .01). If you have a uniform prior, on pref you need to declare real <lower=0, upper=10> pref. I would not be surprised if that is the main culprit for a lot of your problems. Also, the prior on chi is very tight, the true value is 2.5 standard deviations away of the mean of the normal prior. I can see how the phi samples are pushed lower than the true value and how that has weird effects on the other parameters especially pref because of the non-linearity.


#3

stanprecip.txt (422 Bytes)

Hi,

Thank you so much for getting back to me about my stan model. I tried to implement the changes suggested, but I am still getting very strange results. Again, I am trying to recover parameters from simulation data so I can make sure my code is working. I have included the R code for the simulation data, and I have attached the stan code. I tried scaling parameters, as suggested, and that had no effect. I tried different priors for chi, but that did not help either. Would you have any other suggestions? Thank you very much!

#stan - precip model

#precip model
#station reference elevation
zref = 150

#chi = 0.000250 (original value)
#chi = 0.000250

#try multiplying chi by 100…does not help
chi100 = 0.025

#station elevation
#try standardizing…does not seem to help

zsta=rlnorm(1000, meanlog = 6, sd=.5)
x = zsta - zref
x = (x-mean(x)) / sd(x)

#reference value for precipitation
pref = 2

err = rnorm(1000, 0, .01)
y = pref*((1+chi100*(x)) / (1 - chi100*(x))) + err
#y = pref*((1+chi100*(x)) / (1 - chi100*(x))) + err

N = length(y)
hist(y)
plot(x, y)

precipdata = as.data.frame(cbind(y,x))
fileName <- "stanprecip.txt"
stan_code2 <- readChar(fileName, file.info(fileName)$size)
cat(stan_code2)

data for Stan

precipList <- list(N = N, y = precipdata$y, x = precipdata$x)

sample from posterior

precipstan <- stan(model_code = stan_code2, data = precipList,
chains = 3, iter = 5000, warmup = 1000, thin = 10)


#4

Can you explain what the weird results are? Are the chains converging? Do you get numerical or divergence warnings? Or are the results different from the simulation parameters?


#5

Hi again,

I have received warning like this every time I’ve run it:

Warning messages:
1: There were 176 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
2: There were 1 chains where the estimated Bayesian Fraction of Missing Information was low. See
http://mc-stan.org/misc/warnings.html#bfmi-low
3: Examine the pairs() plot to diagnose sampling problems

Maike Holthuijzen
Ph.D Graduate Student
University of Vermont
cell: 208-830-0270

Nothing in life is to be feared, it is only to be understood - Marie Curie


#6

If chi * x is around 1, you will get weird behavior. y --> +inf for chi * x = .999999 and y --> - inf for chi * x = 1.00001. So the likelihood is not differentiable around 1, which makes it hard for Stan to explore values where chi * x ~= 1. If you expect that chi * x should be smaller than 1, it would be wise to use that information in your prior. Your current prior implies that chi = 10 is a likely value.


#7

Hi again,

I did check to make sure I was not getting very large positive or negative values for the response; I checked the chi*x values as well, because yes, that would lead to problems with differentiation of the likelihood. I tried it again just to make sure, but I keep getting the same error message.

What would you suggest as appropriate priors for chi/pref? I thought I was specifying vague priors with normal(0,10)…but maybe that’s not the case? Should I be more specific? I tried these priors as well:

sigma ~ normal(0, 2);
chi ~ normal(0, 1);
pref ~ normal(0, 10);

but that did not do much to help with strange results either.

Finally, I tried these priors (are these in line with your suggestion of a ‘better’ prior?)

sigma ~ normal(.1, 4);  //my error terms were drawn from a normal dist. w/ mean=0.1
chi ~ normal(.2, 5);  //real value of chi is 0.25
pref ~ normal(2, 4); //real value is 2

That helped a bit…but the estimates were still fairly far off (chi = .9, sigma = .2, pref = 1.49)
So…making my priors more specific seems to help, which makes sense. Although making my priors more specific for simulation data is easy, I would not be able to give very specific priors for real data…

Let me know if you have any more suggestions. I’m sorry if these are basic questions, but I have been doing frequentist analyses for more than 10 years, and I am just beginning to transition to Bayesian analyses. Thanks again!

Maike Holthuijzen
Ph.D Graduate Student
University of Vermont
cell: 208-830-0270

Nothing in life is to be feared, it is only to be understood - Marie Curie


#8

Those priors for sigma and chi are still quite uninformative/very weakly informative. See the Stan team’s recommendation. It’s more about the standard deviation than the mean in your normal.

You can plot your prior in R to see that. For instance the chi prior:

prior_chi = rnorm(10000, 0.2, 5)
hist(prior_chi)

You probably need to use theory and prior research to determine what reasonable values are for your parameters. Given that x comes from a scaled lognormal and you want to keep chi * x below 1, I would expect chi to be probably below .2, which would imply a prior like chi ~ normal(0, .1). I don’t know the substantive issue so I might be entirely wrong but this is how I go about setting priors in these kind of nonlinear models.

It’s probably better to look at the 50% credibility intervals. If 50% of true values, fall into the intervals, you are good to go.


#9

That’s a sign that you’re running into regions of the posterior where the numerical Hamiltonian breaks down. It can lead to biased samples. Usually you need to reparameterize. This is often caused by non-identifiability or by bad curvature in the posterior (often from centered parameterizations of hierarchical models).

Stan uses standard deviation and having a parameter with a scale this small is going to cause problems for adaptation. Are you trying to model something deterministic here? That’s usually not a good idea. If you really have variation at that level, it might help to use

chi_raw ~ normal(0, 1);
real chi = chi_raw * 1e4;

because then the parameter being sampled is back on the unit scale in the prior.