Hi all, I was wondering how do I deal with the unknown constant parameters in rstan. like in my model, I have three unknown constant parameters (log(A) or A, B and delta_H) which need to be estimated from the sampling. Thanks in advance!
Here are the parameters block:
// The parameters accepted by the model.
parameters {
// Define parameters to estimate
// Random effect
matrix[N,2] mat;
// Level-1
real<lower=1> sigma;
// Hyperparameters
vector[2] mu_mat;
real<lower=100> A[(N-1)*5+K];
real<lower=1> B[(N-1)*5+K];
real<lower=1> delta_H[(N-1)*5+K];
corr_matrix[2] sigma_mat;
corr_matrix[2] sigma_for_prior;
}
//Parameters processing before the postier is computed
transformed parameters {
// Random effect
real beta0[N];
real gamma[N];
//parameters
row_vector[(N-1)*5+K] beta1;
row_vector[(N-1)*5+K] mu;
for(n in 1:N){
beta0[n] = mat[n,1];
gamma[n] = mat[n,2];
}
// Population slope
for(k in 1:K){
for(n in 1:N){
beta1[(n-1)*5+k] = exp(log(A[(N-1)*5+K]) + B[(N-1)*5+K]*x1[(n-1)*5+k] + delta_H[(N-1)*5+K]*x2[(n-1)*5+k]);
mu[(n-1)*5+k] = beta0[n] + beta1[(n-1)*5+k] * pow(t_ijk[(n-1)*5+k],gamma[n]);
}
I also encountered the same problem. If constant chain includes, then R hat was NaN and
diagnostic functions such as check_Rhat was disturbed. Thus, I excluded such constant chains from the Stan code. This treatment made my Stan code be more redundant. I also want to know how to include such constant chains. I some context, Stan code is simpler if it can include such constant chains.
Hi Jean, I agree. My problem shows the estimation of such parameters were not affected by the stan model. They keep as the same when I declared them in the parameters block, even though I randomly assign them an initial value.
Here is the stan output and trace plots:
Sure. Here is the .stan file. Thank you for taking a look on my code.
// The input data.
data {
// Define variables in data
// Number of level-1 observations (an integer)
int<lower=1> N; //number of units
// Number of level-2 clusters
int<lower=1> K; //number of conditions
// RH from the data
real RH[(N-1)*5+K];
// Temp from the data
int<lower=1> T[(N-1)*5+K];
// Continuous outcome
vector[(N-1)*5+K] y_ijk;//number of the outputs
// Continuous predictor
real t_ijk[(N-1)*5+K];
}
//Prepocessing of the data
transformed data{
real x1[(N-1)*5+K];
real x2[(N-1)*5+K];
for(k in 1:K){
for(n in 1:N){
x1[(n-1)*5+k] = log(RH[(n-1)*5+k]);
x2[(n-1)*5+k] = 11605/(T[(n-1)*5+k]+273.15);
// print("x1:", x1);
}
}
}
// The parameters accepted by the model.
parameters {
// Define parameters to estimate
// Random effect
matrix[N,2] mat;
// Level-1
real<lower=1> sigma;
// Hyperparameters
vector[2] mu_mat;
real<lower=100> A;
real<lower=1> B;
real<lower=1> delta_H;
corr_matrix[2] sigma_mat;
corr_matrix[2] sigma_for_prior;
}
//Parameters processing before the postier is computed
transformed parameters{
// Random effect
real beta0;
real gamma;
row_vector[(N-1)*5+K] beta1;
row_vector[(N-1)*5+K] mu;
beta0 = mu_mat[1];
gamma = mu_mat[2];
// Population slope
for(k in 1:K){
for(n in 1:N){
beta1[(n-1)*5+k] = exp(log(A) + B*x1[(n-1)*5+k] + delta_H*x2[(n-1)*5+k]);
mu[(n-1)*5+k] = beta0 + beta1[(n-1)*5+k] * pow(t_ijk[(n-1)*5+k],gamma);
}
}
}
model {
// mu ~ cauchy(0,1e-3);
sigma ~ gamma(1e-3,1e-3);
mu_mat[1] ~ normal(0,1e-3);
mu_mat[2] ~ normal(0,1e-3);
A ~ normal(0,1e-3);
B ~ normal(0, 1e-3);
delta_H ~ normal(0,1e-3);
mat[2] ~ multi_normal(mu_mat,sigma_mat);
sigma_mat ~ wishart(3,sigma_for_prior);
sigma_for_prior ~ lkj_corr(1);
for (k in 1:K){
for (n in 1:N){
y_ijk[(n-1)*5+k] ~ lognormal(mu[(n-1)*5+k], sigma);
}
}
}
Hey yes, I think I will simply remove mat. since I need stan to output the value of mu_mat. So it would be better if I just put mu_mat ~ multi_normal with zero mean and sigma_mat.
To add to what @Christopher-Peterson said (which I agree with) in Stan the normal distribution uses the standard deviation (not the precision) so these priors basically force the parameters to be constant.
Hi Jonah, the reason I gave them the normal distribution is I was trying to give them non-informative priors. The output didn’t look good. Any ideas for the priors? Thank you.
You could do that, but then your priors would give you posteriors that were tightly concentrated around 0. I’m guessing this isn’t what you want.
Your normal(0, 0.001) priors are highly informative, with 99% of the prior probability density between -0.003 and +0.003. What are you trying to do with these parameters?
Hi Christopher, the data I used is from an acceleration degradation test. I am trying to use the degradation data to draw the posteriors sample with A, B and delta_H,etc, so that I can fit the estimation into the degradation model and predict the median lifetime.
I am using normal priors in my rjags model, so I didn’t change when I wrote Stan code. Are there any common priors which are non-informative to use in stan for the constant parameters?
Thank you very much!
That’s the issue I think. Jags uses a different parameterization of the normal. In JAGS normal(0, 0.001) (or dnorm I guess) means normal with a precision (1/variance) of 0.001. In Stan the second argument to the normal distribution is a standard deviation (sqrt(variance)). So those are very different! If you want the exact same prior in Stan then you’d need to change normal(0, 0.001) to normal(0, sqrt(1 / 0.001)) or just normal(0, sqrt(1000)). I would give that a try and hopefully it will fix the problem with the constants!