MCMC sampling does not work when execute

fitting-issues

#1

Hi everyone, I currently have trouble executing the below stan model, but I could not detect why it keeps giving me this error message in order to fix my code. Could someone please help me resolve this issue? I got stuck on this problem for a couple of months already:((

“Error in new_CppObject_xp(fields$.module, fields$.pointer, …) :
mismatch in dimension declared and found in context; processing stage=data initialization; variable name=epsilon; position=0; dims declared=(16); dims found=(24)”

Stan Model, saved as stamodeling.stan"

data {
  int<lower=1> ncol;
  int<lower=1> nrow;
  vector[ncol] yH;
  matrix[nrow, ncol] A;
  vector[ncol] sigma_x;
  vector[ncol] sigma_y;
  vector[nrow] epsilon;
}

parameters {
   vector[ncol] yT;
}

model {
  vector[nrow] xT;
  vector[nrow] x;
  //priors
  yT ~ normal(yH, sigma_y);

  #xT = A*yT + epsilon; //creation of linear constraint

   //likelihood
   x ~ normal(A*yT+epsilon, sigma_x);
}

Execution in R

rstan_options(auto_write = TRUE)
library(rstan)
library(MASS)
library(truncnorm)

nrow = 16;
ncol = 24;
iterations = 50000;
A <- matrix(c(1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_A^1 - Done
         0,0,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_A^2 - Done
         0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_B^1 - Done
         1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_B^2 - Done
         0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_C^1 - Done
         0,1,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0, #x_C^2 - Done
         0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0, #x_D^1 - Done
         0,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_D^2 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0, #x_E^1 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,1,0, #x_E^2 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0, #x_F^1 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,1,0,0, #x_F^2 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0, #x_G^2 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1, #x_G^1 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0, #x_H^1 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,1), #x_H^2 - Done
        nrow, ncol, byrow= TRUE);

lambda = array(sample.int(100, size = ncol, replace = TRUE),ncol);
yH <- rtruncnorm(ncol,a = 0, b = lambda+1, mean = 0, sd = 1);
sigma_x <- array(0.2, ncol);
sigma_y <- array(0.3, ncol);
epsilon <- array(0.1, ncol);

stanmodel1 <- stan_model(file = "stamodeling.stan",
                  model_name = "stanmodel1");
stanfit <- sampling(stanmodel1, data = list(ncol = ncol,nrow = nrow,
                                            yH = yH, 
                                            A = A, sigma_x = sigma_x, sigma_y = sigma_y, epsilon = epsilon)
                    ,iter=iterations, warmup = 1000, chains = 3, cores = 1);

A follow-up question: If I want to use MCMC as a sampling method, how does the syntax for stanfit change?


#2

Check the shape for sigma_x.

In normal(A*yT+epsilon, sigma_x)

A*yT+epsilon --> nrow
sigma_x --> ncol


#3

Thank you Sir!! I fixed that line vector[nrow] sigma_x, as well as the initialization line epsilon <- array(0.1, ncol). However, now I encounter another type of error message, which is still at the first sampling step:

Rejecting initial value:
Error evaluating the log probability at the initial value.
Exception: normal_lpdf: Random variable[1] is nan, but must not be nan! (in ‘model23728163056_stanmodel1’ at line 24)

Initialization between (-2, 2) failed after 100 attempts.
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] “Error in sampler$call_sampler(args_list[[i]]) : Initialization failed.”

I tried changing the value of epsilon to be greater than sigma_x (set epsilon = 0.5 rather than 0.1), and re-compile the program. Same error;p


#4

Could anyone on here please help me with this problem? I am not sure why the initialization of the sampling starts from -2 rather than 0? How could we adjust the bounds, does anyone know?


#5

That is -2 on the unconstrained scale, which might be too extreme sometimes, but in your case you are referencing x before defining it. See the first entry of


#6

That seems to be true for me. So should I put `vector[nrow] x’ in the parameters block of the Stan model then? Do you think the main problem came from how I setup the model, not from the code itself?


#7

If you do not have data on x, then yes it should be in the parameters block.


#8

It finally works!!! Oh man, this made my day:) Thank you so much!!!

I have some more questions, could you help please?

  1. Do you know how I could fit the MCMC sampling, rather than the pure sampling above?
  2. How I could plot the posterior distribution from the object stanfit?

#9

That is MCMC sampling.

https://cran.r-project.org/web/packages/bayesplot/vignettes/


#10

Thank you so much for your help, Sir! I read the help(stan) function, and the default value of the type of sampling does not include MCMC. Instead, stan() function refers the sampling type through the parameter c = ("NUTS", "HMC", "Fixed_param"). The default is NUTS, which is not what I want, so I changed it to c=("HMC"), but this is not MCMC either. I had to use stan() function to create the object for stan as follows:

stanfit2 <- stan(file = "stamodeling.stan", data = list(ncol = ncol,nrow = nrow,
                                        yH = yH, 
                                        A = A, sigma_x = sigma_x, sigma_y = sigma_y, epsilon = epsilon)
                ,iter=iterations, warmup = 5000, chains = 4, cores = 2, refresh = 2500);

However, the problem arises when I tried using traceplot() function with the following command:

traceplot(stanfit2, pars = c("yT"), inc_warmup = TRUE)

No error message was given, but no graph shows either! Even though my Mac computer runs like crazy (I could hear the sound of the fan start kicking in…). After a while, my program R freezes.

My question: Should I use MCMCpoisson() function? And is the following syntax OK?

stanfit2 <- MCMCpoisson(file = "stamodeling.stan", data = list(ncol = ncol,nrow = nrow,
                                        yH = yH, 
                                        A = A, sigma_x = sigma_x, sigma_y = sigma_y, epsilon = epsilon)
                ,iter=iterations, warmup = 5000, chains = 4, cores = 2, refresh = 2500);

#11

MCMC means Markov Chain Monte Carlo. There are multiple ways to do MCMC (different samplers).

You have probably heard Gibbs sampler / Metropolis Hastings.

HMC, Hybrid / Hamiltonian Monte Carlo is one way to do MCMC. NUTS is an efficient automatic HMC algorithm, see the NUTS paper.


https://arxiv.org/abs/1111.4246

Notice that your iter is 50k (4 chains = total of 200k draws).That is normal amount of draws one takes when using inefficient algorithm. With NUTS/HMC you can use much smaller number of draws. Start with the default warm-up=1000/iter=1000=2000 and see if that is enough.

See example for workflow here.


#12

Thank you so much for your information, Sir! I tried with warm-up = 1000, iter = 1000 for both the stan() and sampling() functions, and I got the error message after Chain 4:

Error: is.atomic(x) is not TRUE.

I changed the number of iterations to 2000, and it worked!! Is the error because warm-up cannot be equal to number of iterations?

Another problem arises: when I tried using traceplot(stanfit, pars = c("yT"), inc_warmup = TRUE) for the object stanfit above (the one that is feeded by sampling()), I got an error message:

Error in mcmc.list(x) : Arguments must be mcmc objects

I am not sure why, because the sampling model used, by default, of sampling() function is NUTS, so stanfit should already be a mcmc object(?) I just try to obtain the graph of the posterior distribution as a continuous curve, rather than a histogram.


#13

Oh, yes

warm-up + num-of-samples = iter. With default iter/2=warm-up.

See manual for more information

http://mc-stan.org/users/documentation/


#14

A stanfit object is not a mcmc object as defined by the coda package. You should be using the mcmc_trace function in the bayesplot package, but if you insist on using coda then you need to coerce the stanfit object to a mcmc.list object using the As.mcmc.list function.


#15

Thank you so much for your help, once again! I tried following your hint by using the following command:

mcmc_trace(as.mcmc.list(stanfit), pars = character("yT"))

However, I still got the error message: Error in mcmc.list(x) : Arguments must be mcmc objects. I tried keeping only mcmc_trace(as.mcmc.list(stanfit), but it stills fail with the same error. So confused;p


#16

As.mcmc.list needs the capital at the start.


#17

If you are calling bayesplot functions, then it should just be mcmc_trace(as.array(stanfit), pars = "yT"). If you are calling coda functions, it should be trace(As.mcmc.list(stanfit)).


#18

I tried this, but then s got the error: Error in select_parameters(explicit = pars, patterns = regex_pars, complete = parnames) : Some 'pars' don't match parameter names: yT

Changed it to c("yT"), still got that error.

For the coda functions, I got this error, which I have no idea where it came from:

trace(As.mcmc.list(stanfit))
Error in .primTrace(what) : argument must be a function


#19

Thank you for your help!! I tried it out, but got the error:

  trace(As.mcmc.list(stanfit))
  Error in .primTrace(what) : argument must be a function

#20

Did you rename your parameters? Anyway, maybe just call it with mcmc_trace(as.array(stanfit)).