MCMC sampling does not work when execute

Hi everyone, I currently have trouble executing the below stan model, but I could not detect why it keeps giving me this error message in order to fix my code. Could someone please help me resolve this issue? I got stuck on this problem for a couple of months already:((

“Error in new_CppObject_xp(fields$.module, fields$.pointer, …) :
mismatch in dimension declared and found in context; processing stage=data initialization; variable name=epsilon; position=0; dims declared=(16); dims found=(24)”

Stan Model, saved as stamodeling.stan"

data {
  int<lower=1> ncol;
  int<lower=1> nrow;
  vector[ncol] yH;
  matrix[nrow, ncol] A;
  vector[ncol] sigma_x;
  vector[ncol] sigma_y;
  vector[nrow] epsilon;
}

parameters {
   vector[ncol] yT;
}

model {
  vector[nrow] xT;
  vector[nrow] x;
  //priors
  yT ~ normal(yH, sigma_y);

  #xT = A*yT + epsilon; //creation of linear constraint

   //likelihood
   x ~ normal(A*yT+epsilon, sigma_x);
}

Execution in R

rstan_options(auto_write = TRUE)
library(rstan)
library(MASS)
library(truncnorm)

nrow = 16;
ncol = 24;
iterations = 50000;
A <- matrix(c(1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_A^1 - Done
         0,0,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_A^2 - Done
         0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_B^1 - Done
         1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_B^2 - Done
         0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_C^1 - Done
         0,1,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0, #x_C^2 - Done
         0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0, #x_D^1 - Done
         0,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_D^2 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0, #x_E^1 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,1,0, #x_E^2 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0, #x_F^1 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,1,0,0, #x_F^2 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0, #x_G^2 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1, #x_G^1 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0, #x_H^1 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,1), #x_H^2 - Done
        nrow, ncol, byrow= TRUE);

lambda = array(sample.int(100, size = ncol, replace = TRUE),ncol);
yH <- rtruncnorm(ncol,a = 0, b = lambda+1, mean = 0, sd = 1);
sigma_x <- array(0.2, ncol);
sigma_y <- array(0.3, ncol);
epsilon <- array(0.1, ncol);

stanmodel1 <- stan_model(file = "stamodeling.stan",
                  model_name = "stanmodel1");
stanfit <- sampling(stanmodel1, data = list(ncol = ncol,nrow = nrow,
                                            yH = yH, 
                                            A = A, sigma_x = sigma_x, sigma_y = sigma_y, epsilon = epsilon)
                    ,iter=iterations, warmup = 1000, chains = 3, cores = 1);

A follow-up question: If I want to use MCMC as a sampling method, how does the syntax for stanfit change?

Check the shape for sigma_x.

In normal(A*yT+epsilon, sigma_x)

A*yT+epsilon --> nrow
sigma_x --> ncol

1 Like

Thank you Sir!! I fixed that line vector[nrow] sigma_x, as well as the initialization line epsilon <- array(0.1, ncol). However, now I encounter another type of error message, which is still at the first sampling step:

Rejecting initial value:
Error evaluating the log probability at the initial value.
Exception: normal_lpdf: Random variable[1] is nan, but must not be nan! (in ‘model23728163056_stanmodel1’ at line 24)

Initialization between (-2, 2) failed after 100 attempts.
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] “Error in sampler$call_sampler(args_list[[i]]) : Initialization failed.”

I tried changing the value of epsilon to be greater than sigma_x (set epsilon = 0.5 rather than 0.1), and re-compile the program. Same error;p

Could anyone on here please help me with this problem? I am not sure why the initialization of the sampling starts from -2 rather than 0? How could we adjust the bounds, does anyone know?

That is -2 on the unconstrained scale, which might be too extreme sometimes, but in your case you are referencing x before defining it. See the first entry of

1 Like

That seems to be true for me. So should I put `vector[nrow] x’ in the parameters block of the Stan model then? Do you think the main problem came from how I setup the model, not from the code itself?

If you do not have data on x, then yes it should be in the parameters block.

It finally works!!! Oh man, this made my day:) Thank you so much!!!

I have some more questions, could you help please?

  1. Do you know how I could fit the MCMC sampling, rather than the pure sampling above?
  2. How I could plot the posterior distribution from the object stanfit?

That is MCMC sampling.

https://cran.r-project.org/web/packages/bayesplot/vignettes/

Thank you so much for your help, Sir! I read the help(stan) function, and the default value of the type of sampling does not include MCMC. Instead, stan() function refers the sampling type through the parameter c = ("NUTS", "HMC", "Fixed_param"). The default is NUTS, which is not what I want, so I changed it to c=("HMC"), but this is not MCMC either. I had to use stan() function to create the object for stan as follows:

stanfit2 <- stan(file = "stamodeling.stan", data = list(ncol = ncol,nrow = nrow,
                                        yH = yH, 
                                        A = A, sigma_x = sigma_x, sigma_y = sigma_y, epsilon = epsilon)
                ,iter=iterations, warmup = 5000, chains = 4, cores = 2, refresh = 2500);

However, the problem arises when I tried using traceplot() function with the following command:

traceplot(stanfit2, pars = c("yT"), inc_warmup = TRUE)

No error message was given, but no graph shows either! Even though my Mac computer runs like crazy (I could hear the sound of the fan start kicking in…). After a while, my program R freezes.

My question: Should I use MCMCpoisson() function? And is the following syntax OK?

stanfit2 <- MCMCpoisson(file = "stamodeling.stan", data = list(ncol = ncol,nrow = nrow,
                                        yH = yH, 
                                        A = A, sigma_x = sigma_x, sigma_y = sigma_y, epsilon = epsilon)
                ,iter=iterations, warmup = 5000, chains = 4, cores = 2, refresh = 2500);

MCMC means Markov Chain Monte Carlo. There are multiple ways to do MCMC (different samplers).

You have probably heard Gibbs sampler / Metropolis Hastings.

HMC, Hybrid / Hamiltonian Monte Carlo is one way to do MCMC. NUTS is an efficient automatic HMC algorithm, see the NUTS paper.


https://arxiv.org/abs/1111.4246

Notice that your iter is 50k (4 chains = total of 200k draws).That is normal amount of draws one takes when using inefficient algorithm. With NUTS/HMC you can use much smaller number of draws. Start with the default warm-up=1000/iter=1000=2000 and see if that is enough.

See example for workflow here.

1 Like

Thank you so much for your information, Sir! I tried with warm-up = 1000, iter = 1000 for both the stan() and sampling() functions, and I got the error message after Chain 4:

Error: is.atomic(x) is not TRUE.

I changed the number of iterations to 2000, and it worked!! Is the error because warm-up cannot be equal to number of iterations?

Another problem arises: when I tried using traceplot(stanfit, pars = c("yT"), inc_warmup = TRUE) for the object stanfit above (the one that is feeded by sampling()), I got an error message:

Error in mcmc.list(x) : Arguments must be mcmc objects

I am not sure why, because the sampling model used, by default, of sampling() function is NUTS, so stanfit should already be a mcmc object(?) I just try to obtain the graph of the posterior distribution as a continuous curve, rather than a histogram.

Oh, yes

warm-up + num-of-samples = iter. With default iter/2=warm-up.

See manual for more information

http://mc-stan.org/users/documentation/

1 Like

A stanfit object is not a mcmc object as defined by the coda package. You should be using the mcmc_trace function in the bayesplot package, but if you insist on using coda then you need to coerce the stanfit object to a mcmc.list object using the As.mcmc.list function.

1 Like

Thank you so much for your help, once again! I tried following your hint by using the following command:

mcmc_trace(as.mcmc.list(stanfit), pars = character("yT"))

However, I still got the error message: Error in mcmc.list(x) : Arguments must be mcmc objects. I tried keeping only mcmc_trace(as.mcmc.list(stanfit), but it stills fail with the same error. So confused;p

As.mcmc.list needs the capital at the start.

1 Like

If you are calling bayesplot functions, then it should just be mcmc_trace(as.array(stanfit), pars = "yT"). If you are calling coda functions, it should be trace(As.mcmc.list(stanfit)).

1 Like

I tried this, but then s got the error: Error in select_parameters(explicit = pars, patterns = regex_pars, complete = parnames) : Some 'pars' don't match parameter names: yT

Changed it to c("yT"), still got that error.

For the coda functions, I got this error, which I have no idea where it came from:

trace(As.mcmc.list(stanfit))
Error in .primTrace(what) : argument must be a function

Thank you for your help!! I tried it out, but got the error:

  trace(As.mcmc.list(stanfit))
  Error in .primTrace(what) : argument must be a function

Did you rename your parameters? Anyway, maybe just call it with mcmc_trace(as.array(stanfit)).

1 Like