MCMC sampling does not work when execute

ghjk · September 17, 2017, 6:57am

Hi everyone, I currently have trouble executing the below stan model, but I could not detect why it keeps giving me this error message in order to fix my code. Could someone please help me resolve this issue? I got stuck on this problem for a couple of months already:((

“Error in new_CppObject_xp(fields$.module, fields$.pointer, …) :
mismatch in dimension declared and found in context; processing stage=data initialization; variable name=epsilon; position=0; dims declared=(16); dims found=(24)”

Stan Model, saved as stamodeling.stan"

data {
  int<lower=1> ncol;
  int<lower=1> nrow;
  vector[ncol] yH;
  matrix[nrow, ncol] A;
  vector[ncol] sigma_x;
  vector[ncol] sigma_y;
  vector[nrow] epsilon;
}

parameters {
   vector[ncol] yT;
}

model {
  vector[nrow] xT;
  vector[nrow] x;
  //priors
  yT ~ normal(yH, sigma_y);

  #xT = A*yT + epsilon; //creation of linear constraint

   //likelihood
   x ~ normal(A*yT+epsilon, sigma_x);
}

Execution in R

rstan_options(auto_write = TRUE)
library(rstan)
library(MASS)
library(truncnorm)

nrow = 16;
ncol = 24;
iterations = 50000;
A <- matrix(c(1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_A^1 - Done
         0,0,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_A^2 - Done
         0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_B^1 - Done
         1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_B^2 - Done
         0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_C^1 - Done
         0,1,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0, #x_C^2 - Done
         0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0, #x_D^1 - Done
         0,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, #x_D^2 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0, #x_E^1 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,1,0, #x_E^2 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0, #x_F^1 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,1,0,0, #x_F^2 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0, #x_G^2 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1, #x_G^1 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0, #x_H^1 - Done
         0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,1), #x_H^2 - Done
        nrow, ncol, byrow= TRUE);

lambda = array(sample.int(100, size = ncol, replace = TRUE),ncol);
yH <- rtruncnorm(ncol,a = 0, b = lambda+1, mean = 0, sd = 1);
sigma_x <- array(0.2, ncol);
sigma_y <- array(0.3, ncol);
epsilon <- array(0.1, ncol);

stanmodel1 <- stan_model(file = "stamodeling.stan",
                  model_name = "stanmodel1");
stanfit <- sampling(stanmodel1, data = list(ncol = ncol,nrow = nrow,
                                            yH = yH, 
                                            A = A, sigma_x = sigma_x, sigma_y = sigma_y, epsilon = epsilon)
                    ,iter=iterations, warmup = 1000, chains = 3, cores = 1);

A follow-up question: If I want to use MCMC as a sampling method, how does the syntax for stanfit change?

ahartikainen · September 17, 2017, 8:52am

Check the shape for sigma_x.

In normal(A*yT+epsilon, sigma_x)

A*yT+epsilon --> nrow
sigma_x --> ncol

ghjk · September 17, 2017, 3:57pm

Thank you Sir!! I fixed that line vector[nrow] sigma_x, as well as the initialization line epsilon <- array(0.1, ncol). However, now I encounter another type of error message, which is still at the first sampling step:

Rejecting initial value:
Error evaluating the log probability at the initial value.
Exception: normal_lpdf: Random variable[1] is nan, but must not be nan! (in ‘model23728163056_stanmodel1’ at line 24)

Initialization between (-2, 2) failed after 100 attempts.
Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] “Error in sampler$call_sampler(args_list[[i]]) : Initialization failed.”

I tried changing the value of epsilon to be greater than sigma_x (set epsilon = 0.5 rather than 0.1), and re-compile the program. Same error;p

ghjk · September 18, 2017, 2:15am

Could anyone on here please help me with this problem? I am not sure why the initialization of the sampling starts from -2 rather than 0? How could we adjust the bounds, does anyone know?

bgoodri · September 18, 2017, 2:20am

That is -2 on the unconstrained scale, which might be too extreme sometimes, but in your case you are referencing x before defining it. See the first entry of

ghjk · September 18, 2017, 2:57am

That seems to be true for me. So should I put `vector[nrow] x’ in the parameters block of the Stan model then? Do you think the main problem came from how I setup the model, not from the code itself?

bgoodri · September 18, 2017, 3:03am

If you do not have data on x, then yes it should be in the parameters block.

ghjk · September 18, 2017, 3:15am

It finally works!!! Oh man, this made my day:) Thank you so much!!!

I have some more questions, could you help please?

Do you know how I could fit the MCMC sampling, rather than the pure sampling above?
How I could plot the posterior distribution from the object stanfit?

bgoodri · September 18, 2017, 4:02am

That is MCMC sampling.

https://cran.r-project.org/web/packages/bayesplot/vignettes/

ghjk · September 18, 2017, 5:28am

Thank you so much for your help, Sir! I read the help(stan) function, and the default value of the type of sampling does not include MCMC. Instead, stan() function refers the sampling type through the parameter c = ("NUTS", "HMC", "Fixed_param"). The default is NUTS, which is not what I want, so I changed it to c=("HMC"), but this is not MCMC either. I had to use stan() function to create the object for stan as follows:

stanfit2 <- stan(file = "stamodeling.stan", data = list(ncol = ncol,nrow = nrow,
                                        yH = yH, 
                                        A = A, sigma_x = sigma_x, sigma_y = sigma_y, epsilon = epsilon)
                ,iter=iterations, warmup = 5000, chains = 4, cores = 2, refresh = 2500);

However, the problem arises when I tried using traceplot() function with the following command:

traceplot(stanfit2, pars = c("yT"), inc_warmup = TRUE)

No error message was given, but no graph shows either! Even though my Mac computer runs like crazy (I could hear the sound of the fan start kicking in…). After a while, my program R freezes.

My question: Should I use MCMCpoisson() function? And is the following syntax OK?

stanfit2 <- MCMCpoisson(file = "stamodeling.stan", data = list(ncol = ncol,nrow = nrow,
                                        yH = yH, 
                                        A = A, sigma_x = sigma_x, sigma_y = sigma_y, epsilon = epsilon)
                ,iter=iterations, warmup = 5000, chains = 4, cores = 2, refresh = 2500);

ahartikainen · September 18, 2017, 7:11am

MCMC means Markov Chain Monte Carlo. There are multiple ways to do MCMC (different samplers).

You have probably heard Gibbs sampler / Metropolis Hastings.

HMC, Hybrid / Hamiltonian Monte Carlo is one way to do MCMC. NUTS is an efficient automatic HMC algorithm, see the NUTS paper.

https://arxiv.org/abs/1111.4246

Notice that your iter is 50k (4 chains = total of 200k draws).That is normal amount of draws one takes when using inefficient algorithm. With NUTS/HMC you can use much smaller number of draws. Start with the default warm-up=1000/iter=1000=2000 and see if that is enough.

See example for workflow here.

github.com

betanalpha/knitr_case_studies/blob/master/rstan_workflow/rstan_workflow.Rmd

---
title: "Robust Statistical Workflow with RStan"
author: "Michael Betancourt"
date: "July 2017"
bibliography: rstan_workflow.bib
output:
  html_document:
    fig_caption: yes
    theme: spacelab #sandstone #spacelab #flatly
    highlight: pygments
    toc: TRUE
    toc_depth: 2
    number_sections: TRUE
    toc_float:
      smooth_scroll: FALSE
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(comment=NA)
```

This file has been truncated. show original

ghjk · September 18, 2017, 4:33pm

Thank you so much for your information, Sir! I tried with warm-up = 1000, iter = 1000 for both the stan() and sampling() functions, and I got the error message after Chain 4:

Error: is.atomic(x) is not TRUE.

I changed the number of iterations to 2000, and it worked!! Is the error because warm-up cannot be equal to number of iterations?

Another problem arises: when I tried using traceplot(stanfit, pars = c("yT"), inc_warmup = TRUE) for the object stanfit above (the one that is feeded by sampling()), I got an error message:

Error in mcmc.list(x) : Arguments must be mcmc objects

I am not sure why, because the sampling model used, by default, of sampling() function is NUTS, so stanfit should already be a mcmc object(?) I just try to obtain the graph of the posterior distribution as a continuous curve, rather than a histogram.

ahartikainen · September 18, 2017, 4:40pm

Oh, yes

warm-up + num-of-samples = iter. With default iter/2=warm-up.

See manual for more information

http://mc-stan.org/users/documentation/

bgoodri · September 18, 2017, 4:46pm

A stanfit object is not a mcmc object as defined by the coda package. You should be using the mcmc_trace function in the bayesplot package, but if you insist on using coda then you need to coerce the stanfit object to a mcmc.list object using the As.mcmc.list function.

ghjk · September 20, 2017, 7:36am

Thank you so much for your help, once again! I tried following your hint by using the following command:

mcmc_trace(as.mcmc.list(stanfit), pars = character("yT"))

However, I still got the error message: Error in mcmc.list(x) : Arguments must be mcmc objects. I tried keeping only mcmc_trace(as.mcmc.list(stanfit), but it stills fail with the same error. So confused;p

stijn · September 20, 2017, 8:11am

As.mcmc.list needs the capital at the start.

bgoodri · September 20, 2017, 12:13pm

If you are calling bayesplot functions, then it should just be mcmc_trace(as.array(stanfit), pars = "yT"). If you are calling coda functions, it should be trace(As.mcmc.list(stanfit)).

ghjk · September 20, 2017, 5:22pm

I tried this, but then s got the error: Error in select_parameters(explicit = pars, patterns = regex_pars, complete = parnames) : Some 'pars' don't match parameter names: yT

Changed it to c("yT"), still got that error.

For the coda functions, I got this error, which I have no idea where it came from:

trace(As.mcmc.list(stanfit))
Error in .primTrace(what) : argument must be a function

ghjk · September 20, 2017, 5:22pm

Thank you for your help!! I tried it out, but got the error:

  trace(As.mcmc.list(stanfit))
  Error in .primTrace(what) : argument must be a function

bgoodri · September 20, 2017, 5:36pm

Did you rename your parameters? Anyway, maybe just call it with mcmc_trace(as.array(stanfit)).

Topic		Replies	Views
Syntactically correct STAN model not working Modeling fitting-issues	7	1020	August 19, 2020
Stan model runs fine in R terminal but errors when compiled in R markdown document RStan rstan , fitting-issues	4	901	September 4, 2020
Mismatch in number dimensions error Modeling rstan , fitting-issues , specification	3	1309	November 20, 2020
MCMC sampling with high-dimensional data Modeling	3	1854	May 22, 2018
Unexpected dimension mismatch for integer vector in data initialization Modeling	4	656	June 29, 2019

MCMC sampling does not work when execute

Related topics