Estimating parameters of a SEIR model

Ilia · May 16, 2020, 10:19pm

Recently I have opened a topic here seeking a solution to the error I was getting while running stan. So far, I have managed to polish my code using the advice I got from that topic. Here is my stan code:

cat(
  '
functions {
  
  real[] SI(real t,
            real[] y,
            real[] params,
            real[] x_r,
            int[] x_i) {
      
      real dydt[8];
      
    
      real phi = params[1];
      real pe = params[2];
      real pie = params[3];
      real pee = params[4]; 
  
      real beta = params[5];
      real mu = params[6]; 
      real betam = params[7]; 
      real tau = params[8]; 
      real alpha = params[9]; 
      real Ts = params[10]; 
      real Ta = params[11]; 
      real Tm = params[12]; 
      real Ti = params[13]; 
      
    dydt[1] = - mu * beta * y[1] * y[6] - beta * y[1] * y[5] - beta * y[8] * y[1] - betam * y[7] * y[1];
    dydt[2] = mu * beta * y[1] * y[6] + (1 - phi) * beta * y[1] * y[5] + (1 - phi) * beta * y[8] * y[1] + (1 - phi) * betam * y[7] * y[1] - y[2] / tau;
    dydt[3] = pe * phi * beta * y[1] * y[5] + pie * phi * beta * y[8] * y[1] + pee * phi * betam * y[7] * y[1] - y[3] / tau;
    dydt[4] = (1 - pe) * phi * beta * y[1] * y[5] + (1 - pie) * phi * beta * y[8] * y[1] + (1 - pee) * phi * betam * y[7] * y[1] - y[4] / tau;
    dydt[5] = alpha * y[2] / tau - y[5] / Ts;
    dydt[6] = (1 - alpha) * y[2] / tau - y[6] / Ta;
    dydt[7] = y[3] / tau - y[7] / Tm;
    dydt[8] = y[4] / tau - y[8] / Ti;
      
      return dydt;
    }
  
}

data {
  int<lower = 1> n_obs; // Number of days sampled
  int<lower = 1> n_params; // Number of model parameters
  int<lower = 1> n_difeq; // Number of differential equations in the system
  int<lower = 1> n_sample; 
  int y[n_obs]; // The binomially distributed data
  real t0; // Initial time point (zero)
  real ts[n_obs]; // Time points that were sampled
  
}

transformed data {
  real x_r[0];
  int x_i[0];
}

parameters {
  real<lower=0, upper=1> params1[4];
  real<lower = 0> params2[9]; 
  real<lower = 0, upper = 1> S0; // Initial fraction of susceptible
}

transformed parameters{
  real y_hat[n_obs, n_difeq]; // Output from the ODE solver
  real y0[n_difeq]; // Initial conditions for both S and I

  y0[1] = 1 - 0.02 - 0.001;
  y0[2] = 0.02;
  y0[3] = 0;
  y0[4] = 0;
  y0[5] = 0.001;
  y0[6] = 0;
  y0[7] = 0;
  y0[8] = 0;
  
  y_hat = integrate_ode_rk45(SI, y0, t0, ts, append_array(params1, params2), x_r, x_i);
  y_hat = 1e-6 + (1-2e-6)*y_hat;
  
}

model {
  phi ~ normal(0, 1);  //phi
  pe ~ normal(0, 1); //pe
  pie ~ normal(0, 1); //pie
  pee ~ normal(0, 1); //pee
  
  beta ~ normal(0.5, 0.9); //beta
  mu ~ normal(0.2, 0.55); //mu
  betam ~ normal(0, 1);  //beta_m
  tau ~ normal(1, 20); //tau
  alpha ~ normal(0.1, 0.99); //alpha
  Ts ~ normal(3, 20); //Ts
  Ta ~ normal(1, 20); //Ta
  Tm ~ normal(1, 20); //Tm
  Ti ~ normal(0.5, 8); //Ti
  S0 ~ normal(0.5, 0.5); //constrained to be 0-1.
  
  y ~ binomial(n_sample, y_hat[,5]); //y_hat[,5] are the fractions infected from the ODE solver
  
}

generated quantities {
  real R_0; 
  R_0 = beta * Ta * (alpha * Ts * (1 - phi) / Ta + mu * (1 - alpha));
  
}

', 
file = "SI_fit.stan", sep="", fill=T)

As you can see I have 8 ODEs and 13 parameters to estimate, 4 of these parameters namely phi, pe, pie and pee are probabilities and so should be between 0 and 1. I was suggested to distinguish my parameters - those that are probabilities and those which are not, so I now have:

parameters {
  real<lower=0, upper=1> params1[4];
  real<lower = 0> params2[9]; 
  ...
}

and then in transformed parameters my y_hat is:

transformed parameters{
...
  
  y_hat = integrate_ode_rk45(SI, y0, t0, ts, append_array(params1, params2), x_r, x_i);
  y_hat = 1e-6 + (1-2e-6)*y_hat;
  
}

where I am appending params1 (probabilities) and params2 (the rest of the parameters). As you can see in the stan code now I have 2 names for my parameters: params then in parameters I separate them into params1 and params2 I wonder if this makes any issue on running the stan code as it simply stucks at running stan(). I am adding my whole code to this topic so one can run it at ease. Any help would be much appreciated by this stan newbie.

I am also dubious on my definition of model on R I defined:

stan_d = list(n_obs = N,
              n_params = 13,
              n_difeq = 8,
              n_sample = 66650000,
              y = cases,
              t0 = 0,
              ts = sample_time)

where N is the length of my list in days, I wonder if I need the n_sample which is the total population?

Issue1.R (4.6 KB)

maxbiostat · May 16, 2020, 11:18pm

I didn’t quite get what the problem is. Can you provide more detail, perhaps showing the output of the stan() call?

Ilia · May 16, 2020, 11:50pm

Thanks for your time, so in the beginning of the stan function I only have params defined. I wonder if separating params into params1 and params2 is allowed the way I did it? I mean I have not seen such indexing in other languages. I have 13 parameters and separated them as params1[4] (which are probabilities) and params2[9]. So I wonder how stan is keeping track of them? As later on at stan_d() I have n_params = 13 without separation. This is my confusion.

As for the rest of the code when I run it I get:
Error in close.connection(zz) : cannot close 'message' sink connection
happening at test = stan() when I run debugging I get,

function (file, model_code = "", model_name = "anon_model", 
  verbose = FALSE, obfuscate_model_name = TRUE, allow_undefined = FALSE, 
  isystem = c(if (!missing(file)) dirname(file), getwd())) 
{
  model_name2 <- deparse(substitute(model_code))
  if (is.null(attr(model_code, "model_name2"))) 
    attr(model_code, "model_name2") <- model_name2
  model_code <- get_model_strcode(file, model_code)
  if (missing(model_name) || is.null(model_name)) 
    model_name <- attr(model_code, "model_name2")
  if (verbose) 
    cat("\nTRANSLATING MODEL '", model_name, "' FROM Stan CODE TO C++ CODE NOW.\n", 
      sep = "")
  SUCCESS_RC <- 0
  EXCEPTION_RC <- -1
  PARSE_FAIL_RC <- -2
  model_cppname <- legitimate_model_name(model_name, obfuscate_name = obfuscate_model_name)
  tf <- tempfile(fileext = ".parser")
  zz <- base::file(tf, open = "wt")
  on.exit(close(zz), add = TRUE)
  sink(zz, type = "message")
  r <- .Call(CPP_stanc280, model_code, model_cppname, allow_undefined, 
    isystem)
  sink(type = "message")
  close(zz)
  on.exit(NULL)
  r$model_name <- model_name
  r$model_code <- model_code
  if (is.null(r)) {
    stop(paste("failed to run stanc for model '", model_name, 
      "' and no error message provided", sep = ""))
  }
  else if (r$status == PARSE_FAIL_RC) {
    stop(paste("failed to parse Stan model '", model_name, 
      "' and no error message provided"), sep = "")
  }
  else if (r$status == EXCEPTION_RC) {
    lapply(r$msg, function(x) message(x))
    error_msg <- paste("failed to parse Stan model '", model_name, 
      "' due to the above error.", sep = "")
    stop(error_msg)
  }
  if (r$status == SUCCESS_RC && verbose) 
    cat("successful in parsing the Stan model '", model_name, 
      "'.\n", sep = "")
  msg <- readLines(tf)
  msg <- grep("Unknown variable", msg, value = TRUE, invert = TRUE)
  msg <- grep("aliasing", msg, value = TRUE, invert = TRUE)
  if (length(msg) > 2L) {
    cat(msg, sep = "\n")
  }
  else {
    try(file.remove(tf), silent = TRUE)
  }
  r$status = !as.logical(r$status)
  return(r)
}

and then:

function (file, model_name = "anon_model", model_code = "", 
  stanc_ret = NULL, boost_lib = NULL, eigen_lib = NULL, save_dso = TRUE, 
  verbose = FALSE, auto_write = rstan_options("auto_write"), 
  obfuscate_model_name = TRUE, allow_undefined = FALSE, includes = NULL, 
  isystem = c(if (!missing(file)) dirname(file), getwd())) 
{
  if (is.null(stanc_ret)) {
    model_name2 <- deparse(substitute(model_code))
    if (is.null(attr(model_code, "model_name2"))) 
      attr(model_code, "model_name2") <- model_name2
    if (missing(model_name)) 
      model_name <- NULL
    if (missing(file)) {
      tf <- tempfile()
      writeLines(model_code, con = tf)
      file <- file.path(dirname(tf), paste0(tools::md5sum(tf), 
        ".stan"))
      if (!file.exists(file)) 
        file.rename(from = tf, to = file)
      else file.remove(tf)
    }
    else file <- normalizePath(file)
    stanc_ret <- stanc(file = file, model_code = model_code, 
      model_name = model_name, verbose = verbose, obfuscate_model_name = obfuscate_model_name, 
      allow_undefined = allow_undefined, isystem = isystem)
    model_re <- "(^[[:alnum:]]{2,}.*$)|(^[A-E,G-S,U-Z,a-z].*$)|(^[F,T].+)"
    if (!is.null(model_name)) 
      if (!grepl(model_re, model_name)) 
        stop("model name must match ", model_re)
    S4_objects <- apropos(model_re, mode = "S4", ignore.case = FALSE)
    if (length(S4_objects) > 0) {
      e <- environment()
      stanfits <- sapply(mget(S4_objects, envir = e, inherits = TRUE), 
        FUN = is, class2 = "stanfit")
      stanmodels <- sapply(mget(S4_objects, envir = e, 
        inherits = TRUE), FUN = is, class2 = "stanmodel")
      if (any(stanfits)) 
        for (i in names(which(stanfits))) {
          obj <- get_stanmodel(get(i, envir = e, inherits = TRUE))
          if (identical(obj@model_code[1], stanc_ret$model_code[1])) 
            return(obj)
        }
      if (any(stanmodels)) 
        for (i in names(which(stanmodels))) {
          obj <- get(i, envir = e, inherits = TRUE)
          if (identical(obj@model_code[1], stanc_ret$model_code[1])) 
            return(obj)
        }
    }
    mtime <- file.info(file)$mtime
    file.rds <- gsub("stan$", "rds", file)
    md5 <- tools::md5sum(file)
    if (!file.exists(file.rds)) {
      file.rds <- file.path(tempdir(), paste0(md5, ".rds"))
    }
    if (!file.exists(file.rds) || (mtime.rds <- file.info(file.rds)$mtime) < 
      as.POSIXct(packageDescription("rstan")$Date) || 
      !is(obj <- readRDS(file.rds), "stanmodel") || !is_sm_valid(obj) || 
      (!identical(stanc_ret$model_code, obj@model_code) && 
        is.null(message("hash mismatch so recompiling; make sure Stan code ends with a blank line"))) || 
      avoid_crash(obj@dso@.CXXDSOMISC$module) && is.null(message("recompiling to avoid crashing R session"))) {
    }
    else return(invisible(obj))
  }
  if (!is.list(stanc_ret)) {
    stop("stanc_ret needs to be the returned object from stanc.")
  }
  m <- match(c("cppcode", "model_name", "status"), names(stanc_ret))
  if (any(is.na(m))) {
    stop("stanc_ret does not have element `cppcode', `model_name', and `status'")
  }
  else {
    if (!stanc_ret$status) 
      stop("stanc_ret is not a successfully returned list from stanc")
  }
  if (.Platform$OS.type != "windows") {
    CXX <- get_CXX()
    if (!is.null(attr(CXX, "status")) || nchar(CXX) == 0) {
      WIKI <- "https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started"
      warning(paste("C++ compiler not found on system. If absent, see\n", 
        WIKI))
    }
    else if (grepl("69", CXX, fixed = TRUE)) 
      warning("You may need to launch Xcode once to accept its license")
  }
  else CXX <- "g++"
  model_cppname <- stanc_ret$model_cppname
  model_name <- stanc_ret$model_name
  model_code <- stanc_ret$model_code
  model_cppcode <- stanc_ret$cppcode
  inc <- paste("#define STAN__SERVICES__COMMAND_HPP", "#include <boost/integer/integer_log2.hpp>\n", 
    "#include <rstan/rstaninc.hpp>\n", if (is.null(includes)) 
      model_cppcode
    else sub("(class[[:space:]]+[A-Za-z_][A-Za-z0-9_]*[[:space:]]*)", 
      paste(includes, "\\1"), model_cppcode), get_Rcpp_module_def_code(model_cppname), 
    sep = "")
  if (verbose && interactive()) 
    cat("COMPILING THE C++ CODE FOR MODEL '", model_name, 
      "' NOW.\n", sep = "")
  if (verbose) 
    cat(system_info(), "\n")
  if (!is.null(boost_lib)) {
    old.boost_lib <- rstan_options(boost_lib = boost_lib)
    on.exit(rstan_options(boost_lib = old.boost_lib))
  }
  if (!file.exists(rstan_options("boost_lib"))) 
    stop("Boost not found; call install.packages('BH')")
  if (!is.null(eigen_lib)) {
    old.eigen_lib <- rstan_options(eigen_lib = eigen_lib)
    on.exit(rstan_options(eigen_lib = old.eigen_lib), add = TRUE)
  }
  if (!file.exists(rstan_options("eigen_lib"))) 
    stop("Eigen not found; call install.packages('RcppEigen')")
  dso <- cxxfunctionplus(signature(), body = paste(" return Rcpp::wrap(\"", 
    model_name, "\");", sep = ""), includes = inc, plugin = "rstan", 
    save_dso = save_dso | auto_write, module_name = paste("stan_fit4", 
      model_cppname, "_mod", sep = ""), verbose = verbose)
  if (FALSE && grepl("#include", model_code, fixed = TRUE)) {
    model_code <- scan(text = model_code, what = character(), 
      sep = "\n", quiet = TRUE)
    model_code <- gsub("#include /", "#include ", model_code, 
      fixed = TRUE)
    model_code <- gsub("#include (.*$)", "#include \"\\1\"", 
      model_code)
    unprocessed <- tempfile(fileext = ".stan")
    processed <- tempfile(fileext = ".stan")
    on.exit(file.remove(c(unprocessed, processed)))
    writeLines(model_code, con = unprocessed)
    ARGS <- paste("-E -nostdinc -x c++ -P -C", paste("-I", 
      isystem, " ", collapse = ""), "-o", processed, unprocessed)
    pkgbuild::with_build_tools(system2(CXX, args = ARGS), 
      required = rstan_options("required") && identical(Sys.getenv("WINDOWS"), 
        "TRUE") && !identical(Sys.getenv("R_PACKAGE_SOURCE"), 
        ""))
    if (file.exists(processed)) 
      model_code <- paste(readLines(processed), collapse = "\n")
  }
  obj <- new("stanmodel", model_name = model_name, model_code = model_code, 
    dso = dso, mk_cppmodule = mk_cppmodule, model_cpp = list(model_cppname = model_cppname, 
      model_cppcode = model_cppcode))
  if (missing(file) || (file.access(dirname(file), mode = 2) != 
    0) || !isTRUE(auto_write)) {
    tf <- tempfile()
    writeLines(model_code, con = tf)
    file <- file.path(tempdir(), paste0(tools::md5sum(tf), 
      ".stan"))
    if (!file.exists(file)) 
      file.rename(from = tf, to = file)
    else file.remove(tf)
    saveRDS(obj, file = gsub("stan$", "rds", file))
  }
  else if (isTRUE(auto_write)) {
    file <- gsub("stan$", "rds", file)
    if (file.exists(file)) {
      rds <- try(readRDS(file), silent = TRUE)
      if (!is(rds, "stanmodel")) 
        warning(rds, " exists but is not a 'stanmodel' so not overwriting")
      else saveRDS(obj, file = file)
    }
    else saveRDS(obj, file = file)
  }
  invisible(obj)
}

this is using debug(rstan::stanc)

maxbiostat · May 17, 2020, 12:48am

Yep, seems to be perfectly valid syntax.

Your program has a few issues, however. First, you need to define things like phi, beta, etc, in the parameters block. I’ve done that for you this time around. Secondly, you cannot add the numerical correction

  y_hat = integrate_ode_rk45(SI, y0, t0, ts, {phi, pe, pie, pee, beta, mu, betam, tau, alpha, Ts, Ta, Tm, Ti}, x_r, x_i);
  y_hat = 1e-6 + (1-2e-6)*y_hat;

I have hacked this monstrosity

  y_hat = integrate_ode_rk45(SI, y0, t0, ts, {phi, pe, pie, pee, beta, mu, betam, tau, alpha, Ts, Ta, Tm, Ti}, x_r, x_i);
  for(i in 1:n_obs){
    for(j in 1:n_difeq){
       y_hat[i, j] = 1e-6 + (1-2e-6)*y_hat[i, j];
    }
  }

and it will suffice. Folks like @bbbales2 and @nhuurre will know more efficient/elegant ways of adding the numerical correction.

Full program:

functions {
  
  real[] SI(real t,
            real[] y,
            real[] params,
            real[] x_r,
            int[] x_i) {
      
      real dydt[8];
      
    
      real phi = params[1];
      real pe = params[2];
      real pie = params[3];
      real pee = params[4]; 
  
      real beta = params[5];
      real mu = params[6]; 
      real betam = params[7]; 
      real tau = params[8]; 
      real alpha = params[9]; 
      real Ts = params[10]; 
      real Ta = params[11]; 
      real Tm = params[12]; 
      real Ti = params[13]; 
      
    dydt[1] = - mu * beta * y[1] * y[6] - beta * y[1] * y[5] - beta * y[8] * y[1] - betam * y[7] * y[1];
    dydt[2] = mu * beta * y[1] * y[6] + (1 - phi) * beta * y[1] * y[5] + (1 - phi) * beta * y[8] * y[1] + (1 - phi) * betam * y[7] * y[1] - y[2] / tau;
    dydt[3] = pe * phi * beta * y[1] * y[5] + pie * phi * beta * y[8] * y[1] + pee * phi * betam * y[7] * y[1] - y[3] / tau;
    dydt[4] = (1 - pe) * phi * beta * y[1] * y[5] + (1 - pie) * phi * beta * y[8] * y[1] + (1 - pee) * phi * betam * y[7] * y[1] - y[4] / tau;
    dydt[5] = alpha * y[2] / tau - y[5] / Ts;
    dydt[6] = (1 - alpha) * y[2] / tau - y[6] / Ta;
    dydt[7] = y[3] / tau - y[7] / Tm;
    dydt[8] = y[4] / tau - y[8] / Ti;
      
      return dydt;
    }
  
}

data {
  int<lower = 1> n_obs; // Number of days sampled
  int<lower = 1> n_params; // Number of model parameters
  int<lower = 1> n_difeq; // Number of differential equations in the system
  int<lower = 1> n_sample; 
  int y[n_obs]; // The binomially distributed data
  real t0; // Initial time point (zero)
  real ts[n_obs]; // Time points that were sampled
  
}

transformed data {
  real x_r[0];
  int x_i[0];
}

parameters {
 real<lower = 0, upper = 1> phi;
 real<lower = 0, upper = 1> pe;
 real<lower = 0, upper = 1> pie;
 real<lower = 0, upper = 1> pee;
 real<lower = 0> beta;
 real<lower = 0> mu;
 real<lower = 0>betam;
 real<lower = 0> tau;
 real<lower = 0> alpha;
 real<lower = 0> Ts;
 real<lower = 0> Ta;
 real<lower = 0> Tm;
 real<lower = 0> Ti;
  real<lower = 0, upper = 1> S0; // Initial fraction of susceptible
}

transformed parameters{
  real y_hat[n_obs, n_difeq]; // Output from the ODE solver
  real y0[n_difeq]; // Initial conditions for both S and I
  real<lower=0, upper=1> params1[4];
  real<lower = 0> params2[9]; 

  y0[1] = 1 - 0.02 - 0.001;
  y0[2] = 0.02;
  y0[3] = 0;
  y0[4] = 0;
  y0[5] = 0.001;
  y0[6] = 0;
  y0[7] = 0;
  y0[8] = 0;
  
  y_hat = integrate_ode_rk45(SI, y0, t0, ts, {phi, pe, pie, pee, beta, mu, betam, tau, alpha, Ts, Ta, Tm, Ti}, x_r, x_i);
  for(i in 1:n_obs){
    for(j in 1:n_difeq){
       y_hat[i, j] = 1e-6 + (1-2e-6)*y_hat[i, j];
    }
  }
  
}

model {
  phi ~ normal(0, 1);  //phi
  pe ~ normal(0, 1); //pe
  pie ~ normal(0, 1); //pie
  pee ~ normal(0, 1); //pee
  
  beta ~ normal(0.5, 0.9); //beta
  mu ~ normal(0.2, 0.55); //mu
  betam ~ normal(0, 1);  //beta_m
  tau ~ normal(1, 20); //tau
  alpha ~ normal(0.1, 0.99); //alpha
  Ts ~ normal(3, 20); //Ts
  Ta ~ normal(1, 20); //Ta
  Tm ~ normal(1, 20); //Tm
  Ti ~ normal(0.5, 8); //Ti
  S0 ~ normal(0.5, 0.5); //constrained to be 0-1.
  
  y ~ binomial(n_sample, y_hat[,5]); //y_hat[,5] are the fractions infected from the ODE solver
  
}

generated quantities {
  real R_0; 
  R_0 = beta * Ta * (alpha * Ts * (1 - phi) / Ta + mu * (1 - alpha));
  
}

Hope this helps.

PS: I found all of those problems using Rstudio’s Stan code checking. I have no idea what the errors you report are. What interface are you using?

bbbales2 · May 17, 2020, 12:56am

This is probably relevant: https://mc-stan.org/users/documentation/case-studies/boarding_school_case_study.html . Not the exact model, but something similar.

Ilia · May 17, 2020, 6:31pm

Thanks a lot for this. It surely does help. Now I understood how I should have organised the parameters. Now when I run the program however I encounter the following:

hash mismatch so recompiling; make sure Stan code ends with a blank line
no parameter params; sampling not done

at test = stan()and then

starting worker pid=3016 on localhost:11025 at 19:22:06.591
starting worker pid=3030 on localhost:11025 at 19:22:06.781
starting worker pid=3044 on localhost:11025 at 19:22:06.967
no parameter params; sampling not done
no parameter params; sampling not done
no parameter params; sampling not done
some chains had errors; consider specifying chains = 1 to debughere are whatever error messages were returned
[[1]]
Stan model 'SI_fit' does not contain samples.

[[2]]
Stan model 'SI_fit' does not contain samples.

[[3]]
Stan model 'SI_fit' does not contain samples.

at mod = stan(). I wonder how I many address this?

As for the RStudio I’m using version 1.2.5033, and my R is version 3.6.3 and my rstan is version 2.19.3. I am running this on MacOs Catalina unfortunately which I know has issues. I don’t get the errors as you do for me it only shows connection error which I shared. I tried to upgrade my R to version 4 before yet after I have had problems installing rstan correctly. So I had to downgrade to what I am using now to at least be able to run the code.

Ilia · May 17, 2020, 6:31pm

Thanks a lot I must check this.

maxbiostat · May 17, 2020, 6:38pm

Start by specifying nchains=1 as instructed in the error message. This should give you more informative errors.

Ilia · May 18, 2020, 4:38am

So my latest update is as follow:

I am following the link shared by @bbbales2. My data contains daily number of confirmed cases of covid-19 in the UK. My stan code is as follow:

 functions {
  
  real[] sir(real t,
            real[] y,
            real[] theta,
            real[] x_r,
            int[] x_i) {
      
      // I renamed my ODEs variables here: 
      real S = y[1];
      real E = y[2];
      real G = y[3];
      real P = y[4];
      real I = y[5];
      real A = y[6];
      real B = y[7];
      real C = y[8];
      real N = x_i[1];
      
      // Here are the list of my parameters
      real phi = theta[1];
      real pe = theta[2];
      real pie = theta[3];
      real pee = theta[4]; 
      real beta = theta[5];
      real mu = theta[6]; 
      real betam = theta[7]; 
      real tau = theta[8]; 
      real alpha = theta[9]; 
      real Ts = theta[10]; 
      real Ta = theta[11]; 
      real Tm = theta[12]; 
      real Ti = theta[13]; 

     // here are the ODEs. I notice that in the link they define them as real dS_dt rather than dydt[1] 
      real dS_dt = - mu * beta * S * A - beta * S * I - beta * C * S - betam * B * S;
      real dE_dt = mu * beta * S * A + (1 - phi) * beta * S * I + (1 - phi) * beta * C * S + (1 - phi) * betam * B * S - E / tau;
      real dG_dt = pe * phi * beta * S * I + pie * phi * beta * C * S + pee * phi * betam * B * S - G / tau;
      real dP_dt = (1 - pe) * phi * beta * S * I + (1 - pie) * phi * beta * C * S + (1 - pee) * phi * betam * B * S - P / tau;
      real dI_dt = alpha * E / tau - I / Ts;
      real dA_dt = (1 - alpha) * E / tau - A / Ta;
      real dB_dt = G / tau - B / Tm;
      real dC_dt = P / tau - C / Ti;  
      
      return {dS_dt, dE_dt, dG_dt, dP_dt, dI_dt, dA_dt, dB_dt, dC_dt};
    }
  
}

data {
  int<lower=1> n_days;
  real y0[8];
  real t0;
  real ts[n_days];
  int N;
  int cases[n_days];
  
}

transformed data {
  real x_r[0];
  int x_i[1] = { N };
}

parameters {
 real<lower = 0, upper = 1> phi;
 real<lower = 0, upper = 1> pe;
 real<lower = 0, upper = 1> pie;
 real<lower = 0, upper = 1> pee;
 real<lower = 0> beta;
 real<lower = 0> mu;
 real<lower = 0>betam;
 real<lower = 0> tau;
 real<lower = 0> alpha;
 real<lower = 0> Ts;
 real<lower = 0> Ta;
 real<lower = 0> Tm;
 real<lower = 0> Ti;
}

transformed parameters{
  real y_hat[n_days, 8];
  real<lower=0, upper=1> theta1[4];
  real<lower = 0> theta2[9]; 
  
  y_hat = integrate_ode_rk45(sir, y0, t0, ts, {phi, pe, pie, pee, beta, mu, betam, tau, alpha, Ts, Ta, Tm, Ti}, x_r, x_i);
  for(i in 1:n_days){
    for(j in 1:8){
       y_hat[i, j] = 1e-6 + (1-2e-6)*y_hat[i, j];
    }
  }
  
}

model {
  phi ~ normal(0, 1);  //phi
  pe ~ normal(0, 1); //pe
  pie ~ normal(0, 1); //pie
  pee ~ normal(0, 1); //pee
  
  beta ~ normal(0.5, 0.9); //beta
  mu ~ normal(0.2, 0.55); //mu
  betam ~ normal(0, 1);  //beta_m
  tau ~ normal(1, 20); //tau
  alpha ~ normal(0.1, 0.99); //alpha
  Ts ~ normal(3, 20); //Ts
  Ta ~ normal(1, 20); //Ta
  Tm ~ normal(1, 20); //Tm
  Ti ~ normal(0.5, 8); //Ti
  
  cases ~ binomial(col(to_matrix(y_hat, 5))); //y_hat[,5] are the fractions infected from the ODE solver
  
}

generated quantities {
  real R_0; 
  R_0 = beta * Ta * (alpha * Ts * (1 - phi) / Ta + mu * (1 - alpha));
  
}

But I can not run this again, as it says:

Error in close.connection(zz) : cannot close 'message' sink connection

when compiling the stan file. My R code is here along with other files. I wonder what am I doing wrong? any help is much appreciated. Could this be a problem at model block? I am defining the likelihood as binomial for infections which is the 5th ODE.

I also wonder why am I getting error as above only? while others seem to get pinpointed on where the error is happening?

mn.stan (3.0 KB) rcode1.R (982 Bytes) xyzz.csv (1.8 KB)

bbbales2 · May 18, 2020, 11:35am

There seems to be a syntax error on this line:

cases ~ binomial(col(to_matrix(y_hat, 5)));

I changed the likelihood to a normal (just as a placeholder to get things to compile), and then I got these errors when I ran the model:

[1] "Error in sampler$call_sampler(args_list[[i]]) : "                                                
[2] "  Exception: Max number of iterations exceeded (1000000).  (in 'modelda7739d7107_mn' at line 87)"
error occurred during calling the sampler; sampling not done
here are whatever error messages were returned

Which indicate that the sampler wasn’t able to initialize itself cause the ODE solver failed. Are you getting these errors as well?

Ilia · May 19, 2020, 5:31am

Thank you @bbbales2. So I guess I know where the error could happen. Let me go through it step by step. First unfortunately the only error I get is the
Error in close.connection(zz) : cannot close 'message' sink connection
and I assume this is because I have Catalina and R version 3.6.3 and when I try to update this to R version 4 I encounter many other issues, to begin on the updated version I can not get rstan running. This is why unfortunately I am using the older version.

Having said that if you look at the link you’ve shared the way they define the likelihood is as follow:

cases ~ neg_binomial_2(col(to_matrix(y), 2), phi);

this is negative binomial. Yet what’s important there is a parameter phi involved which I assume it takes the noise into account. I didn’t have that and I am not sure if I must have this in defining likelihood.

Secondly I think this is equivalent of my first stan code which got corrected by @maxbiostat, in there we have:

y ~ binomial(n_sample, y_hat[,5]);

again there is a parameter n_sample which was defined in data block.

In the first version of stan code corrected by @maxbiostat I was running:

test = stan("SI_fit.stan",
            data = stan_d,
            pars = params_monitor,
            chains = 1, iter = 10)

test = stan("SI_fit.stan",
            data = stan_d,
            pars = params_monitor,
            chains = 1, iter = 10)

 mod = stan(fit = test,
           data = stan_d,
           pars = params_monitor,
           chains = 3,
           warmup = 500,
           iter = 1000,
           control = list(adapt_delta = 0.99))

on R and I was getting the error:

no parameter params; sampling not done
Stan model 'SI_fit' does not contain samples.

So I guess the issue boils down to n_sample or adding a parameter for the noise. I am not sure how to do this right, any suggestions on this would be kind? Note that in above I defined n_sample as the UK population size and it did not work

Ilia · May 19, 2020, 6:05am

Also note that there is a typo, following the link you’ve shared it should be:

cases ~ binomial(col(to_matrix(y_hat), 5));

bbbales2 · May 19, 2020, 1:47pm

Oof, yeah, I think this is worth a separate thread. I do not know the Catalina/R issues (Linux here).

Usually the question with binary data is is your data the outcomes of N trials? Or is it the count of something occurring at some rate?

That sorta leads the discussion of what likelihood we’ll pick. The first we’ll usually go bernoulli/binomial. The second looks like a poisson, but usually when you have a poisson you can do better by just fitting a negative binomial (because reality wasn’t actually exactly poisson or whatever).

Ilia · May 19, 2020, 4:14pm

Thank you @bbbales2 for your reply. So my data is a vector as follow:

0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 5, 0, 1, 0, \
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 2, 5, 3, 13, 4, 11, 34, 30, \
48, 43, 67, 48, 61, 74, 0, 342, 342, 0, 403, 407, 676, 63, 1294, \
1035, 665, 967, 1427, 1452, 2129, 2885, 2546, 2433, 2619, 3009, 4324, \
4244, 4450, 3735, 5903, 3802, 3634, 5491, 4344, 8681, 5233, 5288, \
4342, 5252, 4603, 4617, 5599, 5525, 5850, 4676, 4301, 4451, 4583, \
5386, 4913, 4463, 4309, 3996, 4076, 6032, 6201, 4806, 4339, 3985, \
4406, 6111, 5614, 4649, 3896, 3923, 3877, 3403, 3242, 3446, 3560, 3450

each entry is daily number of infected people to Covid-19. So there is no trial per day as each day has one number. This would be bell shaped which can be described by binomial or poission indeed. So I imagine one would not need n_sample is that so?

and to make it complete, this data is the number of infections as I mentioned and so the 5th ODE is the one we need to fit them into.

bbbales2 · May 20, 2020, 10:42am

The bell shape here is on the time axis.

The likelihood is for describing each outcome. So you have a model for the number of underlying infections (I[t]) and you have you measurements (y[t]).

Your outcomes aren’t trials, so you might say that you expect your measurements to be Poisson distributed about I[t]:

y[t] ~ poisson(I[t]);

Usually this model doesn’t fit super well. Like, if you take your data, fit it, and then check the fit vs. the data like in the case study it won’t be great.

So then move to a negative binomial. In that case you also need to fit an overdispersion parameter. That parameter might seem a bit weird, so just generate data in R to get a handle on what it does:

Here’s the R docs and the stan docs.

I think rnbinom(n, size = phi, mu = mu) is the parameterization you want to convert from Stan parameters (mu and phi) to R parameters (mu and size). Might want to double check that though.

Have a look at the references in the case study and also the references Riou 2020 and Flaxman 2020. Look at how they do things. As I understand it there’s a lot going on in covid-19 data analysis that goes beyond picking a likelihood and an SEIR model – but you can probably get a better sense of what’s going on by looking at those papers and seeing how they do this.

Ilia · May 22, 2020, 7:00pm

Thank you @bbbales2 for your guidance. My issue at the moment is that I can not get the stan run for basic binomial. Let me go through it part by part so maybe you or someone could help me:

First of all I have my function, there are 8 ODEs and the 5th one is the infection which I want to fit my data into. For this I have:

functions {
  
  real[] SI(real t,
            real[] y,
            real[] params,
            real[] x_r,
            int[] x_i) {
      
      real dydt[8];

      // Variables of the ODEs

      real S = y[1];
      real E = y[2];
      real G = y[3];
      real P = y[4];
      real I = y[5];
      real A = y[6];
      real B = y[7];
      real C = y[8];
      
      // The parameters of the ODEs
      real phi = params[1];
      real pe = params[2];
      real pie = params[3];
      real pee = params[4]; 
  
      real beta = params[5];
      real mu = params[6]; 
      real betam = params[7]; 
      real tau = params[8]; 
      real alpha = params[9]; 
      real Ts = params[10]; 
      real Ta = params[11]; 
      real Tm = params[12]; 
      real Ti = params[13]; 
      
      dydt[1] = - mu * beta * S * A - beta * S * I - beta * C * S - betam * B * S;
      dydt[2] = mu * beta * S * A + (1 - phi) * beta * S * I + (1 - phi) * beta * C * S + (1 - phi) * betam * B * S - E / tau;
      dydt[3] = pe * phi * beta * S * I + pie * phi * beta * C * S + pee * phi * betam * B * S - G / tau;
      dydt[4] = (1 - pe) * phi * beta * S * I + (1 - pie) * phi * beta * C * S + (1 - pee) * phi * betam * B * S - P / tau;
      dydt[5] = alpha * E / tau - I / Ts;
      dydt[6] = (1 - alpha) * E / tau - A / Ta;
      dydt[7] = G / tau - B / Tm;
      dydt[8] = P / tau - C / Ti;
      return dydt;
    }

In this block I have defined 13 parameters of the ODEs with first 4 being probabilities. Then the data block is:

data {
  int<lower = 1> n_obs; // Number of days sampled
  int<lower = 1> n_params; // Number of model parameters
  int<lower = 1> n_difeq; // Number of differential equations in the system
  int y[n_obs]; // The binomially distributed data
  real t0; // Initial time point (zero)
  real ts[n_obs]; // Time points that were sampled
}

I have labelled them so one can understand what is what easily. My transformed data is the usual:

transformed data {
  real x_r[0];
  int x_i[0];
}

then I have my parameters:

parameters {
 real<lower = 0, upper = 1> phi;
 real<lower = 0, upper = 1> pe;
 real<lower = 0, upper = 1> pie;
 real<lower = 0, upper = 1> pee;
 real<lower = 0> beta;
 real<lower = 0> mu;
 real<lower = 0>betam;
 real<lower = 0> tau;
 real<lower = 0> alpha;
 real<lower = 0> Ts;
 real<lower = 0> Ta;
 real<lower = 0> Tm;
 real<lower = 0> Ti;
 real<lower = 0, upper = 1> S0; // Initial fraction of susceptible
}

In this block you can see that S0 is between 0 and 1, this is because I am assuming that total population is equal to 1. So at the end my data plot would show the “portion of people who are infected”. I then have my transformed data :

transformed parameters{
  real y_hat[n_obs, n_difeq]; // Output from the ODE solver
  real y0[n_difeq]; // Initial conditions
  real<lower=0, upper=1> params1[4];
  real<lower = 0> params2[9]; 

  y0[1] = 1 - 0.02 - 0.001;
  y0[2] = 0.02;
  y0[3] = 0;
  y0[4] = 0;
  y0[5] = 0.001;
  y0[6] = 0;
  y0[7] = 0;
  y0[8] = 0;
  y_hat = integrate_ode_rk45(SI, y0, t0, ts, {phi, pe, pie, pee, beta, mu, betam, tau, alpha, Ts, Ta, Tm, Ti}, x_r, x_i);
    }
  }
  
}

As you can see from the initial conditions they are all between 0 and 1 and susceptible individuals are about 0.98 of the population (which is 1). Another note is that I have real<lower=0, upper=1> params1[4]; for the probabilities and real<lower = 0> params2[9]; for the rest of the parameters. The integrator then keeps track of list of the parameters (this is a neat way suggested by @maxbiostat ).

My model finally is:

model {
  phi ~ normal(0, 1);  //phi
  pe ~ normal(0, 1); //pe
  pie ~ normal(0, 1); //pie
  pee ~ normal(0, 1); //pee
  
  beta ~ normal(0.5, 0.9); //beta
  mu ~ normal(0.2, 0.55); //mu
  betam ~ normal(0, 1);  //beta_m
  tau ~ normal(1, 20); //tau
  alpha ~ normal(0.1, 0.99); //alpha
  Ts ~ normal(3, 20); //Ts
  Ta ~ normal(1, 20); //Ta
  Tm ~ normal(1, 20); //Tm
  Ti ~ normal(0.5, 8); //Ti
  S0 ~ normal(0.5, 0.5); //constrained to be 0-1.
  
  y ~ binomial(y_hat[,5]); //y_hat[,5] are the fractions infected from the ODE solver
  
}

As you can see I defined y ~ binomial(y_hat[,5]) where y_hat[,5] are the infected solution of the ODEs. As you mentioned this might not be accurate in the fit, but I am just first trying to get it run, hence I am sharing my method with you.

Finally, on the stan side, I have the following generated quantity:

generated quantities {
  real R_0; 
  R_0 = beta * Ta * (alpha * Ts * (1 - phi) / Ta + mu * (1 - alpha));

On the R side I have the list of the variables needed to run the stan:

stan_d = list(n_obs = length(datalist$cases),
              n_params = 13,
              n_difeq = 8,
              y = datalist$cases/66500000,
              t0 = 0,
              ts = sample_time)

where n_obs is the number of dates or in my case the length of the cases. n_params is the number of parameters of the ODEs, n_difeq is the number of ODEs, y is the number of infected normalised by population so they would be between 0 and 1. and finally ts is the following:

sample_time = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53,
54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70,
71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87,
88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103,
104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116)

I then run the following and test would fail,

# Which parameters to monitor in the model:
params_monitor = c("y_hat", "y0", "params", "R_0")

# Test / debug the model:
test = stan("SI_fit.stan",
            data = stan_d,
            pars = params_monitor,
            chains = 1, iter = 10)

I am running this code on Rstudio and my full R code looks as follow (this includes the stan part):

# importing the libraries
library(tidyverse)
library(deSolve)
library(dplyr)
library(ggplot2)
library(rstan)
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
datalist <- read.csv(file = 'xyzz.csv')


# plotting the data
ggplot(data = datalist) + 
  geom_point(mapping = aes(x = date, y = cases)) + 
  labs(y = "Number of daily cases")



sample_time = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
                20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
                37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53,
                54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70,
                71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87,
                88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103,
                104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116)




# The Stan model statement:
cat(
  '
functions {
  
  real[] SI(real t,
            real[] y,
            real[] params,
            real[] x_r,
            int[] x_i) {
      
      real dydt[8];

      // Variables of the ODEs

      real S = y[1];
      real E = y[2];
      real G = y[3];
      real P = y[4];
      real I = y[5];
      real A = y[6];
      real B = y[7];
      real C = y[8];
      
      // The parameters of the ODEs
      real phi = params[1];
      real pe = params[2];
      real pie = params[3];
      real pee = params[4]; 
  
      real beta = params[5];
      real mu = params[6]; 
      real betam = params[7]; 
      real tau = params[8]; 
      real alpha = params[9]; 
      real Ts = params[10]; 
      real Ta = params[11]; 
      real Tm = params[12]; 
      real Ti = params[13]; 
      
      dydt[1] = - mu * beta * S * A - beta * S * I - beta * C * S - betam * B * S;
      dydt[2] = mu * beta * S * A + (1 - phi) * beta * S * I + (1 - phi) * beta * C * S + (1 - phi) * betam * B * S - E / tau;
      dydt[3] = pe * phi * beta * S * I + pie * phi * beta * C * S + pee * phi * betam * B * S - G / tau;
      dydt[4] = (1 - pe) * phi * beta * S * I + (1 - pie) * phi * beta * C * S + (1 - pee) * phi * betam * B * S - P / tau;
      dydt[5] = alpha * E / tau - I / Ts;
      dydt[6] = (1 - alpha) * E / tau - A / Ta;
      dydt[7] = G / tau - B / Tm;
      dydt[8] = P / tau - C / Ti;
      
      return dydt;
    }
  
}

data {
  int<lower = 1> n_obs; // Number of days sampled
  int<lower = 1> n_params; // Number of model parameters
  int<lower = 1> n_difeq; // Number of differential equations in the system 
  int y[n_obs]; // The binomially distributed data
  real t0; // Initial time point (zero)
  real ts[n_obs]; // Time points that were sampled
  
}

transformed data {
  real x_r[0];
  int x_i[0];
}

parameters {
 real<lower = 0, upper = 1> phi;
 real<lower = 0, upper = 1> pe;
 real<lower = 0, upper = 1> pie;
 real<lower = 0, upper = 1> pee;
 real<lower = 0> beta;
 real<lower = 0> mu;
 real<lower = 0>betam;
 real<lower = 0> tau;
 real<lower = 0> alpha;
 real<lower = 0> Ts;
 real<lower = 0> Ta;
 real<lower = 0> Tm;
 real<lower = 0> Ti;
 real<lower = 0, upper = 1> S0; // Initial fraction of susceptible
}

transformed parameters{
  real y_hat[n_obs, n_difeq]; // Output from the ODE solver
  real y0[n_difeq]; // Initial conditions
  real<lower=0, upper=1> params1[4];
  real<lower = 0> params2[9]; 

  y0[1] = 1 - 0.02 - 0.001;
  y0[2] = 0.02;
  y0[3] = 0;
  y0[4] = 0;
  y0[5] = 0.001;
  y0[6] = 0;
  y0[7] = 0;
  y0[8] = 0;
  
  y_hat = integrate_ode_rk45(SI, y0, t0, ts, {phi, pe, pie, pee, beta, mu, betam, tau, alpha, Ts, Ta, Tm, Ti}, x_r, x_i);
  //for(i in 1:n_obs){
    //for(j in 1:n_difeq){
       //y_hat[i, j] = 1e-6 + (1-2e-6)*y_hat[i, j];
    //}
 // }
  
}

model {
  phi ~ normal(0, 1);  //phi
  pe ~ normal(0, 1); //pe
  pie ~ normal(0, 1); //pie
  pee ~ normal(0, 1); //pee
  
  beta ~ normal(0.5, 0.9); //beta
  mu ~ normal(0.2, 0.55); //mu
  betam ~ normal(0, 1);  //beta_m
  tau ~ normal(1, 20); //tau
  alpha ~ normal(0.1, 0.99); //alpha
  Ts ~ normal(3, 20); //Ts
  Ta ~ normal(1, 20); //Ta
  Tm ~ normal(1, 20); //Tm
  Ti ~ normal(0.5, 8); //Ti
  S0 ~ normal(0.5, 0.5); //constrained to be 0-1.
  
  y ~ binomial(y_hat[,5]); //y_hat[,5] are the fractions infected from the ODE solver
  
}

generated quantities {
  real R_0; 
  R_0 = beta * Ta * (alpha * Ts * (1 - phi) / Ta + mu * (1 - alpha));
  
}

', 
file = "SI_fit.stan", sep="", fill=T)

# FITTING

# For stan model we need the following variables:

stan_d = list(n_obs = length(datalist$cases),
              n_params = 13,
              n_difeq = 8,
              y = datalist$cases/66500000,
              t0 = 0,
              ts = sample_time)

# Which parameters to monitor in the model:
params_monitor = c("y_hat", "y0", "params", "R_0")

# Test / debug the model:
test = stan("SI_fit.stan",
            data = stan_d,
            pars = params_monitor,
            chains = 1, iter = 10)

# Fit and sample from the posterior
mod = stan(fit = test,
           data = stan_d,
           pars = params_monitor,
           chains = 3,
           warmup = 500,
           iter = 1000,
           control = list(adapt_delta = 0.99))

# Extract the posterior samples to a structured list:
posts <- extract(mod)

I am adding the data file that I am using as well in here:
xyzz.csv (1.8 KB)
this is used in datalist <- read.csv(file = 'xyzz.csv'). Through the discussion that we had and as much as I could perfectly look into other models I think my stan code looks ok but I could have missed something with my newbie eyes or I am not sure if I am making any mistake on the R side where I define the stan_d = list() which are the required parameters for stan. It’s also noteworthy that I am using this reference for my code and indeed I can run the code perfectly fine from the link but I can not run mine unfortunately. I’ll be grateful if I eventually can understand where is my mistake happening.

bbbales2 · May 22, 2020, 7:31pm

With the model you wrote, poisson is probably the simplest likelihood, so:

y ~ poisson(y_hat[,5]);

Would be the thing to try.

If you really want to use binomial, you’re missing an argument, and the units of y_hat[, 5] (number of infections), don’t make sense.

A binomial takes two parameters. n number of trials and p probability of successes, and it describes the distribution of the number of successes in those n trials.

Ilia · May 22, 2020, 7:52pm

Thank you @bbbales2. So, for the time being, I changed the likelihood to Poisson and when I run it I get the following error:

Error in new_CppObject_xp(fields$.module, fields$.pointer, ...) : 
  Exception: int variable contained non-int values; processing stage=data initialization; variable name=y; base type=int  (in 'model70544a99286_SI_fit' at line 62)
 [origin: unknown original type]
failed to create the sampler; sampling not done
Error in new_CppObject_xp(fields$.module, fields$.pointer, ...) : 
  Exception: int variable contained non-int values; processing stage=data initialization; variable name=y; base type=int  (in 'model70544a99286_SI_fit' at line 62)
 [origin: unknown original type]
failed to create the sampler; sampling not done
Stan model 'SI_fit' does not contain samples.

I checked this and I came across this post, I wonder if this error happens due to the way I defined t0 and ts in data block:

  real t0; // Initial time point (zero)
  real ts[n_obs]; // Time points that were sampled

or the way I defined sample_time namely:

sample_time = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53,
54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70,
71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87,
88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103,
104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116)

which I used for running the stan code in:

stan_d = list(n_obs = length(datalist$cases),
              n_params = 13,
              n_difeq = 8,
              y = datalist$cases/66500000,
              t0 = 0,
              ts = sample_time)

I would appreciate your guidance.

bbbales2 · May 22, 2020, 8:01pm

So the problem is with y. It is declared as int in the model, but apparently some of the input values are not ints.

Ilia · May 22, 2020, 8:21pm

Indeed so in data block I have rightly so:

int y[n_obs];

and in

stan_d = list(n_obs = length(datalist$cases),
              n_params = 13,
              n_difeq = 8,
              y = datalist$cases/66500000,
              t0 = 0,
              ts = sample_time)

I have checked my y using:

typeof(datalist$cases/66500000)

for which I get double which is double precision. Indeed division by 66.5 million makes the numbers very small and close to 0 so when I do as.integer(datalist$cases/66500000) I have list of 0s. Is there a way to overcome this on stan side?

Topic		Replies	Views
SEIR model fitting Modeling rstan , fitting-issues , epidemiology	10	2210	November 5, 2020
Fitting the model based on C'(t) data Modeling rstan	2	232	February 21, 2024
Fitting seir in r using stan Modeling fitting-issues , epidemiology	5	1383	February 18, 2022
Subject: Fitting and Predicting Influenza Data with SEEIIR Model in Stan Modeling rstan	4	89	October 18, 2024
A matrix of parameters in Stan Integral Modeling fitting-issues	41	1569	November 25, 2019

Estimating parameters of a SEIR model

Related topics