How to extract Logistic regression predicted values for accuracy assessment

I have fitted a Gaussian Process Logistic Regression but I am not sure how to extract the predicted values out of the Stan object to be compared to real values needed for evaluating the accuracy of the model. Here is the full code:

rm(list = ls())
library(rstan)
library(dplyr)
library(ggplot2)
library(boot)

rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())

#Simulating small data
T <- 40
set.seed(123)
x_1 <- sort(runif(T, 0, 10)) #Predictor 1
x_2 <- sort(runif(T, 0, 10)) #Predictor 2
alpha <- 1
beta_1 <- 0.2
beta_2 <- -0.5
logit_p <- alpha + beta_1 * x_1 + beta_2 * x_2
p <- inv.logit(logit_p)
y <- rbinom(T, 1, p)

xx <- cbind(x_1, x_2) #Predictors matrix


model_code <- "
data {
  int<lower=1> N1;
  int<lower=1> D;
  vector[D] x1[N1];
  int<lower=0, upper=1> z1[N1];
  int<lower=1> N2;
  vector[D] x2[N2];
}
transformed data {
  real delta = 1e-9;
  int<lower=1> N = N1 + N2;
  vector[D] x[N];
  for (n1 in 1:N1) x[n1] = x1[n1];
  for (n2 in 1:N2) x[N1 + n2] = x2[n2];
}
parameters {
  real<lower=0> rho;
  real<lower=0> alpha;
  real a;
  vector[N] eta;
}
transformed parameters {
  vector[N] f;
  {
    matrix[N, N] L_K;
    matrix[N, N] K = cov_exp_quad(x, alpha, rho);

    // diagonal elements
    for (n in 1:N)
      K[n, n] = K[n, n] + delta;

    L_K = cholesky_decompose(K);
    f = L_K * eta;
  }
}
model {
  rho ~ inv_gamma(5, 5);
  alpha ~ std_normal();
  a ~ std_normal();
  eta ~ std_normal();

  z1 ~ bernoulli_logit(a + f[1:N1]);
}
generated quantities {
  int z2[N2];
  for (n2 in 1:N2)
    z2[n2] = bernoulli_logit_rng(a + f[N1 + n2]);
}
"


model_data <- list(
  N1 = T, 
  N2 = T,
  D = 2,
  x1 = xx,
  x2 = xx,
  z1 = y
)

stan_fit <- stan(
  data = model_data,
  model_code = model_code
)

I need to compare z2 from the above model to be compared to y (real values). Can someone help me extract that from the stan_fit object?

1 Like

Hi,
yes - you can use as.array or as.matrix (Create array, matrix, or data.frame objects from samples in a <code>stanfit</code> object — as.array • rstan) that is builtin into rstan. The posterior package then provides some additional formats, that might be easier to work with via as_draws_matrix or as_draws_rvars.

Best of luck with your model!