How to extract Logistic regression predicted values for accuracy assessment

Aminsn · September 18, 2021, 10:08pm

I have fitted a Gaussian process logistic regression, but I am not sure how to extract the predicted values from the Stan fit object so that I can compare them to the observed values and evaluate the model's accuracy. Here is the full code:

rm(list = ls())
library(rstan)
library(dplyr)
library(ggplot2)
library(boot)

rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())

#Simulating small data
T <- 40
set.seed(123)
x_1 <- sort(runif(T, 0, 10)) #Predictor 1
x_2 <- sort(runif(T, 0, 10)) #Predictor 2
alpha <- 1
beta_1 <- 0.2
beta_2 <- -0.5
logit_p <- alpha + beta_1 * x_1 + beta_2 * x_2
p <- inv.logit(logit_p)
y <- rbinom(T, 1, p)

xx <- cbind(x_1, x_2) #Predictors matrix


model_code <- "
data {
  int<lower=1> N1;
  int<lower=1> D;
  vector[D] x1[N1];
  int<lower=0, upper=1> z1[N1];
  int<lower=1> N2;
  vector[D] x2[N2];
}
transformed data {
  real delta = 1e-9;
  int<lower=1> N = N1 + N2;
  vector[D] x[N];
  for (n1 in 1:N1) x[n1] = x1[n1];
  for (n2 in 1:N2) x[N1 + n2] = x2[n2];
}
parameters {
  real<lower=0> rho;
  real<lower=0> alpha;
  real a;
  vector[N] eta;
}
transformed parameters {
  vector[N] f;
  {
    matrix[N, N] L_K;
    matrix[N, N] K = cov_exp_quad(x, alpha, rho);

    // diagonal elements
    for (n in 1:N)
      K[n, n] = K[n, n] + delta;

    L_K = cholesky_decompose(K);
    f = L_K * eta;
  }
}
model {
  rho ~ inv_gamma(5, 5);
  alpha ~ std_normal();
  a ~ std_normal();
  eta ~ std_normal();

  z1 ~ bernoulli_logit(a + f[1:N1]);
}
generated quantities {
  int z2[N2];
  for (n2 in 1:N2)
    z2[n2] = bernoulli_logit_rng(a + f[N1 + n2]);
}
"


model_data <- list(
  N1 = T, 
  N2 = T,
  D = 2,
  x1 = xx,
  x2 = xx,
  z1 = y
)

stan_fit <- stan(
  data = model_data,
  model_code = model_code
)

I need to compare z2 from the above model to y (the real values). Can someone help me extract it from the stan_fit object?

martinmodrak · September 24, 2021, 6:24pm

Hi,
Yes, you can use as.array or as.matrix, which are built into rstan (see the rstan documentation page "Create array, matrix, or data.frame objects from samples in a stanfit object"). The posterior package provides some additional formats that might be easier to work with, via as_draws_matrix or as_draws_rvars.
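As a rough sketch of what that could look like for the model in the question: as.matrix(stan_fit, pars = "z2") should return a matrix with one row per posterior draw and one column per element of z2. Because the question passes x2 = xx (the same inputs as the training data), column n of that matrix lines up with y[n]. The snippet below simulates a stand-in for that matrix so it runs without Stan; the names z2_draws, y_obs, p_hat and the 0.5 cutoff are illustrative assumptions, not anything prescribed by rstan.

```r
# With a real fit, the draws would come from rstan, for example:
#   z2_draws <- as.matrix(stan_fit, pars = "z2")   # draws x N2
#   y_obs    <- y
# Below, a simulated stand-in is used so the sketch runs on its own.
set.seed(1)
n_draws <- 4000                       # assumed number of posterior draws
n_obs   <- 40                         # same as N2 in the question
y_obs   <- rbinom(n_obs, 1, 0.5)      # stand-in for the observed y
p_true  <- ifelse(y_obs == 1, 0.8, 0.2)
# matrix() fills column by column, so column n uses p_true[n]
z2_draws <- matrix(
  rbinom(n_draws * n_obs, 1, rep(p_true, each = n_draws)),
  nrow = n_draws, ncol = n_obs
)

# Posterior predictive probability that z2[n] = 1, one value per observation
p_hat <- colMeans(z2_draws)

# Hard classification; the 0.5 threshold is an assumption
y_hat <- as.integer(p_hat > 0.5)

# Point estimate of accuracy and a confusion matrix
accuracy <- mean(y_hat == y_obs)
conf_mat <- table(predicted = y_hat, observed = y_obs)

# Accuracy computed within each draw, to see the posterior uncertainty
acc_per_draw <- rowMeans(sweep(z2_draws, 2, y_obs, "=="))
acc_interval <- quantile(acc_per_draw, c(0.05, 0.5, 0.95))

print(accuracy)
print(conf_mat)
print(acc_interval)
```

Averaging the draws first (p_hat) gives a single predicted probability per observation, while the per-draw version keeps the uncertainty in the accuracy itself. Which one is more useful depends on what the accuracy assessment is for.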

Best of luck with your model!
