 # How to extract Logistic regression predicted values for accuracy assessment

I have fitted a Gaussian Process Logistic Regression but I am not sure how to extract the predicted values out of the Stan object to be compared to real values needed for evaluating the accuracy of the model. Here is the full code:

``````rm(list = ls())
library(rstan)
library(dplyr)
library(ggplot2)
library(boot)

rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())

#Simulating small data
T <- 40
set.seed(123)
x_1 <- sort(runif(T, 0, 10)) #Predictor 1
x_2 <- sort(runif(T, 0, 10)) #Predictor 2
alpha <- 1
beta_1 <- 0.2
beta_2 <- -0.5
logit_p <- alpha + beta_1 * x_1 + beta_2 * x_2
p <- inv.logit(logit_p)
y <- rbinom(T, 1, p)

xx <- cbind(x_1, x_2) #Predictors matrix

model_code <- "
data {
int<lower=1> N1;
int<lower=1> D;
vector[D] x1[N1];
int<lower=0, upper=1> z1[N1];
int<lower=1> N2;
vector[D] x2[N2];
}
transformed data {
real delta = 1e-9;
int<lower=1> N = N1 + N2;
vector[D] x[N];
for (n1 in 1:N1) x[n1] = x1[n1];
for (n2 in 1:N2) x[N1 + n2] = x2[n2];
}
parameters {
real<lower=0> rho;
real<lower=0> alpha;
real a;
vector[N] eta;
}
transformed parameters {
vector[N] f;
{
matrix[N, N] L_K;
matrix[N, N] K = cov_exp_quad(x, alpha, rho);

// diagonal elements
for (n in 1:N)
K[n, n] = K[n, n] + delta;

L_K = cholesky_decompose(K);
f = L_K * eta;
}
}
model {
rho ~ inv_gamma(5, 5);
alpha ~ std_normal();
a ~ std_normal();
eta ~ std_normal();

z1 ~ bernoulli_logit(a + f[1:N1]);
}
generated quantities {
int z2[N2];
for (n2 in 1:N2)
z2[n2] = bernoulli_logit_rng(a + f[N1 + n2]);
}
"

model_data <- list(
N1 = T,
N2 = T,
D = 2,
x1 = xx,
x2 = xx,
z1 = y
)

stan_fit <- stan(
data = model_data,
model_code = model_code
)
``````

I need to compare z2 from the above model to be compared to y (real values). Can someone help me extract that from the stan_fit object?

1 Like

Hi,
yes - you can use `as.array` or `as.matrix` (Create array, matrix, or data.frame objects from samples in a <code>stanfit</code> object — as.array • rstan) that is builtin into `rstan`. The `posterior` package then provides some additional formats, that might be easier to work with via `as_draws_matrix` or `as_draws_rvars`.

Best of luck with your model!