Hello, everybody!
First of all, please forgive my language mistakes; I’ll try to explain my question as clearly as I can.
I’m trying to estimate the parameters of a categorical logistic regression and found Stan’s softmax regression.
I know how the theoretical model works and that it has non-identifiability problems, but I’m not concerned about that at this step; I just want to run the estimation to verify that the code works.
I believe I’m following the instructions about data and parameters given in the Stan manual, but for some reason I can’t understand, I get the following syntax error:
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
No matches for:
categorical_logit_glm_lpmf(int[ ], matrix, vector, matrix)
Function categorical_logit_glm_lpmf not found.
The reproducible code, with generated data, is below. I don’t mind if the estimates are not good; I’ll implement parameter restrictions later. First, I need to understand where I’m going wrong :/
Thanks!
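# Packages ----------------------------------------------------------------
library(magrittr)      # %>%
library(rstan)         # stan()
library(LaplacesDemon) # rcat(); assumed, the post does not say which package provides it
# Softmax -----------------------------------------------------------------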
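softmax <- function(par){
  n.par <- length(par)
  par1 <- sort(par, decreasing = TRUE)
  # Lk accumulates log(sum(exp(par))) one term at a time, without overflow
  Lk <- par1[1]
  for (k in seq_len(n.par - 1)) { # seq_len() also handles length(par) == 1
    Lk <- max(par1[k+1], Lk) + log1p(exp(-abs(par1[k+1] - Lk)))
  }
  val <- exp(par - Lk) # exp(par) / sum(exp(par))
  return(val)
}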
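As a sanity check on the function above, here is a minimal sketch of a more compact softmax that subtracts the maximum before exponentiating. The helper name `softmax_direct` is hypothetical and not part of the original post; it is only meant as something to compare `softmax()` against.

```r
# Hypothetical helper (not in the original post): softmax via max-subtraction.
softmax_direct <- function(par) {
  z <- par - max(par)   # shift so the largest exponent is exp(0) = 1
  exp(z) / sum(exp(z))  # normalise to probabilities
}

p <- softmax_direct(c(1, 2, 3))
print(sum(p))                        # should be 1 up to floating-point error
print(softmax_direct(c(1000, 1001))) # stays finite for large inputs
```

If `softmax()` is defined in the same session, `all.equal(softmax(x), softmax_direct(x))` on a few test vectors should agree.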
# Parameters and Data ------------------------------------------------------
set.seed(123) # Setting seed
K <- 4 # Possible outcomes
N <- 1000 # Sample size
D <- 2 # Dimension of X
X <- matrix(c(rnorm(N, 5, 7),     # X1
              rbinom(N, 1, .7)),  # X2
            ncol = D, byrow = FALSE) %>%
  `colnames<-`(c("X1", "X2")) # Covariates
alpha <- matrix(c(1, 1.2, 1.3, 1.2), ncol = 1) # Intercepts for categories
beta <- matrix(c(2.5, 0.4,  # Parameters for category 1
                 2.3, 2,    # Parameters for category 2
                 0.8, 1.5,  # Parameters for category 3
                 2.2, 2.7), # Parameters for category 4
               nrow = 2, byrow = FALSE)
eta <- (X %*% beta %>% sweep(., 2, alpha %>% t, "+")) %>% # sweep() adds the intercepts to each row of X %*% beta
  `colnames<-`(c("eta1", "eta2", "eta3", "eta4")) # Linear predictors for each observation (row of X)
pi <- eta %>%
  apply(., 1, softmax) %>%
  t() %>%
  `colnames<-`(c("pi1", "pi2", "pi3", "pi4")) # Probabilities for each set of independent variables
Y <- apply(pi, 1, rcat, n=1) # Categorical answers
Y %>% table %>% prop.table # Categories frequencies
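For reference, the simulation above appears to correspond to the following model, written with the same names as the script. Here $x_n$ denotes the $n$-th row of `X` and $\beta_k$ the $k$-th column of `beta`; this is a restatement of the R code, not a formula quoted from the Stan manual.

```latex
\eta_{nk} = \alpha_k + x_n^\top \beta_k, \qquad
\pi_{nk} = \frac{\exp(\eta_{nk})}{\sum_{j=1}^{K} \exp(\eta_{nj})}, \qquad
Y_n \sim \mathrm{Categorical}(\pi_{n1}, \dots, \pi_{nK}),
\qquad n = 1, \dots, N, \quad k = 1, \dots, K.
```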
stan_code <- "
data{
  int<lower = 1> N; // Sample size
  int<lower = 1> K; // Categories
  int<lower = 1> D; // Covariates dimension
  int Y[N];         // Answers
  matrix [N, D] X;  // Covariates
}
parameters{
  matrix [D, K] beta;
  vector [K] alpha;
}
model{
  target += categorical_logit_glm_lpmf(Y | X, alpha, beta);
}
"
stan_data <- list(N = N, K = K, D = D, Y = Y, X = X)
fit <- stan(model_code = stan_code, data = stan_data, iter = 500, chains = 1, control = list(max_treedepth = 10))
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
No matches for:
categorical_logit_glm_lpmf(int[ ], matrix, vector, matrix)
Function categorical_logit_glm_lpmf not found.
error in 'model1c107b9a2e6b_80b97f55ecfaaa509275cce0d7d3f05b' at line 17, column 58
-------------------------------------------------
15:
16: model{
17: target += categorical_logit_glm_lpmf(Y | X, alpha, beta);
^
-------------------------------------------------
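One way to narrow this down, offered as a sketch rather than a confirmed diagnosis: the parser reports that the function does not exist at all, so the Stan version bundled with the installed rstan might be older than the manual that documents `categorical_logit_glm`. The code below prints the installed versions and, as a fallback, writes what should be the same likelihood with `categorical_logit_lpmf` and an explicit linear predictor. The name `stan_code_alt` is hypothetical, and the equivalence of the two formulations is an assumption that has not been checked against the GLM version here.

```r
# Report the installed versions (skipped if rstan is not available).
if (requireNamespace("rstan", quietly = TRUE)) {
  print(utils::packageVersion("rstan")) # R package version
  print(rstan::stan_version())          # version of Stan bundled with rstan
}

# Fallback sketch: same data and parameters as the original model, with the
# linear predictor eta built by hand instead of inside the *_glm function.
stan_code_alt <- "
data{
  int<lower = 1> N; // Sample size
  int<lower = 1> K; // Categories
  int<lower = 1> D; // Covariates dimension
  int Y[N];         // Answers
  matrix [N, D] X;  // Covariates
}
parameters{
  matrix [D, K] beta;
  vector [K] alpha;
}
model{
  // N x K matrix: row n holds alpha' + X[n] * beta
  matrix[N, K] eta = X * beta + rep_matrix(alpha', N);
  for (n in 1:N)
    target += categorical_logit_lpmf(Y[n] | eta[n]');
}
"
```

The fallback can then be passed to `stan()` in place of `stan_code`, with the same `stan_data` list.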