Hello, everybody!
First of all, please forgive my language mistakes; I'll try to explain my question.

I’m trying to estimate the parameters of a categorical logistic regression and found Stan’s softmax regression.
I know how the theoretical model works and its non-identifiability problems, but I’m not concerned about that at this step; I just want to run the estimation to verify that the code works.
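
For reference, here is a sketch of the model and of the identifiability issue. The notation is assumed for illustration and is not taken from the manual: $x_n$ is row $n$ of $X$, and $\alpha_k$, $\beta_k$ are the intercept and the coefficient column of category $k$.

$$
\Pr(Y_n = k \mid x_n) = \frac{\exp(\alpha_k + x_n \beta_k)}{\sum_{j=1}^{K} \exp(\alpha_j + x_n \beta_j)}, \qquad k = 1, \dots, K.
$$

Adding the same constant $c$ to every intercept and the same vector $b$ to every coefficient column leaves these probabilities unchanged:

$$
\frac{\exp(\alpha_k + c + x_n(\beta_k + b))}{\sum_{j=1}^{K} \exp(\alpha_j + c + x_n(\beta_j + b))}
= \frac{\exp(c + x_n b)\,\exp(\alpha_k + x_n \beta_k)}{\exp(c + x_n b)\,\sum_{j=1}^{K} \exp(\alpha_j + x_n \beta_j)}
= \Pr(Y_n = k \mid x_n).
$$

So only differences between categories are identified, which is why restrictions (for example, fixing one category's parameters to zero) are usually added later.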

I believe I’m following the instructions for data and parameters given in the Stan manual, but for some reason I can’t understand, I get this syntax error:

``````
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
No matches for:

categorical_logit_glm_lpmf(int[ ], matrix, vector, matrix)
``````

The reproducible code, with generated data, is below. I don’t mind if the estimates are not good; I’ll implement parameter restrictions later. First, I need to understand where I’m going wrong :/

Thanks!

``````
# Packages ----------------------------------------------------------------

library(magrittr)   # %>%
library(extraDistr) # rcat(); assumed source, since rcat() is not in base R

# Softmax -----------------------------------------------------------------

# Numerically stable softmax: Lk accumulates log(sum(exp(par))) pairwise
softmax <- function(par) {
  n.par <- length(par)
  par1 <- sort(par, decreasing = TRUE)
  Lk <- par1[1]
  for (k in 1:(n.par - 1)) {
    Lk <- max(par1[k + 1], Lk) + log1p(exp(-abs(par1[k + 1] - Lk)))
  }
  val <- exp(par - Lk)
  return(val)
}

# Parameters and Data ------------------------------------------------------

set.seed(123) # Setting seed

K <- 4    # Possible outcomes
N <- 1000 # Sample size
D <- 2    # Dimension of X

X <- matrix(c(rnorm(N, 5, 7),     # X1
              rbinom(N, 1, .7)),  # X2
            ncol = D, byrow = FALSE) %>%
  `colnames<-`(c("X1", "X2")) # Covariates

alpha <- matrix(c(1, 1.2, 1.3, 1.2), ncol = 1) # Intercepts for categories

beta <- matrix(c(2.5, 0.4,  # Parameters for category 1
                 2.3, 2,    # Parameters for category 2
                 0.8, 1.5,  # Parameters for category 3
                 2.2, 2.7), # Parameters for category 4
               nrow = 2, byrow = FALSE)

eta <- (X %*% beta) %>%
  sweep(., 2, as.vector(alpha), "+") %>% # Sweep adds the intercepts to each row of X %*% beta
  `colnames<-`(c("eta1", "eta2", "eta3", "eta4")) # Linear predictors for each set of independent variables

pi <- eta %>%
  apply(., 1, softmax) %>%
  t() %>%
  `colnames<-`(c("pi1", "pi2", "pi3", "pi4")) # Probabilities for each set of independent variables

Y <- apply(pi, 1, rcat, n = 1) # Categorical answers

Y %>% table %>% prop.table # Category frequencies
``````

``````
library(rstan)

stan_code <- "

data{
  int<lower = 1> N; // Sample size
  int<lower = 1> K; // Categories
  int<lower = 1> D; // Covariates dimension
  int<lower = 1, upper = K> Y[N]; // Outcomes
  matrix [N, D] X; // Covariates
}

parameters{
  matrix [D, K] beta;
  vector [K] alpha;
}

model{
  target += categorical_logit_glm_lpmf(Y | X, alpha, beta);
}

"

stan_data <- list(N = N, K = K, D = D, Y = Y, X = X)

fit <- stan(model_code = stan_code, data = stan_data, iter = 500, chains = 1, control = list(max_treedepth = 10))

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
No matches for:

categorical_logit_glm_lpmf(int[ ], matrix, vector, matrix)

error in 'model1c107b9a2e6b_80b97f55ecfaaa509275cce0d7d3f05b' at line 17, column 58
-------------------------------------------------
15:
16: model{
17:   target += categorical_logit_glm_lpmf(Y | X, alpha, beta);
^
-------------------------------------------------
``````
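
A "No matches for" parser error like this one can mean that the installed Stan does not provide that function signature. If that is the cause, one possible workaround is to write the same likelihood with the older `categorical_logit_lpmf`, one observation at a time. This is only a sketch: it reuses the names from the model above, keeps the older array syntax (`int Y[N]`), and the local variable `eta` is introduced here for illustration. It has not been run against this exact data.

``````
data {
  int<lower = 1> N;               // Sample size
  int<lower = 1> K;               // Categories
  int<lower = 1> D;               // Covariates dimension
  int<lower = 1, upper = K> Y[N]; // Outcomes
  matrix[N, D] X;                 // Covariates
}

parameters {
  matrix[D, K] beta;
  vector[K] alpha;
}

model {
  // N x K matrix of linear predictors, without intercepts
  matrix[N, K] eta = X * beta;

  // Row n of eta is a row_vector, so transpose it before adding alpha
  for (n in 1:N)
    target += categorical_logit_lpmf(Y[n] | alpha + eta[n]');
}
``````

This should define the same posterior as the GLM call, but it is typically slower because the likelihood is evaluated one observation at a time.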

Hi Nicholas,

Unfortunately, `categorical_logit_glm_lpmf` isn’t available in the version of `rstan` currently on CRAN (it’s a few Stan versions behind). Try installing the preview of the next version of `rstan`, which should have this function available:

``````remove.packages(c("StanHeaders", "rstan"))