# How to link mixing proportion (theta) with a covariate using a link function?

I have a mixture model with two components: a von Mises-Fisher distribution and, for noise, a uniform distribution on the sphere (density $\frac{1}{4\pi}$). In the model below, theta defines the mixing proportions of these two components.

functions {
  real vmf_lpdf(row_vector y, row_vector mu, real kappa);
}
data {
  int<lower=0> N;   // number of observations
  vector[N] X;      // covariate (column vector of length N)
  matrix[N, 3] y;   // response: one 3-D unit vector per row
}
parameters {
  simplex[2] theta;        // mixing proportions: [vMF, uniform noise]
  unit_vector[3] mu_vec;   // mean direction of the vMF component
  real kappa_link;         // coefficient linking X to log(kappa)
}
model {
  vector[2] ps;
  for (n in 1:N) {
    ps[1] = log(theta[1]) + vmf_lpdf(y[n] | mu_vec', exp(X[n] * kappa_link));
    ps[2] = log(theta[2]) + log(1 / (4 * pi()));
    target += log_sum_exp(ps);
  }
}


The model above seems to work well. I want to extend it so that theta is a function of X (similar to kappa_link above), but I am not sure what the best way to do this is. In the model below, I tried to make theta a function of X using inv_logit:

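functions {
  real vmf_lpdf(row_vector y, row_vector mu, real kappa);
}
data {
  int<lower=0> N;   // number of observations
  vector[N] X;      // covariate (column vector of length N)
  matrix[N, 3] y;   // response: one 3-D unit vector per row
}
parameters {
  unit_vector[3] mu_vec;   // mean direction of the vMF component
  real kappa_link;         // coefficient linking X to log(kappa)
  real theta_link;         // coefficient linking X to logit(theta)
}
model {
  vector[2] ps;
  real theta;
  for (n in 1:N) {
    theta = inv_logit(X[n] * theta_link);
    ps[1] = log(theta) + vmf_lpdf(y[n] | mu_vec', exp(X[n] * kappa_link));
    ps[2] = log1m(theta) + log(1 / (4 * pi()));
    target += log_sum_exp(ps);
  }
}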


However, this does not give stable results. Even when I remove X[n] from the link function, leaving just inv_logit(theta_link), the results are not as stable as those from the first model, which uses a simplex.

What is a good way to link theta to X? If there are any other problems with my model, please let me know too. Thank you!
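One possible way to link theta to X, sketched under a few assumptions: the logit of the vMF weight gets both an intercept and a slope (with only a slope, theta is pinned to 0.5 wherever X = 0), both get proper priors, and `log_mix` replaces the manual `log_sum_exp`. The names `theta_intercept` and `theta_slope` and the normal(0, 1) priors are placeholders of mine, not part of the original model, and the Stan body of `vmf_lpdf` is a stand-in for the externally defined function (it assumes the standard 3-D von Mises-Fisher density).

```stan
functions {
  // Stand-in definition (assumption): log density of the 3-D von Mises-Fisher
  // distribution, kappa / (4 * pi * sinh(kappa)) * exp(kappa * mu . y),
  // written in a numerically stable form.
  real vmf_lpdf(row_vector y, row_vector mu, real kappa) {
    return log(kappa) - log(2 * pi()) - kappa - log1m_exp(-2 * kappa)
           + kappa * dot_product(mu, y);
  }
}
data {
  int<lower=0> N;   // number of observations
  vector[N] X;      // covariate
  matrix[N, 3] y;   // response: one 3-D unit vector per row
}
parameters {
  unit_vector[3] mu_vec;
  real kappa_link;
  real theta_intercept;   // hypothetical name: logit of the vMF weight at X = 0
  real theta_slope;       // hypothetical name: change in that logit per unit of X
}
model {
  // Placeholder priors (assumption); a flat prior on an unbounded logit is improper.
  theta_intercept ~ normal(0, 1);
  theta_slope ~ normal(0, 1);
  for (n in 1:N) {
    real theta_n = inv_logit(theta_intercept + theta_slope * X[n]);
    // log_mix(p, lp1, lp2) = log(p * exp(lp1) + (1 - p) * exp(lp2))
    target += log_mix(theta_n,
                      vmf_lpdf(y[n] | mu_vec', exp(X[n] * kappa_link)),
                      -log(4 * pi()));
  }
}
```

Whether normal(0, 1) is a sensible scale depends on how X is scaled, so standardizing X first is probably worth considering.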

I also tried the following model, but it segfaults when run:

functions {
  real vmf_lpdf(row_vector y, row_vector mu, real kappa);
}
data {
  int<lower=0> N;   // number of observations
  vector[N] X;      // covariate (column vector of length N)
  matrix[N, 3] y;   // response: one 3-D unit vector per row
}
parameters {
  unit_vector[3] mu_vec;   // mean direction of the vMF component
  real kappa_link;         // coefficient linking X to log(kappa)
  vector[2] theta_link;    // unconstrained inputs to softmax
}
model {
  vector[2] ps;
  vector[2] theta = softmax(theta_link);
  for (n in 1:N) {
    ps[1] = log(theta[1]) + vmf_lpdf(y[n] | mu_vec', exp(X[n] * kappa_link));
    ps[2] = log(theta[2]) + log(1 / (4 * pi()));
    target += log_sum_exp(ps);
  }
}


The error message is not that informative:

 *** caught segfault ***

Traceback:
1: .External(list(name = "CppMethod__invoke_notvoid", address = <pointer: 0x55c181128b90>,     dll = list(name = "Rcpp", path = "/home/satya/R/x86_64-pc-linux-gnu-library/4.0/Rcpp/libs/Rcpp.so",         dynamicLookup = TRUE, handle = <pointer: 0x55c185336f30>,         info = <pointer: 0x55c180294850>), numParameters = -1L),     <pointer: 0x55c1868b2e30>, <pointer: 0x55c183111740>, .pointer,     ...)
2: sampler$call_sampler(args_list[[i]])
3: doTryCatch(return(expr), name, parentenv, handler)
4: tryCatchOne(expr, names, parentenv, handlers[[1L]])
5: tryCatchList(expr, classes, parentenv, handlers)
6: tryCatch(expr, error = function(e) { call <- conditionCall(e) if (!is.null(call)) { if (identical(call[[1L]], quote(doTryCatch))) call <- sys.call(-4L) dcall <- deparse(call)[1L] prefix <- paste("Error in", dcall, ": ") LONG <- 75L sm <- strsplit(conditionMessage(e), "\n")[[1L]] w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w") if (is.na(w)) w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L], type = "b") if (w > LONG) prefix <- paste0(prefix, "\n ") } else prefix <- "Error : " msg <- paste0(prefix, conditionMessage(e), "\n") .Internal(seterrmessage(msg[1L])) if (!silent && isTRUE(getOption("show.error.messages"))) { cat(msg, file = outFile) .Internal(printDeferredWarnings()) } invisible(structure(msg, class = "try-error", condition = e))})
7: try(sampler$call_sampler(args_list[[i]]))
8: .local(object, ...)
9: sampling(model, data = list(X = X, N = N, y = y), chains = 1,     iter = 2000)
10: sampling(model, data = list(X = X, N = N, y = y), chains = 1,     iter = 2000)
An irrecoverable exception occurred. R is aborting now ...


Can someone help me with this please?

OK, I have solved the segfault issue by recompiling rstan (see the thread "Segfault when using unit_vector and softmax"), so I can now run the following model (A):

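functions {
  real vmf_lpdf(row_vector y, row_vector mu, real kappa);
}
data {
  int<lower=0> N;   // number of observations
  vector[N] X;      // covariate (column vector of length N)
  matrix[N, 3] y;   // response: one 3-D unit vector per row
}
parameters {
  unit_vector[3] mu_vec;   // mean direction of the vMF component
  real kappa_link;         // coefficient linking X to log(kappa)
  vector[2] theta_link;    // unconstrained inputs to softmax
}
model {
  vector[2] ps;
  vector[2] theta;
  theta_link ~ normal(0, 2);
  theta = softmax(theta_link);
  for (n in 1:N) {
    ps[1] = log(theta[1]) + vmf_lpdf(y[n] | mu_vec', exp(X[n] * kappa_link));
    ps[2] = log(theta[2]) + log(1 / (4 * pi()));
    target += log_sum_exp(ps);
  }
}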


However, its estimates are very different from those of model (B), which simply declares theta as a simplex:

functions {
  real vmf_lpdf(row_vector y, row_vector mu, real kappa);
}
data {
  int<lower=0> N;   // number of observations
  vector[N] X;      // covariate (column vector of length N)
  matrix[N, 3] y;   // response: one 3-D unit vector per row
}
parameters {
  simplex[2] theta;        // mixing proportions: [vMF, uniform noise]
  unit_vector[3] mu_vec;   // mean direction of the vMF component
  real kappa_link;         // coefficient linking X to log(kappa)
}
model {
  vector[2] ps;
  for (n in 1:N) {
    ps[1] = log(theta[1]) + vmf_lpdf(y[n] | mu_vec', exp(X[n] * kappa_link));
    ps[2] = log(theta[2]) + log(1 / (4 * pi()));
    target += log_sum_exp(ps);
  }
}


Model B, with the simplex, gives better estimates. How can I change model A so that its estimates are similar to those of B?

I believe the default prior on a 2-simplex is a Dirichlet with $\alpha = [1, 1]^\top$, which corresponds to a uniform distribution over the $[0, 1]$ interval. A normal(0, 2) prior on the softmax inputs, by contrast, induces a U-shaped ("horseshoe") prior on theta, with most of the mass near 0 and 1.
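To spell out why the two priors differ (assuming the 2 in normal(0, 2) is a standard deviation, as in Stan): with two independent softmax inputs $a, b \sim \mathcal{N}(0, 2^2)$, the first weight depends only on their difference,

$$\theta_1 = \frac{e^{a}}{e^{a} + e^{b}} = \operatorname{logit}^{-1}(a - b), \qquad a - b \sim \mathcal{N}(0, 8).$$

Changing variables with $z = \operatorname{logit}(\theta_1)$ and $dz/d\theta_1 = 1 / (\theta_1 (1 - \theta_1))$ gives

$$p(\theta_1) = \frac{1}{\theta_1 (1 - \theta_1)} \cdot \frac{1}{\sqrt{16 \pi}} \exp\!\left(-\frac{\operatorname{logit}(\theta_1)^2}{16}\right),$$

which puts far more density toward 0 and 1 than around 0.5. A uniform prior on $\theta_1$, on the other hand, corresponds to a standard logistic distribution on the logit scale:

$$z \sim \operatorname{Logistic}(0, 1): \quad p(z) = \frac{e^{-z}}{(1 + e^{-z})^2} = \theta_1 (1 - \theta_1) \;\implies\; p(\theta_1) = 1.$$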

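The following R snippet illustrates the difference:

library(data.table)
library(gtools)  # one package that provides rdirichlet()

softmax <- function(x) exp(x) / sum(exp(x))

# softmax of two independent normal(0, 2) draws per row
test <- data.table(a = rnorm(1000, mean = 0, sd = 2), b = rnorm(1000, mean = 0, sd = 2))
out_sm <- apply(test, 1, softmax)
hist(out_sm)

# draws from a Dirichlet with alpha = [1, 1]
out_dir <- rdirichlet(1000, c(1, 1))
hist(out_dir)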


[Figure: histogram of the softmax output for 1000 draws from two independent normal(0, 2) variables.]

[Figure: histogram of 1000 draws from a Dirichlet with alpha = [1, 1].]
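If the goal is for an unconstrained parameterization to reproduce the simplex model, one option is to keep a single free log-odds parameter and give it a logistic(0, 1) prior, since that implies a uniform prior on inv_logit of that parameter, which is the prior a bare simplex[2] carries. A minimal sketch, not tested against the original data: `theta_logit` is a hypothetical name, and the body of `vmf_lpdf` is a stand-in for the external function that assumes the standard 3-D von Mises-Fisher density.

```stan
functions {
  // Stand-in definition (assumption): 3-D von Mises-Fisher log density,
  // log(kappa / (4 * pi * sinh(kappa))) + kappa * mu . y, in a stable form.
  real vmf_lpdf(row_vector y, row_vector mu, real kappa) {
    return log(kappa) - log(2 * pi()) - kappa - log1m_exp(-2 * kappa)
           + kappa * dot_product(mu, y);
  }
}
data {
  int<lower=0> N;   // number of observations
  vector[N] X;      // covariate
  matrix[N, 3] y;   // response: one 3-D unit vector per row
}
parameters {
  unit_vector[3] mu_vec;
  real kappa_link;
  real theta_logit;   // hypothetical name: log-odds of the vMF component
}
model {
  // logistic(0, 1) on the log-odds implies uniform(0, 1) on the mixing weight.
  theta_logit ~ logistic(0, 1);
  for (n in 1:N) {
    target += log_mix(inv_logit(theta_logit),
                      vmf_lpdf(y[n] | mu_vec', exp(X[n] * kappa_link)),
                      -log(4 * pi()));
  }
}
generated quantities {
  real theta = inv_logit(theta_logit);   // vMF mixing weight, comparable to theta[1] in model B
}
```

Using one free logit rather than two softmax inputs also avoids the non-identifiability of softmax, where adding a constant to both inputs leaves theta unchanged.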
