Hi,
Two responses (percentages) A and B should make a sum of 100%. But they don’t as there is always a missing part M, like M = 1-Y1-Y2. Although Y1 & Y2 are measured M is not so I’m thinking that I should treat it as parameter. Both Y1 & Y2 have the same predictors and I would like to get the predictors effect on Y1, Y2 and M.
Beta regression seems to be appropriate and I have the below code.
Essentially is the code found here just replicated for two responses Y1 and Y2, and the parameter M.
I have M in the transformed parameters
block where I say that M = 1-Y1-Y2
The model run quickly and smoothly but the M_res have an n_eff = 2 and all the parameters I have declared have the same estimates e.g. beta_prior[1] = Y1_beta[1] = Y2_beta[1] = M_beta[1]
data {
int<lower=1> N; // sample size
int<lower=1> K; // K predictors
vector<lower=0,upper=1>[N] Y1_res; // responses
vector<lower=0,upper=1>[N] Y2_res;
matrix[N,K] X; // predictor matrix
}
parameters {
vector[K] beta_prior; // reg coefficients
vector<lower=0>[K] phi_prior; // dispersion parameter
}
transformed parameters{
vector[K] Y1_beta = beta_prior * 1;
vector[K] Y1_betY1_phi= phi_prior * 1;
vector[K] Y2_beta = beta_prior * 1;
vector[K] Y2_betY1_phi= phi_prior * 1;
vector[K] M_beta = beta_prior * 1;
vector[K] M_betY1_phi = phi_prior * 1;
// Leftover amount is a parameter C = 1 - A - B
vector<lower=0,upper=1>[N] M_res = 1 - Y1_res - Y2_res;
}
model {
vector[N] Y1_LP; // linear predictor
vector[N] Y1_LP_phi;
vector[N] Y1_mu; // transformed linear predictor
vector[N] Y1_phi;
vector[N] Y1_A; // parameter for beta distn
vector[N] Y1_B; // parameter for beta distn
vector[N] Y2_A;
vector[N] Y2_B;
vector[N] Y2_LP;
vector[N] Y2_LP_phi;
vector[N] Y2_mu;
vector[N] Y2_phi;
vector[N] M_A;
vector[N] M_B;
vector[N] M_LP;
vector[N] M_LP_phi;
vector[N] M_mu;
vector[N] M_phi;
// A part
Y1_LP = X * Y1_beta;
Y1_LP_phi = X * Y1_betY1_phi;
for (i in 1:N) {
Y1_mu[i] = inY1_logit(Y1_LP[i]);
}
for (i in 1:N) {
Y1_phi[i] = exp(Y1_LP_phi[i]);
}
Y1_A = Y1_mu .* Y1_phi;
Y1_B = (1.0 - Y1_mu) .* Y1_phi;
// B part
Y2_LP = X * Y2_beta;
Y2_LP_phi = X * Y2_betY1_phi;
for (i in 1:N) {
Y2_mu[i] = inY1_logit(Y2_LP[i]);
}
for (i in 1:N) {
Y2_phi[i] = exp(Y2_LP_phi[i]);
}
Y2_A = Y2_mu .* Y2_phi;
Y2_B = (1.0 - Y2_mu) .* Y2_phi;
// Missing C
M_LP = X * M_beta;
M_LP_phi = X * M_betY1_phi;
for (i in 1:N) {
M_mu[i] = inY1_logit(M_LP[i]);
}
for (i in 1:N) {
M_phi[i] = exp(M_LP_phi[i]);
}
M_A = M_mu .* M_phi;
M_B = (1.0 - M_mu) .* M_phi;
// priors
beta_prior ~ normal(0, 1);
phi_prior ~ normal(0, 1);
// likelihood
Y1_res ~ beta(Y1_A, Y1_B);
Y2_res ~ beta(Y2_A, Y2_B);
M_res ~ beta(M_A, M_B);
}
I think a causal diagram should be like the following (X are predictors in randomized experiments)
X → Y1 → M
X → Y2 → M
So, M seems to be a collider but I hope that since I treat it as parameter I can get away of this (or might not).
Maybe a better DAG is one with the measurement error like that , and in this case it should be
M_real = 1 - Y1_real - Y2_real
Working with normal distribution seems straightforward but with Beta I have no clue how this could work out.
Any ideas of how to treat the response M as parameter and estimate the predictors effect on Y1, Y2 and M it would be great (or any other feedback).
Thanks