BCCA_exponential family model


#1

I try to reproduce the code posted a while ago about latent dimension model. After correcting a few minor mistakes, I encounter the same problem with the user who posted the code. I wonder whether this problem has been solved or not? Why when the iteration reaches 300+, the estimates of A and B matrices will be filled with very small numbers close to 0 while Z matrix have very large values? I can see the product of ZA and ZB seem to meet with the truth, but the individual estimates for A, B and Z do not occur right to me. I wonder whether it is a problem of the code or the problem of the BCCA method itself?

# Create Simulated Data 
options(scipen=999)
N=100
n1=2
n2=3
d=2
set.seed(20004)
A=matrix(rnorm(n1*d),byrow=TRUE,ncol=n1)
set.seed(20013)
B=matrix(rnorm(n2*d),ncol=n2,byrow=TRUE)

d=nrow(A)
al0=.1
bet0=10
sig_te=1
sig_lb=1

library(MASS)
set.seed(2001)
Z <- mvrnorm(N, rep(1,d), diag(rep(1,d))) ## latent variable
set.seed(2003)
teta=Z%*%A + mvrnorm(nrow(Z), rep(0,ncol(A)), diag(rep(sig_te,ncol(A))))## construct the mean of the response X (counts)
set.seed(2004)
lambd=Z%*%B + mvrnorm(nrow(Z), rep(0,ncol(B)), diag(rep(sig_lb,ncol(B)))) ## construct the mean of the response X (counts)
X=apply(exp(teta),c(1,2),rpois,n=1) ## Simulate a matrix of counts
Y=apply(exp(lambd),c(1,2),rpois,n=1) ## Simulate a matrix of counts

plot(c(exp(lambd)),c(Y))
abline(0,1)


##### stan code
library(rstan)
library(parallel)
library(coda)
Bcca_mod <- '
data {
int<lower=1> N; // sample size
int<lower=1> n1;
int<lower=1> n2;
int<lower=1> d;
int X[N,n1];
int Y[N,n2];
}
parameters {
real<lower=0> bet[d];
real<lower=0> sig_te;
real<lower=0> sig_lb;
matrix[N,d] Z;
matrix[d,n1] A;
matrix[d,n2] B;
matrix<lower=0>[N,n1] teta;
matrix<lower=0>[N,n2] lambd;
}
transformed parameters {
matrix[N,n1] mu_te;
matrix[N,n2] mu_lb;
real<lower=0> sigte_sr;
real<lower=0> siglb_sr;
real<lower=0> beta_sqt[d];

mu_te <- Z*A;
mu_lb <- Z*B;
siglb_sr <- sqrt(sig_lb);
sigte_sr <- sqrt(sig_te);
for(i in 1:d)
beta_sqt[i] <- sqrt(bet[i]);
}
model {
bet ~ inv_gamma(1, 10);
sig_lb ~ scaled_inv_chi_square(1, .1) ;
sig_te ~ scaled_inv_chi_square(1, .1) ;
for(i in 1:d)
for(j in 1:n1)
A[i,j] ~ normal(0, beta_sqt[j]);

for(i in 1:d)
for(j in 1:n1)
B[i,j] ~ normal(0, beta_sqt[j]);

for(i in 1:N)
{
for(j in 1:n1)
{
log(teta[i,j]) ~ normal(mu_te[i,j], sigte_sr);
lp__ <- lp__ - log(fabs(teta[i,j]));
X[i,j] ~ poisson(teta[i,j]);
}
for(k in 1:n2)
{
log(lambd[i,k]) ~ normal(mu_lb[i,k], siglb_sr);
lp__ <- lp__ - log(fabs(lambd[i,k]));
Y[i,k] ~ poisson(lambd[i,k]);
}
}
}
'
bcca_dat<-list("X"=X,"Y"=Y,"n1"=n1,"n2"=n2,"N"=N,"d"=d)
fit <- stan(model_code = Bcca_mod, data = bcca_dat, thin=3, iter = 200, chains =2 ,pars=c("A","B","Z","sig_te","sig_lb","bet"))

## Output (Under iteration 300)
> A_hat
             [,1]          [,2]
[1,] -0.001098217 -0.0001571005
[2,]  0.001380701 -0.0010627120
> B_hat
             [,1]          [,2]         [,3]
[1,]  0.002245403 -0.0001312695 -0.002329139
[2,] -0.003623535 -0.0002732766  0.003810004
> Z_hat
              [,1]          [,2]
  [1,] -162.533006  -208.4626989
  [2,]   25.221645   -79.5542877
  [3,] -345.385609    24.3618739
  [4,]  680.458580  -793.9105445
  [5,]  343.162535  -391.7569286
  [6,]  278.840499  -374.5201578
  [7,]  170.331929  -548.6439287
  [8,]  339.339055   324.3885500
  [9,]   45.504859  -253.4634734
 [10,] -328.863567    93.7764410
 [11,]  261.060012  -721.3360895
 [12,]  -44.853617  -264.4209098
 [13,]  159.631155   -84.4126174
 [14,]  371.280534    12.3856587
 [15,]   57.965974  -202.6258143
 [16,]  303.591646  -119.2751380
 [17,]  -95.477896  -306.0774238
 [18,]  142.481589  -491.1263385
 [19,]   55.526641  -553.4491354
 [20,] -691.701114   -76.6924523
 [21,] -239.937470  -279.2170494
 [22,] -278.746118  -170.7140637

Anybody can answer this question? Thanks!

[edit: escaped code]


#2

My guess is that it’s because you only have priors on A and B, not on Z. When you multiply two elements, their scales aren’t identified other than by the prior, since you can divide one and multiply the other. Putting the prior on A and B will favor small values, whereas any values of Z are equal for an improper uniform prior.

Sometimes people put strong constraints on values like being unit length. Then the scale of one of the values is fully identified. Softer constraints tend to be a bit easier to compute with in most cases.


#3

Hi, Bob

Thank you so much for your comments. I tried to put a prior on Z and it solved the problem!