BCCA exponential family model

I tried to reproduce the code posted a while ago about a latent-dimension model. After correcting a few minor mistakes, I ran into the same problem as the user who posted the code, and I wonder whether it has been solved. Why, once the sampler reaches iteration 300 or so, are the estimates of the A and B matrices filled with values very close to 0, while the Z matrix has very large values? The products Z*A and Z*B seem to match the truth, but the individual estimates of A, B, and Z do not look right to me. Is this a problem with the code, or a problem with the BCCA method itself?

# Create Simulated Data 
options(scipen=999)
N=100
n1=2
n2=3
d=2
set.seed(20004)
A=matrix(rnorm(n1*d),byrow=TRUE,ncol=n1) ## d x n1 loading matrix
set.seed(20013)
B=matrix(rnorm(n2*d),ncol=n2,byrow=TRUE) ## d x n2 loading matrix

d=nrow(A)
al0=.1   ## prior hyperparameters (note: hard-coded separately in the Stan program below)
bet0=10
sig_te=1 ## residual variances for the latent linear predictors
sig_lb=1

library(MASS)
set.seed(2001)
Z <- mvrnorm(N, rep(1,d), diag(rep(1,d))) ## latent variable
set.seed(2003)
teta=Z%*%A + mvrnorm(nrow(Z), rep(0,ncol(A)), diag(rep(sig_te,ncol(A)))) ## log-mean of the response X (counts)
set.seed(2004)
lambd=Z%*%B + mvrnorm(nrow(Z), rep(0,ncol(B)), diag(rep(sig_lb,ncol(B)))) ## log-mean of the response Y (counts)
X=apply(exp(teta),c(1,2),rpois,n=1) ## simulate the matrix of counts X
Y=apply(exp(lambd),c(1,2),rpois,n=1) ## simulate the matrix of counts Y

plot(c(exp(lambd)),c(Y)) ## sanity check: Y against its Poisson mean
abline(0,1)


##### Stan code
library(rstan)
library(parallel)
library(coda)
Bcca_mod <- '
data {
  int<lower=1> N;               // sample size
  int<lower=1> n1;              // number of columns of X
  int<lower=1> n2;              // number of columns of Y
  int<lower=1> d;               // latent dimension
  array[N, n1] int<lower=0> X;  // observed counts
  array[N, n2] int<lower=0> Y;  // observed counts
}
parameters {
  array[d] real<lower=0> bet;   // per-dimension prior variances for A and B
  real<lower=0> sig_te;
  real<lower=0> sig_lb;
  matrix[N, d] Z;               // latent variables (note: no prior in the model block)
  matrix[d, n1] A;
  matrix[d, n2] B;
  matrix<lower=0>[N, n1] teta;  // Poisson means for X
  matrix<lower=0>[N, n2] lambd; // Poisson means for Y
}
transformed parameters {
  matrix[N, n1] mu_te = Z * A;
  matrix[N, n2] mu_lb = Z * B;
  real<lower=0> sigte_sr = sqrt(sig_te);
  real<lower=0> siglb_sr = sqrt(sig_lb);
  array[d] real<lower=0> beta_sqt;
  for (i in 1:d)
    beta_sqt[i] = sqrt(bet[i]);
}
model {
  bet ~ inv_gamma(1, 10);
  sig_lb ~ scaled_inv_chi_square(1, 0.1);
  sig_te ~ scaled_inv_chi_square(1, 0.1);

  // one prior scale per latent dimension (row); the original code looped the
  // columns of B over 1:n1, leaving part of B without any prior
  for (i in 1:d) {
    A[i] ~ normal(0, beta_sqt[i]);
    B[i] ~ normal(0, beta_sqt[i]);
  }

  for (i in 1:N) {
    for (j in 1:n1) {
      // lognormal is log(teta) ~ normal(mu, sigma) with the Jacobian included,
      // replacing the deprecated direct manipulation of lp__
      teta[i, j] ~ lognormal(mu_te[i, j], sigte_sr);
      X[i, j] ~ poisson(teta[i, j]);
    }
    for (k in 1:n2) {
      lambd[i, k] ~ lognormal(mu_lb[i, k], siglb_sr);
      Y[i, k] ~ poisson(lambd[i, k]);
    }
  }
}
'
bcca_dat<-list("X"=X,"Y"=Y,"n1"=n1,"n2"=n2,"N"=N,"d"=d)
fit <- stan(model_code = Bcca_mod, data = bcca_dat, thin=3, iter = 200, chains =2 ,pars=c("A","B","Z","sig_te","sig_lb","bet"))
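
The posterior means reported below as A_hat, B_hat, and Z_hat are not computed in the code above; a minimal sketch of one way to get them, assuming plain posterior means over the retained draws:

post <- rstan::extract(fit)            ## list of draw arrays, iterations first
A_hat <- apply(post$A, c(2, 3), mean)  ## d x n1 posterior mean
B_hat <- apply(post$B, c(2, 3), mean)  ## d x n2 posterior mean
Z_hat <- apply(post$Z, c(2, 3), mean)  ## N x d posterior mean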

## Output (around iteration 300)
> A_hat
             [,1]          [,2]
[1,] -0.001098217 -0.0001571005
[2,]  0.001380701 -0.0010627120
> B_hat
             [,1]          [,2]         [,3]
[1,]  0.002245403 -0.0001312695 -0.002329139
[2,] -0.003623535 -0.0002732766  0.003810004
> Z_hat
              [,1]          [,2]
  [1,] -162.533006  -208.4626989
  [2,]   25.221645   -79.5542877
  [3,] -345.385609    24.3618739
  [4,]  680.458580  -793.9105445
  [5,]  343.162535  -391.7569286
  [6,]  278.840499  -374.5201578
  [7,]  170.331929  -548.6439287
  [8,]  339.339055   324.3885500
  [9,]   45.504859  -253.4634734
 [10,] -328.863567    93.7764410
 [11,]  261.060012  -721.3360895
 [12,]  -44.853617  -264.4209098
 [13,]  159.631155   -84.4126174
 [14,]  371.280534    12.3856587
 [15,]   57.965974  -202.6258143
 [16,]  303.591646  -119.2751380
 [17,]  -95.477896  -306.0774238
 [18,]  142.481589  -491.1263385
 [19,]   55.526641  -553.4491354
 [20,] -691.701114   -76.6924523
 [21,] -239.937470  -279.2170494
 [22,] -278.746118  -170.7140637
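
The product check mentioned in the question can be sketched like this, comparing the reconstruction against the simulated truth:

## the products Z*A and Z*B are recovered even though the individual factors are not
plot(c(Z %*% A), c(Z_hat %*% A_hat)); abline(0, 1)
plot(c(Z %*% B), c(Z_hat %*% B_hat)); abline(0, 1)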

Can anybody answer this question? Thanks!


My guess is that it's because you only have priors on A and B, not on Z. When you multiply two factors, their scales aren't identified by the likelihood, only by the prior: you can multiply one factor by a constant and divide the other by the same constant without changing the product. The zero-mean priors on A and B favor small values for them, whereas under an improper uniform prior any values of Z are equally likely, so Z absorbs the scale.

Sometimes people put hard constraints on such quantities, like requiring a vector to be unit length; then the scale of one of the factors is fully identified. Softer constraints tend to be a bit easier to compute with in most cases.
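
To see the invariance concretely, here is a quick check in R with the simulated Z and A from above:

k <- 1000                                ## any nonzero rescaling constant
all.equal(Z %*% A, (k * Z) %*% (A / k))  ## TRUE: same product, wildly different factor scales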

Hi, Bob

Thank you so much for your comments. I tried to put a prior on Z and it solved the problem!
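
For anyone who runs into this later: the fix amounts to adding a prior on Z in the model block, for example a standard normal (the exact prior is a modeling choice, not necessarily the one used here):

// added to the model block: pins down the scale of Z against A and B
to_vector(Z) ~ normal(0, 1);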