Runtime error for Multi-gaussian sampling


#1

Hi all,
I am trying to run through multi-gaussian linear model to pySTAN for the first time.
The actual data has a shape of (R, Q) and the ‘fit’ is a dot product of two matrices Theta(with shape of (R, P)) and Phi(with shape of (P,Q)) : Data ~ Theta*Phi = fit.
Here, I am trying to MCMC over all parameters in Theta and Phi, so I let STAN to MCMC over all free parameters vector (flattened Theta and Phi matrix) then in order to run through the likelihood function, I had to change vectors to be two matrices(Theta and Phi) back , since log likelihood ~ (Data-Fit)**2.

But I am getting runtime errors. Is there any way to fix this? Thanks!

CODE:

data {
    int<lower=0> N;  // number of flatten data points
    int<lower=0> M; // number of flatten parameters
    
    int<lower=0> P; // 
    int<lower=0> Q; // 
    int<lower=0> R; // 
    
    vector[N] y; // "flattened" data points
    matrix[N, N] err; // "data points; freqsflattened" Agnostic error matrix
}

parameters {
    vector[N] params; //fit value

}

transformed parameters {
    vector[R*P] theta; 
    vector[R*Q] phi;
    vector[N] flat_fit;
    
    matrix[R, P] theta_mat;
    matrix[P, Q] phi_mat;
    matrix[R, Q] fit;
    
    theta = params[:P*R]; //principal comp. param fit
    phi = params[P*R:]; // map coeff
    
    theta_mat = to_matrix(theta, R, P);
    phi_mat = to_matrix(phi, P, Q);
    
    
    fit <- (theta_mat * phi_mat);
    flat_fit = to_vector(fit);

}


model {
    // likelihood
    
    y ~multi_normal(flat_fit, err);
    
}
---------------------------------------------------------------------------
**errors**

RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/averyk/anaconda3/lib/python3.5/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/averyk/anaconda3/lib/python3.5/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "stanfit4anon_model_3492f6e30875d34f99ce74e725e664e5_4243185647824020417.pyx", line 368, in stanfit4anon_model_3492f6e30875d34f99ce74e725e664e5_4243185647824020417._call_sampler_star (/tmp/tmpeu4ohrw3/stanfit4anon_model_3492f6e30875d34f99ce74e725e664e5_4243185647824020417.cpp:7907)
  File "stanfit4anon_model_3492f6e30875d34f99ce74e725e664e5_4243185647824020417.pyx", line 401, in stanfit4anon_model_3492f6e30875d34f99ce74e725e664e5_4243185647824020417._call_sampler (/tmp/tmpeu4ohrw3/stanfit4anon_model_3492f6e30875d34f99ce74e725e664e5_4243185647824020417.cpp:8547)
RuntimeError: Initialization failed.

/home/averyk/anaconda3/lib/python3.5/site-packages/pystan/api.py in stan(file, model_name, model_code, fit, data, pars, chains, iter, warmup, thin, init, seed, algorithm, control, sample_file, diagnostic_file, verbose, boost_lib, eigen_lib, n_jobs, **kwargs)
    393                      sample_file=sample_file, diagnostic_file=diagnostic_file,
    394                      verbose=verbose, algorithm=algorithm, control=control,
--> 395                      n_jobs=n_jobs, **kwargs)
    396     return fit

/home/averyk/anaconda3/lib/python3.5/site-packages/pystan/model.py in sampling(self, data, pars, chains, iter, warmup, thin, seed, init, sample_file, diagnostic_file, verbose, algorithm, control, n_jobs, **kwargs)
    719         call_sampler_args = izip(itertools.repeat(data), args_list, itertools.repeat(pars))
    720         call_sampler_star = self.module._call_sampler_star
--> 721         ret_and_samples = _map_parallel(call_sampler_star, call_sampler_args, n_jobs)
    722         samples = [smpl for _, smpl in ret_and_samples]
    723 

/home/averyk/anaconda3/lib/python3.5/site-packages/pystan/model.py in _map_parallel(function, args, n_jobs)
     78             n_jobs = None
     79         with multiprocessing.Pool(processes=n_jobs) as pool:
---> 80             map_result = pool.map(function, args)
     81     else:
     82         map_result = list(map(function, args))

/home/averyk/anaconda3/lib/python3.5/multiprocessing/pool.py in map(self, func, iterable, chunksize)
    258         in a list that is returned.
    259         '''
--> 260         return self._map_async(func, iterable, mapstar, chunksize).get()
    261 
    262     def starmap(self, func, iterable, chunksize=None):

/home/averyk/anaconda3/lib/python3.5/multiprocessing/pool.py in get(self, timeout)
    606             return self._value
    607         else:
--> 608             raise self._value
    609 
    610     def _set(self, i, obj):

RuntimeError: Initialization failed.

[Edit: formatting code]


#2

This looks like a Python issue—no matter what you’re doing you shouldn’t be getting error messages back like that. I’m not sure if the initialization it’s talking about is in Stan or elsewhere. I wouldn’t be surprise if you fail to init with all that data munging in Stan—you should build it up by pieces and check it as you go. For instance, your matrix err needs to be a covariance matrix (positive definite, not just semi-definite) and P * R has to be smaller than N. The bounds ar einclusive, so you’re including the parameter at P * R exactly twice—so maybe phi wants to start one beyond.

@ariddell or @ahartikainen – any idea what’s going on with the error message here?


#3

In addition to what Bob said. Could it be that you are mixing up the N and M in some parts of the code? Based on your code comments, I would expect vector[M] params.


#4

The RemoteTraceback and its weird format comes probably either from with multiprocessing or your IDE. That multiprocessing format was temporary in either 2.15/2.16 release and replaced with try...finally. Please update your PyStan. If it’s IDE issue (IPython?), then try to run model from command line with python.


#5

As you said, err is the covariance matrix and PR < N. I am using PR to cut down the flatten arrays making first part Theta and second part Phi. I tried to move Phi one beyond but got the same error…


#6

Thanks good point. I fixed the code but have same problems…


#7

After updating my 2.15 version, I am getting the new error message:

RuntimeError: Exception: assign: Rows of left-hand-side (100) and rows of right-hand-side (12) must match in size (in ‘unkown file name’ at line 29)

I don’t understand why “fit <- (theta_mat * phi_mat)” gives out size of 12…
Shouldn’t “*” perform as np.dot in stan?

Thanks everyone for helpful comments!


#8

In Stan, * acts like matrix multiplication in mathematics. If a is a J x K matrix and b is K x L , then a * b is a J x L matrix. If they are vectors, they also do the right thing, such as a matrix multipled by a vector being a vector. All dimensions have to match in the usual way.


#9

Thank you so much for the clarification!