I don’t think I can really follow this. Could you post your current Stan code and describe where the data come from, what inferences you are trying to make, and which questions you are trying to answer?
My current code is the following:
data {
int<lower=0> N; // number of data points
int <lower=1> y[N]; // observations
matrix[N,4] x; // Covariates
real<lower=0> mi[N];
}
transformed data {
real<lower=0> Ami;
Ami=mean(mi);
}
parameters {
real<lower=0> ci; // any Positive number added on the mu[1] to get mu[2]
vector[4] beta;
real alpha;
vector<lower=0, upper=1>[N] pi; // mixing proportions; need to have pi's at observation level
}
transformed parameters{
vector<lower=0>[N] mu[2]; // locations of mixture components; need to have mu's for each observation
mu[1] = exp(alpha + x*beta);
mu[2] = exp((alpha + x*beta)+ci);
}
model {
// Priors
beta ~ normal(0, 1e1);
alpha ~ normal(0,1e1);
ci ~ normal(0, 1e1);
pi ~ beta(mi,Ami); // how can we change to Dirichlet distribution using mi and mci?
// Likelihood
for (i in 1:N)
target += log_mix(pi,
poisson_lpmf(y[i] | mu[1]) - log1m_exp(poisson_lpmf(0 | mu[1])),
poisson_lpmf(y[i] | mu[2]) - log1m_exp(poisson_lpmf(0 | mu[2])));
}
When I run the above code, I get this error:
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
No matches for:
log_mix(vector, real, real)
Available argument signatures for log_mix:
log_mix(real, real, real)
log_mix(real[], real[])
log_mix(real[], vector)
log_mix(real[], row vector)
log_mix(real[], vector[])
log_mix(real[], row vector[])
log_mix(vector, real[])
log_mix(vector, vector)
log_mix(vector, row vector)
log_mix(vector, vector[])
log_mix(vector, row vector[])
log_mix(row vector, real[])
log_mix(row vector, vector)
log_mix(row vector, row vector)
log_mix(row vector, vector[])
log_mix(row vector, row vector[])
error in 'modeld8755c72_ff542624806624524b7b715fc8665283' at line 56, column 82
-------------------------------------------------
54: target += log_mix(pi,
55: poisson_lpmf(y[i] | mu[1]) - log1m_exp(poisson_lpmf(0 | mu[1])),
56: poisson_lpmf(y[i] | mu[2]) - log1m_exp(poisson_lpmf(0 | mu[2])));
^
57:
-------------------------------------------------
Error in stanc(file = file, model_code = model_code, model_name = model_name, :
failed to parse Stan model 'ff542624806624524b7b715fc8665283' due to the above error.
In addition: Warning message:
In file.remove(tf) :
cannot remove file 'C:\Users\521-04\AppData\Local\Temp\Rtmpc35uis\filed82a886c34', reason 'Permission denied'
I understand that the problem here is that log_mix expects a real mixing proportion and will not accept a vector. But I need the mixing proportion at the observation level; how can I handle this?
FYI: when I change the mixing proportion to real<lower=0, upper=1> pi, it works properly and I get N values of mu[1] and mu[2], but then I cannot get the mixing proportion at the observation level.
You probably want
target += log_mix(pi[i], ...
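For reference, Stan’s log_mix(theta, lp1, lp2) is just the log of a two-component mixture density, log(theta * exp(lp1) + (1 - theta) * exp(lp2)), evaluated stably. A small Python sketch of that identity (all values hypothetical):

```python
import math

def pois_logpmf(y, mu):
    # log Poisson pmf: y*log(mu) - mu - log(y!)
    return y * math.log(mu) - mu - math.lgamma(y + 1)

def log_mix(theta, lp1, lp2):
    # log(theta*exp(lp1) + (1-theta)*exp(lp2)), via log-sum-exp for stability
    a, b = math.log(theta) + lp1, math.log1p(-theta) + lp2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

# hypothetical values for one observation i
theta_i, y_i, mu1, mu2 = 0.3, 4, 2.0, 6.0
stable = log_mix(theta_i, pois_logpmf(y_i, mu1), pois_logpmf(y_i, mu2))
direct = math.log(theta_i * math.exp(pois_logpmf(y_i, mu1))
                  + (1 - theta_i) * math.exp(pois_logpmf(y_i, mu2)))
print(abs(stable - direct) < 1e-12)  # True
```

With pi[i] as the first argument, each observation gets its own scalar mixing proportion, which is what the failing vectorized call was refusing to accept.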
Please look at the following Stan code. I have checked this code with a small data set and it is OK. But when I use my complete data (around 400,000 cases), it has not started the estimation yet (it has been five days now). Would you please give me any comments or ideas on how I can make it faster?
data {
int<lower=0> N; // number of data points
int <lower=1> y[N]; // observations
matrix[N,4] x; // Covariates
}
parameters {
real<lower=0> ci; // any Positive number added on the mu[1] to get mu[2]
real<lower=0> d; // will have uniform distribution that balances the shape parameters
vector[4] beta;
real alpha;
vector<lower=0, upper=1>[N] pi; // mixing proportions
}
transformed parameters{
vector<lower=0>[N] mu[2]; // locations of mixture components
mu[1] = exp(alpha + x*beta);
mu[2] = exp((alpha + x*beta)+ci);
}
model {
// Priors
beta ~ normal(0, 1e3);
alpha ~ normal(0,1e3);
d ~ uniform(0,10000);
ci ~ normal(0, 1e3);
// Likelihood
for (i in 1:N)
target += log_mix(pi[i],
poisson_lpmf(y[i] | mu[1]) - log1m_exp(poisson_lpmf(0 | mu[1])),
poisson_lpmf(y[i] | mu[2]) - log1m_exp(poisson_lpmf(0 | mu[2])));
}
It has kept showing only this message for the last five days:
SAMPLING FOR MODEL 'af252afcc7248409611d59094236d80a' NOW (CHAIN 1).
In no particular order:
- When you have d ~ uniform(0, 10000), you have to declare real<lower=0, upper=10000> d;. The model statement is then no longer necessary.
- In general, the Stan team discourages uniform priors. For d you are probably better off with a gamma- or exponential-style prior.
- Your priors are too wide, especially for alpha, beta, and ci. exp(400) overflows and exp(-400) underflows. See: Scaling of Log-likelihood value in Hamiltonian Monte Carlo.
- Irrespective of the numerical issues, I would try to use more informative priors. What scale do you expect alpha, beta, and ci to be on? Can you standardize the x’s to make the betas closer to a normal(0, 1) scale?
- You can work on the log scale with poisson_log_lpmf, which should help with numerical stability because you avoid the mu = exp() step.
- I don’t understand the Poisson distribution you are using, but I wonder whether there are ways to simplify the distributions in log_mix. mu[1] and mu[2] only differ by the term ci, and the Poisson is also relatively easy to think about on the log scale.
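To make the log-scale point concrete: poisson_log_lpmf takes eta = log(mu) directly, so mu is never formed. A Python sketch of why that matters numerically (eta value hypothetical and deliberately extreme):

```python
import math

def poisson_log_logpmf(y, eta):
    # Poisson log-pmf parameterized by eta = log(mu): y*eta - exp(eta) - log(y!)
    return y * eta - math.exp(eta) - math.lgamma(y + 1)

eta = -800.0
mu = math.exp(eta)                 # underflows to exactly 0.0
print(mu == 0.0)                   # True
lp = poisson_log_logpmf(2, eta)    # finite: about -1600.69
print(lp)
# computing y*log(mu) - mu - log(y!) would need log(0.0) here and fail;
# the log-scale parameterization never forms mu, so it avoids this entirely
```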
400 000 cases is probably just too much for Stan. While there might be some improvements to speed available, I don’t think you are likely to get Stan to work with that. The problem is not primarily the number of data points (although that is big), but the number of parameters. In a small test I did elsewhere sampling from 100 000 i.i.d. normal, which is probably the easiest model you can think of, takes 40 minutes on my PC. Your model is both more than 4 times larger and waaay more complex. A simple hierarchical model with 25 000 parameters took me about 1.5 hours, and the same model with 100 000 parameters didn’t complete in 16 hours, so that’s just to provide some context.
In my experience it is however OK to split such large datasets randomly into smaller chunks that are manageable for Stan, you just need to check that the estimates of the shared parameters are similar across the chunks (which they will be unless there is some problem with your model).
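A minimal sketch of that chunking idea (chunk count and seed are arbitrary; the fitting step itself is only described in comments):

```python
import random

random.seed(0)
N, n_chunks = 400_000, 8
idx = list(range(N))
random.shuffle(idx)                                   # random split avoids ordering artifacts
chunks = [idx[k::n_chunks] for k in range(n_chunks)]  # ~50,000 cases each
# fit the same Stan model on y[c], x[c] for each chunk c separately, then
# compare the posteriors of the shared parameters (alpha, beta, ci) across chunks
print([len(c) for c in chunks])
```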
Some stuff you can do to increase speed:
- Use poisson_log_lpmf or poisson_log_glm_lpmf to avoid the exponentiation.
- Put tighter priors on everything (especially the uniform prior on d might cause some trouble). I would be surprised if anything wider than normal(0, 5) is actually needed for anything but the intercept, provided your predictors are normalized (which is a good idea anyway).
- Check that you have enough memory - once you start swapping to disk, your performance is dead. Note that RStan will allocate memory to store all the posterior draws from your model (which I would guess to be around 18 GB/chain in your case). You can reduce that by 66% by moving mu from transformed parameters to model. You can use CmdStan to avoid storing draws in memory at all. You may also consider running fewer chains in parallel.
- Store the result of alpha + x*beta and reuse it for mu[2] (or preferably use poisson_log_lpmf and avoid storing mu at all).
- Provide good inits.
Actually, with the latest Stan additions you just might be able to push your model into a feasible region with all the data points, using map_rect for the likelihood loop (with MPI and/or threading) or offloading the x * beta multiplication to the GPU (since 2.19, not yet in rstan). I fear, however, that this might be non-trivial to get working, as the documentation is not quite there yet.
I don’t understand how I can implement this.
poisson_lpmf(y[i] | mu[1]) becomes poisson_log_lpmf(y[i] | alpha + x*beta)
Wait, something is off, I think. Should the original model not be poisson_lpmf(y[i] | mu[1, i]) and the new one poisson_log_lpmf(y[i] | alpha + x[i] * beta)? I am surprised that the original model worked given the mismatch in dimensions between y[i] and mu[1]. Are you recovering the parameters if you start from simulated data?
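One way to run that simulated-data check is to generate data from the assumed generative process and then see whether the fitted model recovers the “true” values. A hedged NumPy sketch, using a single standardized covariate instead of the model’s four and entirely made-up parameter values:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1000
alpha, beta1, ci, pi_true = 0.5, 0.8, 1.2, 0.4   # hypothetical true values
x = rng.normal(size=N)                            # one standardized covariate
group1 = rng.random(N) < pi_true                  # latent component membership
log_mu = alpha + beta1 * x + np.where(group1, 0.0, ci)
mu = np.exp(log_mu)

# draw from a Poisson truncated to y >= 1 by resampling the zeros
y = rng.poisson(mu)
while (y == 0).any():
    redo = y == 0
    y[redo] = rng.poisson(mu[redo])
print(y.min() >= 1)  # True: truncation respected
```

Feeding this (y, x) into the Stan model and checking that the posteriors cover alpha, beta1, ci, and pi_true is the recovery test being suggested.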
The original likelihood:
// Likelihood
for (i in 1:N)
target += log_mix(pi[i],
poisson_lpmf(y[i] | mu[1]) - log1m_exp(poisson_lpmf(0 | mu[1])),
poisson_lpmf(y[i] | mu[2]) - log1m_exp(poisson_lpmf(0 | mu[2])));
the dimension of y[i] and mu[1] is similar, [N], and as you said, poisson_lpmf(y[i] | mu[1]) becomes poisson_log_lpmf(y[i] | alpha + x*beta). What about the log1m_exp(poisson_lpmf(0 | mu[1])) part? Will it have the form log1m_exp(poisson_log_lpmf(0 | alpha + x*beta))?
That is not correct, I think. y is of length N, but y[i] is of length 1.
I understand your concern. I wanted to say that y and mu have the same dimension. But you are right to ask how poisson_lpmf(y[i] | mu[1]) was working.
Now I will use
for (i in 1:N)
target += log_mix(pi[i],
poisson_log_lpmf(y[i] | alpha + x*beta) - log1m_exp(poisson_log_lpmf(0 | alpha + x*beta)),
poisson_log_lpmf(y[i] | alpha + x*beta+ci) - log1m_exp(poisson_log_lpmf(0 | alpha + x*beta+ci)));
Therefore, will your concern have an effect on this analysis? I think the above expression is similar to what you stated before, poisson_lpmf(y[i] | mu[1, i]).
I also have one question. Is there any option to express mu[1] < mu[2]? FYI: in my case I used mu[2] = mu[1] + ci.
I think you are fine with this approach. For more complicated models, you could use the ordered type.
I think what you need is poisson_lpmf(y[i] | alpha + x[i] * beta), otherwise you are using all the data x for every single outcome y[i]. Maybe that is the goal in your analysis, but it implies that you have the same expected value for each y[i].
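The underlying issue is that Stan’s vectorized lpmf with a scalar outcome and a vector parameter returns the sum of the elementwise log-pmfs. A small Python illustration with a hypothetical four-element mu vector:

```python
import math

def pois_logpmf(y, mu):
    # log Poisson pmf: y*log(mu) - mu - log(y!)
    return y * math.log(mu) - mu - math.lgamma(y + 1)

mu_vec = [1.0, 2.0, 3.0, 4.0]   # plays the role of the vector mu[1]
y_i = 3                          # one scalar outcome, y[i]

# Stan's vectorized poisson_lpmf(y[i] | mu[1]) sums over ALL components:
summed = sum(pois_logpmf(y_i, m) for m in mu_vec)
# what the per-observation model means: this observation's own mean only
own = pois_logpmf(y_i, mu_vec[2])   # mu[1, i] for the 3rd observation (0-based)
print(summed, own)  # -7.63 vs -1.50 (rounded): very different likelihoods
```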
I don’t understand this statement. Let me justify the model. It is a mixture of truncated Poissons (it was zero-truncated, but now I have changed it to one-truncated). Each observation has probability pi[i] of being in the first group and 1 - pi[i] of being in the second group. Then, after I get pi from the posterior, I will categorize each observation according to its pi value (e.g., below 0.6 it will be categorized in group 1, and so on). As I mentioned above, the model has changed to one-truncated. Therefore, can I write it like this?
for (i in 1:N)
target += log_mix(pi[i],
poisson_lpmf(y[i]|mu[1,i])-log1m_exp(poisson_lpmf(0|mu[1,i]))-log1m_exp(poisson_lpmf(1|mu[1,i])),
poisson_lpmf(y[i]|mu[2,i])-log1m_exp(poisson_lpmf(0|mu[2,i]))-log1m_exp(poisson_lpmf(1|mu[2,i])));
I think this should work for well behaved data.
I would also use poisson_log_lpmf instead of poisson_lpmf, tighter priors, and define mu in the model block. You could still run into the memory issues that Martin explained above.
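One numerical aside worth checking on your own data’s scale (hypothetical mu below): the exact normalizer for y > 1 truncation is 1 − P(0) − P(1), whereas subtracting two separate log1m_exp terms divides by (1 − P(0))(1 − P(1)), which is close but not identical:

```python
import math

mu = 2.0                                        # hypothetical rate
p0 = math.exp(-mu)                              # P(0)
p1 = mu * math.exp(-mu)                         # P(1)
lp3 = 3 * math.log(mu) - mu - math.lgamma(4)    # log P(3)

# exact normalizer for y > 1: divide by 1 - P(0) - P(1)
exact = lp3 - math.log(1 - p0 - p1)
# two separate log1m_exp corrections divide by (1 - P(0)) * (1 - P(1)) instead
separate = lp3 - math.log(1 - p0) - math.log(1 - p1)
print(round(exact, 3), round(separate, 3))  # -1.191 vs -1.251: not the same
```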
Would you please show me how I can write this in my case?
data {
int<lower=0> N; // number of data points
int <lower=1> y[N]; // observations
matrix[N,4] x; // Covariates
vector[N] z; //shape parameter
real<lower=0> Az; //shape parameter
}
parameters {
real<lower=0> ci; // any Positive number added on the mu[1] to get mu[2]
real<lower=0> d; // will have uniform distribution that balances the shape parameters
vector[4] beta;
real alpha;
vector<lower=0, upper=1>[N] pi; // mixing proportions
}
transformed parameters{
}
model {
vector[N] mu[2]; // log-scale locations of mixture components (local variables cannot have constraints)
mu[1] = alpha + x*beta;
mu[2] = mu[1] + ci;
// Priors
beta ~ normal(0, 2.5); // adapt to your specific knowledge
alpha ~ normal(0,2.5);
d ~ normal(0,5);
ci ~ normal(0, 2.5);
pi ~ beta(z*d,Az*d); // a beta(100, 100) places practically all probability mass between .4 and .6
// Likelihood
for (i in 1:N)
target += log_mix(pi[i],
poisson_log_lpmf(y[i] | mu[1, i]) - log1m_exp(poisson_log_lpmf(0 | mu[1, i])),
poisson_log_lpmf(y[i] | mu[2, i]) - log1m_exp(poisson_log_lpmf(0 | mu[2, i])));
}
I haven’t tested it, but this is a sensible implementation if I understand you correctly.
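The claim in the prior comment about beta(100, 100) is easy to verify by simulation (a quick stdlib check):

```python
import random

random.seed(0)
# probability mass of a Beta(100, 100) prior between 0.4 and 0.6, by Monte Carlo
draws = [random.betavariate(100, 100) for _ in range(100_000)]
inside = sum(0.4 < d < 0.6 for d in draws) / len(draws)
print(inside)  # about 0.995
```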
Thank you @stijn for your constructive comments. The command is working now. Let me show you some values of y and mu[1]. Do you think the estimated Poisson rate parameter is healthy? It seems underestimated.
y mu[1]
1 104 9.17
2 83 10.77
3 78 9.07
4 60 7.39
5 45 4.82
6 45 8.69
7 41 4.15
8 2 0.70
9 2 0.73
10 2 0.41
11 2 0.84
12 2 0.62
13 2 0.96
14 2 0.73
15 2 0.74
16 2 1.22
Therefore, would you please give me some suggestions about it? Also, is it OK if I use truncation at 1, (y > 1):
for (i in 1:N)
target += log_mix(pi[i],
poisson_lpmf(y[i]|mu[1,i])-log1m_exp(poisson_lpmf(0|mu[1,i]))
-log1m_exp(poisson_lpmf(1|mu[1,i])),
poisson_lpmf(y[i]|mu[2,i])-log1m_exp(poisson_lpmf(0|mu[2,i]))
-log1m_exp(poisson_lpmf(1|mu[2,i])));
and please look at my thread on the estimated parameters here.