There is an interesting blog post from Uber engineering showing how to vectorize a Pyro model using `pyro.plate` and `pyro.mask`.

An non-optimized Stan version of the same model is given below.

In the Pyro version, `pyro.mask` allows each branch of the conditional in the `for` loop to be vectorized. Is there an equivalent approach in Stan?

Cheers,

Micky

``````functions {
vector softplus(vector x) {
return log1p(exp(x));
}
}

data {
int<lower=0> N;
vector[N] x;
vector[N] y_obs;
int<lower=0,upper=1> censored[N];
}

parameters {
real alpha;
real beta;
}

transformed parameters {
vector[N] scale;
scale = softplus(alpha * x + beta);
}

model {
alpha ~ normal(0,10);
beta ~ normal(0,10);

for(i in 1:N) {
if(censored[i]) {
censored[i] ~ bernoulli_lpmf(1.0 - exponential_cdf(y_obs[i],1/ scale[i]));
} else {
y_obs[i] ~ exponential(1/ scale[i]);
}
}
}

``````
1 Like

To answer my own question, you can achieve this by manually transforming the data into two vectors for censored and uncensored observations.

``````functions {
vector softplus(vector x) {
return log1p(exp(x));
}
}

data {
int<lower=0> N;
vector[N] x;
vector[N] y_obs;
int<lower=0,upper=1> censored[N];
}

transformed data {
int<lower=0> N_censored = sum(censored);
int<lower=0> N_uncensored = N - N_censored;

int<lower=1,upper=N_censored+1> c;
int<lower=1,upper=N_uncensored+1> u;

vector[N_censored] x_censored;
vector[N_censored] y_censored;
vector[N_uncensored] x_uncensored;
vector[N_uncensored] y_uncensored;

u = 1;
c = 1;
for(i in 1:N){
if(censored[i]){
x_censored[c] = x[i];
y_censored[c] = y_obs[i];
c += 1;
} else {
x_uncensored[u] = x[i];
y_uncensored[u] = y_obs[i];
u += 1;
}
}
}

parameters {
real alpha;
real beta;
}

transformed parameters {
vector[N_uncensored] scale_uncensored;
vector[N_censored] scale_censored;
vector[N_censored] q;

scale_uncensored = softplus(alpha * x_uncensored + beta);
scale_censored = softplus(alpha * x_censored + beta);

for(i in 1:N_censored){
q[i] = exponential_cdf(y_censored[i], 1 / scale_censored[i]);
}
}

model {
alpha ~ normal(0,10);
beta ~ normal(0,10);
1 ~ bernoulli(1 - q);
y_uncensored ~ exponential(1 ./ scale_uncensored);
}

``````

Note that I’m still looping in the `transformed data` block to populate `q`. I’m getting a dimension mismatch error when I the the following:

``````q = exponential_cdf(y_censored, rate_censored);
``````

`Dimension mismatch in assignment; variable name = q, type = vector; right hand side type = real.`

The functions reference indicates that cdf functions support vectorization so I’m clearly doing something wrong - any ideas?

Cheers,

Micky

1 Like

The message is telling you that `exponential_cdf` returns a single number, even if the parameters are vectors. The single number is the multiple of all `_cdf` calls for each of the vector elements. I am not sure why that is or what it is useful for. For the `_lpdf` functions the vectorized versions return a single number that is the sum of the log probabilities over the vector, which makes sense, so maybe this is by analogy?