# Vector bounds and sampling error

Hello,

I’m trying to estimate some parameters from the data, but I came across a bug that I don’t know how to deal with.
This piece of code alone may look stupid (especially, using this multiplier), but it’s an isolated part of a more complicated model, that has the same issue. For some reason, estimation of the transformed variable `R` fails because of the lower bound I set to it. `R` values shouldn’t get as low as 0, but in case they happen to be negative, I want to discard them by setting this lower bound.

``````R_data = c(0.22, 0.23, 0.26, 0.27, 0.3, 0.34, 0.13, 0.14, 0.21, 0.23,
0.29, 0.31, 0.18, 0.19, 0.27, 0.27, 0.33, 0.36, 0.31, 0.35, 0.38,
0.39, 0.41, 0.48)

N = length(R_data)

modelstring = "
data {
int N;
vector[N] R_data;
}
parameters {
vector[N] R_multiplier;
}
transformed parameters {
vector<lower=0>[N] R;
R = R_multiplier .* R_data;
}
model {
R_multiplier ~ normal(1, 0.25);
}
"

m = stan(model_code = modelstring2, data = list(N=N, R_data=R_data), seed = 1234)
``````

And this is the error I’m getting.

``````...
Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: validate transformed params: R is -0.41672, but must be greater than or equal to 0  (in 'model8e4e69b72ec7_3d89a0a08343181c3029026d11b0b9ba' at line 10)

Chain 1:
Chain 1: Initialization between (-2, 2) failed after 100 attempts.
Chain 1:  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
 "Error in sampler\$call_sampler(args_list[[i]]) : Initialization failed."
error occurred during calling the sampler; sampling not done
``````

which is unexpected, since if I remove the lower bound (change `vector<lower=0>[N] R;` to `vector[N] R;` in the `transformed parameters` section), I get the values for R that almost never gets lower than 0 (well, maybe some initial values can get down to -2, but not in 100 attempts in a row).

Could you please explain me why is it happening?

Evgeny

I guess the problem is that at initialisation all the `R_multiplier` elements are initalised between -2 and 2. Since there are 25 elements, you have a probability of 1- .5^25 that there is at least 1 negative one which throws the error (because all elements in `R_data` are positive). If you know that `R_data` is always positive you could enforce that in the data block and then to the same in the parameter block for `R_multiplier`, this will ensure that `R` is positive.

1 Like

I think the main issue is that you have the lower-bound constraint on the wrong parameter. By having the bound on `R`, the model is free to sample negative values of `R_multiplier` which will then subsequently be rejected in the assignment to `R` and cause the model to stumble.

If you instead put the lower bound on `R_multiplier`, the model will only sample non-negative values and values of `R` will not fall below 0. This will give the same result (only positive values of `R`), but in a more efficient way (as it does not risk sampling a negative value, getting rejected, and sampling again).

1 Like

Thank you,
Well, I can’t put a lower bound on `R_multiplier` since it’s gonna limit some of possible `R` values.
But I’ve got the point, I severely underestimated the probability of getting at least one negative in a vector of inits. I should rather specify a vector of customized init values.

Well, I can’t put a lower bound on `R_multiplier` since it’s gonna limit some of possible `R` values

Not really, since any negative `R_multiplier` values are just going to be rejected and then re-sampled until a positive value is returned. Unless you’re planning on passing negative values in `R_data`?

2 Likes

You’re right, thank you!