The model:
Consider the vector \vec{v}=(v_x,v_y), whose normalized form is \hat{v}=\frac{1}{\sqrt{v_x^2+v_y^2}}(v_x,v_y). Now consider a situation in which you have N=300 such vectors, with the i^{th} vector computed in the following way:
v_x^{(i)}=\mathcal{N}\left(\alpha C^{(i,x)}_{\alpha}+\beta C^{(i,x)}_{\beta},\,\sigma \right) \tag{1}
v_y^{(i)}=\mathcal{N}\left(\alpha C^{(i,y)}_{\alpha}+\beta C^{(i,y)}_{\beta},\,\sigma \right), \tag{2}
where C^{(i,x)}_{\alpha}, C^{(i,x)}_{\beta}, C^{(i,y)}_{\alpha}, and C^{(i,y)}_{\beta} are precomputed coefficients that depend on both the component of the vector and the vector under consideration, hence the i. Moreover, \alpha and \beta are the two internal parameters and \sigma is the noise parameter, all of which I wish to infer with Stan.
To recap, the Stan model will take in all the C coefficients above as four distinct vectors of size N, as well as the actual data v_x and v_y in the form of vectors of size N. Its job is to infer the parameters \alpha, \beta, and \sigma. For testing purposes, I am feeding Stan data generated with \alpha=0.3, \beta=1.5, and \sigma=0.1.
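For reference, the test data can be generated with a short script like the following. The C coefficients here are stand-in random draws, since the actual precomputed values don't matter for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 300
alpha_true, beta_true, sigma_true = 0.3, 1.5, 0.1

# Stand-in coefficients; in the real problem these are precomputed.
C_ax, C_bx = rng.normal(size=N), rng.normal(size=N)
C_ay, C_by = rng.normal(size=N), rng.normal(size=N)

# Eqs. (1) and (2): noisy, non-normalized vector components.
vx = rng.normal(alpha_true * C_ax + beta_true * C_bx, sigma_true)
vy = rng.normal(alpha_true * C_ay + beta_true * C_by, sigma_true)
```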
The issue:
When I feed Stan data generated precisely according to Eqs. (1) and (2), everything runs perfectly fine: Stan converges beautifully on the internal and noise parameters I set. When tackling the real problem, however, I will be dealing with normalized vectors coming in as input data. In my test scenario, this translates to using Eqs. (1) and (2), computing \hat{v}^{(i)} from them, and feeding that to Stan. In this case, Stan needs to infer the parameters from heterogeneously normalized data, noting that each vector has a different normalization factor.
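For a single vector, the normalization step is just (toy numbers for illustration):

```python
import math

# Toy example vector, e.g. v = (3, 4).
vx_i, vy_i = 3.0, 4.0
n = math.sqrt(vx_i**2 + vy_i**2)      # per-vector normalization factor, here 5
vx_hat, vy_hat = vx_i / n, vy_i / n   # hat{v} = (0.6, 0.8), unit length
print(vx_hat, vy_hat)
```

Each of the N vectors gets its own factor n, which is what I mean by "heterogeneously normalized".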
I’ve tried two different approaches here in my Stan code:

Point #1: Compute the mean values in Stan from the C coefficients according to Eqs. (1) and (2), compute their magnitude, and normalize them; then apply the normal distribution to the normalized values. I have ensured that the incoming data, too, has its noise added after normalization. This approach results in almost all iterations raising the message that the Metropolis proposal is being rejected because, in normal_lpdf, the scale parameter is negative instead of >0. I know what this means, but I cannot trace how or why I'm causing it.
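For clarity, the data generation for this variant (noise added after normalization, again with stand-in coefficients) looks like:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 300
alpha_true, beta_true, sigma_true = 0.3, 1.5, 0.1

# Stand-in coefficients, as before.
C_ax, C_bx = rng.normal(size=N), rng.normal(size=N)
C_ay, C_by = rng.normal(size=N), rng.normal(size=N)

# Noiseless means, normalized per vector.
mx = alpha_true * C_ax + beta_true * C_bx
my = alpha_true * C_ay + beta_true * C_by
norm = np.sqrt(mx**2 + my**2)
mx_hat, my_hat = mx / norm, my / norm

# Noise is added after normalization, matching what the Stan model assumes.
vx = rng.normal(mx_hat, sigma_true)
vy = rng.normal(my_hat, sigma_true)
```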

Point #2: Introduce a third parameter, the normalization, as a vector of size N. The idea here is that Stan will compute the non-normalized vectors and, in trying to infer \alpha and \beta, will also infer the appropriate normalization for each vector. To my surprise, this method produces garbage, although it runs just fine with \hat{R} \approx 1.
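One property of this parameterization worth noting (a quick numeric check with stand-in coefficients, not part of the model itself): rescaling \alpha and \beta by any constant c while dividing every norms[i] by c produces identical predicted vectors, so the data alone cannot pin down the overall scale:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 300
C_ax, C_bx = rng.normal(size=N), rng.normal(size=N)
C_ay, C_by = rng.normal(size=N), rng.normal(size=N)

alpha, beta = 0.3, 1.5
norms = rng.uniform(0.5, 2.0, size=N)

def predict(a, b, n):
    # Point #2 mean structure: n_i * (a * C_a + b * C_b), componentwise.
    return n * (a * C_ax + b * C_bx), n * (a * C_ay + b * C_by)

c = 7.0
vx1, vy1 = predict(alpha, beta, norms)
vx2, vy2 = predict(c * alpha, c * beta, norms / c)
print(np.allclose(vx1, vx2) and np.allclose(vy1, vy2))  # True
```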
I seek:
1. To understand how and why my modeling is causing the issues discussed in Points #1 and #2.
2. A way to reparametrize the model so that this inference can be carried out on heterogeneously normalized data.
My code:
If you end up running things, the Point #1 code should be commented out when running Point #2, and vice versa.
data {
  int N; // number of examples in total (300)
  vector[N] vx;
  vector[N] vy;
  vector[N] alpha_coeff_x;
  vector[N] beta_coeff_x;
  vector[N] alpha_coeff_y;
  vector[N] beta_coeff_y;
}
transformed data {
}
parameters {
  real alpha;
  real beta;
  real norms[N]; // used in Point #2 only
  real sigma;
}
model {
  // true normalized values
  vector[N] vx_true;
  vector[N] vy_true;
  real norm = 0.; // used in Point #1 only

  // Point #1: comment out when running Point #2
  for (i in 1:N) {
    vx_true[i] = alpha * alpha_coeff_x[i] + beta * beta_coeff_x[i];
    vy_true[i] = alpha * alpha_coeff_y[i] + beta * beta_coeff_y[i];
    norm = sqrt(vx_true[i] * vx_true[i] + vy_true[i] * vy_true[i]);
    vx_true[i] /= norm;
    vy_true[i] /= norm;
  }

  // Point #2: comment out when running Point #1
  for (i in 1:N) {
    vx_true[i] = norms[i] * (alpha * alpha_coeff_x[i] + beta * beta_coeff_x[i]);
    vy_true[i] = norms[i] * (alpha * alpha_coeff_y[i] + beta * beta_coeff_y[i]);
  }

  vx ~ normal(vx_true, sigma);
  vy ~ normal(vy_true, sigma);
}
generated quantities {
}
generated quantities {
}