Declaring constrained parameters

bhomass · August 10, 2017, 6:47am

What is the proper way to declare limits and its mean and variances. If I declare these constraints

real<lower=0,upper=1> x;

the sample from a gaussian
x ~ normal(0, 1); <- should this be centered around 0.5 instead like x ~ normal(0.5, 0.5)

jjramsey · August 10, 2017, 12:54pm

To be at least consistent, if you’ve specified parameter x as

real<lower=0,upper=1> x

then if you insist on using a normal distribution as a prior, then it should be a truncated normal like this, following page 82 of the Stan language manual:

x ~ normal(mu, sigma)T[0,1]

where mu and sigma are whatever suits the problem that you are trying to solve. Now you could set mu to zero and sigma to 1, but I doubt that’s a good idea, since such a truncated distribution would have far different behavior from an untruncated distribution with the same values of mu and sigma. Indeed, it’s not clear to me what you are trying to accomplish here.

dlakelan · August 10, 2017, 1:40pm

When the prior has constants for parameters as in normal(0,1) or whatever the truncated distribution and the un-truncated distribution are equivalent logically, because whatever the renormalization factor is, it is also constant.

but if you have parameters mu,sigma and you use normal(mu,sigma), then you need the truncation syntax or some other correction, because as mu,sigma change the renormalization factor changes as well.

bhomass · August 10, 2017, 6:20pm

Thanks for the explanations. Here is what I am doing. I generated some simulated data for which I know the parameters exactly. I am creating a model to recover these parameters from the data so I know the model is correct.

One of the parameters has mean of 0.6 with variance of 0.09. If I plugin p ~normal(0.6, 0.09), I am pretty sure I would get back a good estimate of the parameter value. However, when applying the model to experimental data, I would not know where to center the normal or know the variance and truncation limits. Setting wide limits seems to have devastating effects on parameter estimation. It is no where close to the input setting even after 2000 iterations. What would an experienced person do. Does it come down to best guess still and run very long iterations?

jjramsey · August 10, 2017, 6:40pm

First off, as a general rule, don’t use hard limits unless you know for sure that your parameter won’t exceed them; e.g. only use a lower bound of zero if you know your prior cannot be negative.

Second, I suspect that setting wide limits isn’t so much the cause of your problems with parameter estimation as it is smoking out other problems in your model and possibly your handling of data.

It would help if you could post your full Stan file, so we could have a better sense of what is wrong.

bhomass · August 12, 2017, 6:55am

here is a small section of my whole model, so its easy to explain what I want to do.

functions {
    real logitM(row_vector z, real alpha_tilda, vector gamma_array) {
		real expTerm;
		real m;
    	expTerm = exp(alpha_tilda + (z * gamma_array)); 
    	m = expTerm / (1 + expTerm);
    	return m;
  	}

 	/*
 	**	single time period, single user
 	*/    
	real phi(int C, real alpha_tilda, vector gamma_array, row_vector z) {
		int numberStates = 1;
    	real phiReal = 0;		
    	
  		real m = logitM(z, alpha_tilda, gamma_array);
  		phiReal = ( (C == 1)? m : (1 - m)) ;
      		
		return phiReal;
	}

	real iLogLikelihood_lpmf(int[] C_array, int T, real alpha_tilda, vector gamma_array, row_vector[] z_array){	
		real L_i;
		
		real middleReal; 		
		middleReal = phi(C_array[1], alpha_tilda, gamma_array, z_array[1]) ; //* q_array(0)s

		for(t in 2: T) {
			middleReal = phi(C_array[t], alpha_tilda, gamma_array, z_array[t]) * middleReal;
			if (C_array[t] == 1) break;		// converted. no need to go on
		}
		L_i = log(middleReal);
		return L_i;
	}
}
data {
	int N; // number of users
	int T; // number of days
	int K; // number of predictors. 
	
	int y[N, T]; // outcome
	row_vector[K] z[N, T]; // predictors 

	real alpha_tilda;
}
parameters {
	vector<lower=0,upper=1>[K] gamma;
}

model {
	for(n in 1:N) {
		y[n] ~ iLogLikelihood(T, alpha_tilda, gamma, z[n]);
	}
}

basically, the logit consists the coeffients gamma applied to input events z and added to alpha_tilda to. It is used derive probability of conversion event y.

This model takes 2 hours to run 6000 samples, which by itself is ok, but after I fill in remaining parts, it takes days to run. I would like to get advice whether this can be made to run more efficiently. Once I figure out how to speed this part up, I may be able to speed up the entire model.

bhomass · August 12, 2017, 8:33pm

A follow up question. This model works very well for higher parameter values where the conversion prob is relatively high. But for low conversion prob, the resolved parameters hover around 0.n, when the true values are around 0.00n. Is there any parameterization tricks to get accurate results in that case, besides feeding a huge size sample set which is not possible to collect in real life.

bhomass · August 13, 2017, 11:24pm

let me narrow down my question a bit more. Basically I am deriving the beta parameters in a logistic regression.
Exp(alpha + gamma * z) / 1+ exp(alpha + gamma * z)

Gamma is a vector w a small mean and variance.

My question is how best to declare the gamma variable. Ideally I would use a Gaussian prior, but I would not know what to put down as it’s mean. So instead I am using the default uniform prior. Would a Gaussian prior have converged faster? If so how would I declare a Gaussian prior wo knowing the mean?

The posted code has a time series added to the logistic regression, but the basic question is still on gamma parameters themselves.

bhomass · August 13, 2017, 11:57pm

This a good point to dive into a deeper discussion. I could give up on the constraints, and relie on a small sigma. However what would be a mu that is “appropriet for my problem” when the parameters are coefficients in a logistic regression? Mu is the very thing I am solving for.

Bob_Carpenter · August 14, 2017, 3:46pm

There’s an entire chapter of the manual devoted to regression that has examples. The usual assumption is zero centered.

bhomass · August 15, 2017, 5:35am

Thanks for pointing out that particular chapter. I read it and it makes sense. In the linear regression section, it discussed the use of QR reparameterization. It seems to me, I could apply that to my logistic regression model as well. Am I right on that?

Bob_Carpenter · August 15, 2017, 7:30pm

Yes.

bhomass · August 17, 2017, 9:32pm

ok I have a question following the line of thought from QR reparametrization and vectorization. In order to implement QR, I am converting the variables from

row_vector[K] x[N, T] ;
int y[N, 2, T];   to

matrix[N, K] x[T];     // there is a T dimenion, because I have a time series problem
vector[N] y[T];

The original loop
for(n in 1:N) {
y[n] ~ iLogLikelihood(alpha_tilda, z, gamma, N, T);
}
to
y ~ iLogLikelihoodf(alpha_tilda, z, gamma, N, T);

However, now I don’t know how to write out the inside of iLogLikelihood_lpdf. I tried write a lpdf function that returns a vector[N], but is rejected. But I don’t know how to convert the vector output that I get into a scalar before returning. Do I arbitrarily insert a sigma and force a normal distribution on y like the std_normal_lpdf example in the manual?

jonah · August 17, 2017, 11:06pm

I haven’t been following the whole thread, but I think this is probably because you need to return the sum of the elements in your vector, rather than the vector itself.

bhomass · August 17, 2017, 11:24pm

I can see the logic of using the product of the elements, but certainly don’t understand about using the sum.

jonah · August 17, 2017, 11:43pm

the total log likelihood is the sum of the individual log likelihoods. If
we weren’t working on the log scale it would be a product.

bhomass · August 18, 2017, 6:26am

ah yes, my mistake. I am thinking product of likelihood, or sum of log likelihood.

T_nick · March 7, 2019, 11:19am

Hello @Bob_Carpenter,

I need to use the value of a parameter as a constrain to a new parameter.
e.g
real<lower=0, upper=1> p[N];
real<lower=-p[N]> delta[N];

Is there a way to do so???

Many thanks

Bob_Carpenter · September 13, 2019, 1:57pm

I missed this in the general category. No, there’s not an easy way to do it automatically, but I explain how to do it in the user’s guide. You have to apply the Jacobian yourself.

We hope to extend the language soon to cover vectors of constraints like this.

Topic		Replies	Views
Effect of declaring truncated distributions for parameters which already have a range declared Modeling specification	3	696	April 26, 2022
Half-normal, Half-Cauchy and Half-t Modeling	8	12440	October 3, 2020
Restricting parameters based on data Modeling	2	695	December 10, 2018
Constraint checking and resulting distribution (truncated normal) Modeling truncated	1	941	December 8, 2021
Sampling from truncated normal distribution Modeling fitting-issues , specification	5	828	October 11, 2021

Declaring constrained parameters

Related topics