How to calculate log posterior probability in transformed parameters block

I am currently working on a state-space model that takes spatial structure into account. For the vector r, the model builds in the assumption that r values in adjacent regions are similar to each other.
I would like to compute the log posterior probability for the response variable Y and for r, respectively. Based on my colleague’s advice, I have written the code as follows:

data {
  int<lower=0> N_data;
  int<lower=0> N_region;
  int<lower=0> N_subject;
  vector[N_data] Y;
  vector[N_data] Age;
  vector[N_data] Ses;
  vector[N_data] Sex;
  vector[N_data] Smoke;
  int RID[N_data];
  int SID[N_data];
  int I;
  int<lower=1, upper=N_region> From[I];
  int<lower=1, upper=N_region> To[I];
}

parameters {
  vector[N_region] r;
  vector[N_region] beta1;
  vector[N_region] beta2;
  vector[N_region] beta3;
  vector[N_region] beta4;
  vector[N_subject] beta_ind;

  real<lower=0> s_r;
  real<lower=0> s_Y;
  real<lower=0> s_beta1;
  real<lower=0> s_beta2;
  real<lower=0> s_beta3;
  real<lower=0> s_beta4;
  real<lower=0> s_beta_ind;
}

transformed parameters {
  vector[I] lp_r;        // one term per edge of the adjacency graph
  vector[N_data] lp_Y;   // one term per observation
  for (i in 1:I){
    lp_r[i] = log(1.0/N_region) + normal_lpdf(r[To[i]]|r[From[i]], s_r);
  }
  for (i in 1:N_data){
    lp_Y[i] = log(1.0/N_data) + student_t_lpdf(Y[i]|3, r[RID[i]]+beta1[RID[i]]*Age[i]+beta2[RID[i]]*Ses[i]+beta3[RID[i]]*Sex[i]+beta4[RID[i]]*Smoke[i]+beta_ind[SID[i]], s_Y);
  }
}

model {
  target += normal_lpdf(r[To]|r[From], s_r);
  for (n in 1:N_data){
    Y[n] ~ student_t(3, r[RID[n]]+beta1[RID[n]]*Age[n]+beta2[RID[n]]*Ses[n]+beta3[RID[n]]*Sex[n]+beta4[RID[n]]*Smoke[n]+beta_ind[SID[n]], s_Y);
  }

  beta1 ~ student_t(3, 0, s_beta1);
  beta2 ~ student_t(3, 0, s_beta2);
  beta3 ~ student_t(3, 0, s_beta3);
  beta4 ~ student_t(3, 0, s_beta4);
  beta_ind ~ normal(0, s_beta_ind);

  s_Y ~ normal(0, 0.01);
}

However, this code does not seem correct. First, if r[k] were a categorical variable satisfying r \in \{1, 2, \ldots, N_{region}\}, the term log(1.0/N_region) would be necessary, but r is actually continuous. Second, if no prior distribution were set for any parameter, the log posterior probability would equal the log-likelihood, which I think can be expressed by a function ending in _lpdf; but we do have prior distributions for beta1, beta2, beta3, beta4, and beta_ind. How should I account for these priors?

Any minor comments are welcome. Thank you in advance!

Hi, @ykogr. It’s very hard to read a model and judge a user’s intent from the code alone.

I think there may be some confusion here. Y and the covariates are declared as data, but the posterior is a distribution over your parameters, not your data.

The priors look OK the way they are. The prior is just part of the joint density, which is proportional to the posterior by Bayes’s rule, so everything just goes in together.

I’m unclear on why you’re calculating the lp_r and lp_Y variables in the transformed parameters block and then never using them. You could cut out that whole block and the behavior of your Stan program wouldn’t change. Or you could replace the first five lines of your model block with

target += lp_r;
target += lp_Y;

to use the values you compute here.

I’m not sure what the log(1 / N_data) (equivalently, -log(N_data)) terms are doing. They’re not needed for the distributions of r or Y, which are already given well-formed normal and Student-t models.
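To see numerically why such constants are harmless, here is a sketch in plain Python (not Stan; the toy normal density and values are my own, not from the model above): shifting a log density by -log(N) shifts it by the same amount at every parameter value, so relative probabilities, and hence the posterior shape, are unchanged.

```python
import math

def normal_lpdf(x, mu, sigma):
    # log density of Normal(mu, sigma) evaluated at x
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma) - 0.5 * math.log(2 * math.pi)

N_data = 100
xs = [-1.0, 0.0, 2.5]

# log density with and without a constant -log(N_data) shift
lp_plain = [normal_lpdf(x, 0.0, 1.0) for x in xs]
lp_shift = [math.log(1.0 / N_data) + normal_lpdf(x, 0.0, 1.0) for x in xs]

# the difference is the same constant at every x, so it drops out
# of any ratio of densities (and hence out of MCMC)
diffs = [a - b for a, b in zip(lp_plain, lp_shift)]
print(diffs)
```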

You’re going to have trouble fitting with the centered parameterization you’re using. Given this usage

beta1 ~ student_t(3, 0, s_beta1);

I’d recommend declaring as

vector<multiplier = s_beta1>[N_region] beta1;

to convert this to a non-centered parameterization.
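For what it’s worth, the multiplier declaration does the same job as the manual non-centered parameterization beta1 = s_beta1 * beta1_raw with a unit-scale prior on beta1_raw. A quick sanity check of that scaling equivalence (Python, using a normal instead of a Student-t purely for illustration; the scale 2.5 is a made-up value):

```python
import random
import statistics

random.seed(0)
s_beta = 2.5  # hypothetical scale value

# centered: draw beta directly from Normal(0, s_beta)
centered = [random.gauss(0.0, s_beta) for _ in range(200_000)]

# non-centered: draw beta_raw from Normal(0, 1), then scale by s_beta
noncentered = [s_beta * random.gauss(0.0, 1.0) for _ in range(200_000)]

# both have the same distribution; the sampler just sees
# better-conditioned geometry in the non-centered version
print(round(statistics.stdev(centered), 1), round(statistics.stdev(noncentered), 1))
```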

This kind of thing can be rough:

  s_Y ~ normal(0, 0.01);

if you don’t custom-initialize s_Y to something on scale, because otherwise it will be initialized by default between exp(-2) and exp(2): uniform(-2, 2) on the unconstrained scale, with an exp transform to satisfy the positivity constraint.
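To put numbers on that mismatch (Python, illustration only): the default uniform(-2, 2) initialization on the unconstrained scale maps through exp to roughly (0.135, 7.39), while a normal(0, 0.01) prior concentrates s_Y within a few hundredths of zero.

```python
import math

# Stan's default: unconstrained parameters initialized uniform(-2, 2)
low, high = -2.0, 2.0

# a <lower=0> parameter like s_Y is exp() of its unconstrained value,
# so the default initial value lands somewhere in (exp(-2), exp(2))
init_low, init_high = math.exp(low), math.exp(high)
print(init_low, init_high)  # roughly 0.135 and 7.389

# against a normal(0, 0.01) prior, even the smallest default init
# sits more than 13 prior standard deviations from zero
prior_sd = 0.01
print(init_low / prior_sd)
```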


@Bob_Carpenter I sincerely thank you for your help in tackling this issue. It is extremely helpful! I apologize for the lack of clarity in my question and the intent of the model.

My model is called a Markov field model, where the formula

target += normal_lpdf(r[To]|r[From], s_r);

is the Markov field (system model), the formula

Y[n] ~ student_t(3, r[RID[n]]+beta1[RID[n]]*Age[n]+beta2[RID[n]]*Ses[n]+beta3[RID[n]]*Sex[n]+beta4[RID[n]]*Smoke[n]+beta_ind[SID[n]], s_Y);

is the observation model.

In a Markov field model or state-space model, the log posterior density of the entire model is determined by the balance between the contribution of the system model and the contribution of the observation model.
So I wanted to extract the log density of the system model and the log density of the observation model separately and check whether they are balanced.

If I understand correctly, the log posterior probability is the sum of the log prior and the log likelihood (up to a normalizing constant). I thought that log(1.0 / N_data) represented the log prior and that the _lpdf term represented the log likelihood; is this wrong?
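That identity holds up to the normalizing constant (the log evidence), which drops out in MCMC. A tiny conjugate-normal check (Python; the toy prior, likelihood, and observed value are my own illustration, not part of the model above):

```python
import math

def normal_lpdf(x, mu, sigma):
    # log density of Normal(mu, sigma) evaluated at x
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma) - 0.5 * math.log(2 * math.pi)

# toy model: mu ~ Normal(0, 1), y ~ Normal(mu, 1), observed y = 2.0
y = 2.0

def log_joint(mu):
    # log prior + log likelihood = log posterior up to a constant
    return normal_lpdf(mu, 0.0, 1.0) + normal_lpdf(y, mu, 1.0)

# exact conjugate posterior for this toy model: mu | y ~ Normal(y/2, 1/sqrt(2))
def log_post(mu):
    return normal_lpdf(mu, y / 2.0, 1.0 / math.sqrt(2.0))

# the two differ only by the constant log-evidence term, whatever mu is
c0 = log_joint(0.0) - log_post(0.0)
c1 = log_joint(1.5) - log_post(1.5)
print(abs(c0 - c1) < 1e-9)
```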

Also, I really appreciate your advice on parameter declarations. In fact, I am struggling to fit this model; it will not converge without the following prior on s_Y:

s_Y ~ normal(0, 0.01);

The value of 0.01 was determined through trial and error.
However, if there is a more principled and scientifically defensible way, I would like to adopt it. Can I solve this problem by specifying offset and multiplier for beta1, beta2, beta3, beta4, and s_Y?

If anything remains unclear, please ask; I would be grateful for any further questions.
Thank you in advance!