Student-t vs. normally distributed errors in a Bayesian linear regression model

All,

I am attempting to reproduce examples in Edward Greenberg’s book, “Bayesian Econometrics”, with Stan. In Chapter 8 of the book, Greenberg estimates a normal linear regression model of a child’s birth weight on a host of covariates: gestation length, parity, mother’s age, mother’s height, mother’s pregnancy weight, and whether or not the mother smokes. The coefficient on smoking is the estimate I would like to reproduce from Greenberg.

I have attached the data; the R code to clean it and put it into list format is below.

Clean and prep data, as in Greenberg:

library(dplyr)

babies_2 = read.csv("babiesII.csv") %>%
    filter( gestation != 999 & smoke != 9 ) %>%
    mutate( bwt_c = scale(bwt) ,
            age_c = scale(age) ,
            height_c = scale(height) ,
            weight_c = scale(weight) ,
            gestation_c = scale(gestation) )

Data to send to Stan in list format:

data_babies = with(babies_2 ,
list( N = nrow(babies_2) ,
bwt = as.vector(bwt_c) ,
gestation = as.vector(gestation_c) ,
parity = parity ,
age = as.vector(age_c) ,
height = as.vector(height_c) ,
weight = as.vector(weight_c) ,
smoke = smoke ) )

I have standardized (centered and scaled) the continuous variables to speed up estimation.
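As an aside, the same standardization could be done inside Stan itself rather than in R, e.g. in a transformed data block. This is just a sketch, assuming bwt were declared as vector[N] bwt in the data block:

```stan
transformed data{
    // standardize the outcome inside Stan instead of via scale() in R
    vector[N] bwt_std = (bwt - mean(bwt)) / sd(bwt);
}
```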

The code below effectively reproduces Greenberg’s results for a linear regression model with normally distributed errors:


data{
    int<lower=1> N;
    vector[N] bwt;
    vector[N] gestation;
    array[N] int<lower=0,upper=1> parity;
    vector[N] age;
    vector[N] height;
    vector[N] weight;
    array[N] int<lower=0,upper=1> smoke;
}

parameters{
    real bgest;
    real bpar;
    real bage;
    real bw;
    real bh;
    real bs;
    real b0;
    real<lower=0> sigma;
}

model{
    vector[N] mu;

    b0 ~ normal( 0 , 1.1 );
    bs ~ normal( -0.5 , 0.55 );
    bh ~ normal( 0 , 0.55 );
    bw ~ normal( 0 , 0.55 );
    bage ~ normal( 0 , 0.55 );
    bpar ~ normal( 0 , 0.55 );
    bgest ~ normal( 0 , 0.55 );
    sigma ~ inv_gamma( 6 , 100 );

    for ( i in 1:N ) {
        mu[i] = b0 + bgest * gestation[i] + bpar * parity[i] + bage * age[i]
                + bw * weight[i] + bh * height[i] + bs * smoke[i];
    }
    
    bwt ~ normal( mu , sigma );
}

The result for the “bs” parameter on mother’s smoking is roughly the same as Greenberg’s, although Greenberg gets a mean of -8.9 and I get roughly -8.4.

However, when I change the likelihood to a Student-t distribution with 5 degrees of freedom, as Greenberg suggests for re-estimating the model, my result for “bs” is similar to the normal model (roughly -8.3), whereas Greenberg reports a posterior mean of -4.656 for the coefficient on the mother’s smoking variable. Please also note that Greenberg uses the “MCMCpack” package to estimate his models.

I re-estimated with Stan code identical to that above, except that I changed the likelihood to a Student-t distribution with nu = 5, as in the following code:


model{
    vector[N] mu;

    b0 ~ normal( 0 , 1.1 );
    bs ~ normal( -0.5 , 0.55 );
    bh ~ normal( 0 , 0.55 );
    bw ~ normal( 0 , 0.55 );
    bage ~ normal( 0 , 0.55 );
    bpar ~ normal( 0 , 0.55 );
    bgest ~ normal( 0 , 0.55 );
    sigma ~ inv_gamma( 6 , 100 );

    for ( i in 1:N ) {
        mu[i] = b0 + bgest * gestation[i] + bpar * parity[i] + bage * age[i]
                + bw * weight[i] + bh * height[i] + bs * smoke[i];
    }
    
    bwt ~ student_t( 5 , mu , sigma );
}

Am I coding this model correctly?
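One robustness check worth trying (not from Greenberg, just a common Stan idiom) is to let the degrees of freedom be a parameter rather than fixing nu = 5, so the data can pull the tails toward or away from normality. The gamma(2, 0.1) prior here is a conventional weakly informative choice, not something from the book:

```stan
parameters{
    // ... same coefficients and sigma as above, plus:
    real<lower=1> nu;    // degrees of freedom of the Student-t
}

model{
    // ... same priors and mu as above ...
    nu ~ gamma( 2 , 0.1 );    // weakly informative; most mass on small-to-moderate nu
    bwt ~ student_t( nu , mu , sigma );
}
```

If the posterior for nu ends up large, the two likelihoods should give very similar coefficient estimates.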

babiesII.csv (27.6 KB)

Best,
Eddie

I think so. That inverse gamma prior can do strange things.

I tried re-estimating the above model with the following prior on sigma:


sigma ~ cauchy( 0 , 2 );

The results were nearly identical to those I got with the inverse gamma prior that Greenberg employed. I am still not sure why my results differ so much from Greenberg’s.
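If it would help to see which error distribution the data actually favor, one option (a sketch, not from Greenberg) is to save pointwise log-likelihoods in a generated quantities block and compare the normal and Student-t fits with the loo package. This assumes mu is moved from the model block into transformed parameters so it is visible here:

```stan
generated quantities{
    vector[N] log_lik;
    for ( i in 1:N ) {
        // pointwise log-likelihood for the Student-t model;
        // use normal_lpdf( bwt[i] | mu[i] , sigma ) in the normal-error model
        log_lik[i] = student_t_lpdf( bwt[i] | 5 , mu[i] , sigma );
    }
}
```

Comparing the resulting elpd values would show whether the heavy-tailed likelihood is actually doing anything for these data.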