Mathematical notations for modeling pre-post changes


First time posting, but I’ve been visiting this forum fairly regularly.

R has been my primary programming language, but I have been wanting to learn generative modeling using rethinking and Stan.

I was hoping someone could help me with the mathematical notation needed to estimate simple changes from the pre- to post-treatment time points.

For your reference, I am posting a sample script that I ran using R:

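library(nlme)  # provides lme() and corCAR1()

# Use `=` (not `<-`) inside data.frame() so the columns get the intended names
df <- data.frame(
   ID      = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6),  # subject ID
   GRP     = c(1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 2, 2),  # group
   BMI     = c(30.4, 29.9, 41.6, 38.5, 43.4, 43.5, 36.4, 34.3, 32.3, 31.6, 40.1, 37.5),
   BMI_BS  = c(30.4, 30.4, 41.6, 41.6, 43.4, 43.4, 36.4, 36.4, 32.3, 32.3, 40.1, 40.1),  # baseline
   TIMEPTS = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2))  # 1 = baseline; 2 = post-treatment

# Treat time point and group as categorical
df$TIMEPTS <- factor(df$TIMEPTS)
df$GRP <- factor(df$GRP)

# Random intercept per subject, with a continuous AR(1) correlation within subject
fit.1 <- lme(BMI ~ GRP:TIMEPTS + GRP + BMI_BS,
             correlation = corCAR1(form = ~ 1 | ID),
             random = list(ID = ~1),
             method = "REML", data = df)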

To translate the formula above into Stan, I tried to write it out in mathematical notation; however, I got stuck on how to write the interaction of two dichotomous categorical variables.


BMI_i ~ Normal(\mu_i, \sigma_e)
\mu_i = \beta_0 + \beta_1*GRP_i + b_1*GRP_i*TIMEPTS_i + b_2*BMIBS_i

\beta_0 ~ Normal(25, 45)
\beta_1 ~ Normal(1, 2)
b_1 ~ ??
b_2 ~ Normal(25, 45)
\sigma_e ~ HalfCauchy(0, 0.5)??

Once I have this notation settled, I can use rethinking’s ulam function to fit the model and compare the results.

Any guidance would be much appreciated.

Hi and welcome,

Sorry that I can’t give a more useful answer, but I’ll do my best, and with the question bumped up the list, maybe someone more familiar with R and the lme notation can be more helpful.

If I’m not mistaken, brms and rstanarm are designed to keep the syntax of popular R packages that perform the equivalent non-Bayesian inference; unfortunately, I am not a frequent user of either of them, or of R in general. If you haven’t tried them yet, that may be the way to get your analysis working.

More broadly, I’d say it’s good practice to write a proper mathematical description of the model (as an equation or, for linear models, a design matrix), so that it’s universally clear what your model is doing. If I am reviewing a paper and that description is missing, I will almost certainly ask for one. With it in hand, it should be straightforward to write the model in the Stan language. Again, unfortunately, I can’t quickly reverse-engineer the math from the lme code without looking at the function documentation, but maybe someone who can translate it instantly will help.


The interaction of two dichotomous variables is simply a dummy variable that is 1 when both of the component variables are 1, and 0 otherwise.
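As a minimal sketch of that idea in base R: the toy data frame `d` and the names `grp2`, `time2`, and `inter` are made up for illustration, and level 1 of each factor is assumed to be the reference category (R's default treatment coding). The check at the end compares the hand-built dummy with the interaction column that `model.matrix()` produces.

```r
# Toy data mirroring the GRP and TIMEPTS columns from the question (hypothetical)
d <- data.frame(
  GRP     = factor(c(1, 1, 2, 2)),
  TIMEPTS = factor(c(1, 2, 1, 2))
)

# 0/1 indicators, assuming level 1 is the reference category
grp2  <- as.integer(d$GRP == 2)
time2 <- as.integer(d$TIMEPTS == 2)

# Interaction dummy: 1 only when both indicators are 1
inter <- grp2 * time2

# R builds the same column when asked for the full interaction
X <- model.matrix(~ GRP * TIMEPTS, data = d)
stopifnot(all(X[, "GRP2:TIMEPTS2"] == inter))

print(cbind(grp2, time2, inter))
```

In the notation from the question, the interaction term would then be b_1 multiplied by this product of indicators. Note that the lme formula in the question uses GRP:TIMEPTS without a TIMEPTS main effect, so its design matrix may be coded differently from the full GRP * TIMEPTS version shown here; inspecting `model.matrix()` on the actual formula is probably the safest way to see which columns need a coefficient.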