Hi Stan Forum, It is my first time here, and I don’t write code in Stan, just code in R with brms, but I can write the code in “stan” chunks if that some how makes it easier for you.

I am wondering if you can just tell me how to reformat my data so I can do the analysis that the reviewer asked me to do–my friend thinks the reviewer suggests that I am supposed to be getting a coefficient for each question from the brms output and I am not yet getting that.

I attach my data environment here along with an abbreviated version of code I have developed for you to run.

To review, the reviewer suggested the following:"An alternative approach to investigate the effects on both general attitudes and specific components would be to use multilevel linear modeling with items nested within participants, including random effects of the term. In R lme4 notation, the formula for this model would look like: response ~ undocumented + (1 | participant) + (undocumented | item). Fixed effects would indicate the general trend, while random effects estimates would show how these effects differ depending on the item used. "

Model seems to match their description.The difficulty for me as always is getting my data into a format that allows me to do this sort of analysis. my friend earlier suggested the following:

First read in the attached mydata2.csv file into R as mydata2Then:1) Step one: recode each item-variable so that it goes in the same (e.g. pro immigrant direction) and has the same number of levels:#this is for the coding where don’t know is 0

```
mydata2$immfavor_proimm<-revalue(mydata2$immfavor, c("-2"="5", "-1"="4", "0"="3", "1"="2", "2"="1"))
mydata2$immrights_proimm<-revalue(mydata2$immrights, c("-2"="5", "-1"="4", "0"="3", "1"="2", "2"="1"))
mydata2$immlang_proimm<-revalue(mydata2$immlang, c("-2"="5", "-1"="4", "0"="3", "1"="2", "2"="1"))
mydata2$immcrime_proimm<-revalue(mydata2$immcrime, c("-2"="5", "-1"="4", "0"="3", "1"="2", "2"="1"))
mydata2$immjob_proimm<-revalue(mydata2$immjob, c("-2"="5", "-1"="4", "0"="3", "1"="2", "2"="1"))
mydata2$immunit_proimm<-revalue(mydata2$immunit, c("2"="5", "1"="4", "0"="3", "-1"="2", "-2"="1"))
mydata2$letin_proimm<-revalue(mydata2$letin, c("2"="5", "1"="4", "0"="3", "-1"="2", "-2"="1"))
mydata2$immideas_proimm<-revalue(mydata2$immideas, c("2"="5", "1"="4", "0"="3", "-1"="2", "-2"="1"))
mydata2$immgrowth_proimm<-revalue(mydata2$immgrowth, c("2"="5", "1"="4", "0"="3", "-1"="2", "-2"="1"))
```

- Step two: make sure all the variables are ordered in the right direction:mydata2$letin_proimm<-ordered(mydata2$letin_proimm, levels = c(“1”, “2”, “3”, “4”, “5”))

```
mydata2$immrights_proimm<-ordered(mydata2$immrights_proimm, levels = c("1", "2", "3", "4", "5"))
mydata2$immfavor_proimm<-ordered(mydata2$immrights_proimm, levels = c("1", "2", "3", "4", "5"))
mydata2$immlang_proimm<-ordered(mydata2$immlang_proimm, levels = c("1", "2", "3", "4", "5"))
mydata2$immcrime_proimm<-ordered(mydata2$immcrime_proimm, levels = c("1", "2", "3", "4", "5"))
mydata2$immideas_proimm<-ordered(mydata2$immideas_proimm, levels = c("1", "2", "3", "4", "5"))
mydata2$immjob_proimm<-ordered(mydata2$immjob_proimm, levels = c("1", "2", "3", "4", "5"))
mydata2$immgrowth_proimm<-ordered(mydata2$immgrowth_proimm, levels = c("1", "2", "3", "4", "5"))
mydata2$immunit_proimm<-ordered(mydata2$immunit_proimm, levels = c("1", "2", "3", "4", "5"))
```

- step three: restructure the data into long format where each row is a response to an individual question/item for a particular participant (with individual questions nested within each participant):

```
library(tidyr)
responses.df <- mydata2%>%
select(turkID = turkID,
undocumented,
letin_proimm,
immrights_proimm,
immfavor_proimm,
immlang_proimm,
immcrime_proimm,
immideas_proimm,
immjob_proimm,
immgrowth_proimm,
immunit_proimm, female, hispanic, white, age, northeast, south, midwest, lnincome, educ, born_usa, democrat, republican, media, polview, auth, ethnocentrism)%>%
distinct(turkID, .keep_all = TRUE)%>%#Headsup don't forget the full inclusion criteria in the writeup.
pivot_longer(c(letin_proimm, immrights_proimm, immfavor_proimm, immlang_proimm, immcrime_proimm, immideas_proimm, immjob_proimm, immgrowth_proimm, immunit_proimm),
names_to = "question")%>%
mutate(ppnt_response = factor(value,
ordered = TRUE,
levels = c(1,2,3,4,5)) )#probably this is the default anyway but nice to be explicit.
responses.df$value<-as.factor(responses.df$value)
```

- run the model with the following code–I believe this is correct if participant would be MTurkID and “item” is supposed to be question–but the output provided only gives the output alone which contains only one coefficient “undocumented” which I believe its composite association on all variables:

```
fit13<- brm(ppnt_response ~
undocumented + (1 | turkID) + (undocumented | question),
data = responses.df, init="0",
family = cumulative("probit"), chains = 10, cores = 2,
control = list(adapt_delta = 0.99))
```

output: Family: cumulative

Links: mu = probit; disc = identity

Formula: value ~ undocumented + undocumented + (1 | turkID) + (undocumented | question)

Data: responses.df (Number of observations: 3582)

Draws: 1 chains, each with iter = 2000; warmup = 1000; thin = 1;

total post-warmup draws = 1000

Group-Level Effects:

~question (Number of levels: 9)

Estimate Est.Error l-95% CI u-95% CI Rhat

sd(Intercept) 0.38 0.13 0.21 0.74 1.00

sd(undocumented) 0.18 0.08 0.04 0.39 1.00

cor(Intercept,undocumented) -0.14 0.38 -0.77 0.65 1.00

Bulk_ESS Tail_ESS

sd(Intercept) 351 696

sd(undocumented) 290 414

cor(Intercept,undocumented) 1126 726

~turkID (Number of levels: 398)

Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS

sd(Intercept) 0.68 0.03 0.62 0.74 1.00 456 558

Population-Level Effects:

Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS

Intercept[1] -1.14 0.15 -1.45 -0.83 1.00 404 514

Intercept[2] -0.42 0.15 -0.73 -0.12 1.00 400 469

Intercept[3] -0.12 0.15 -0.43 0.18 1.00 400 435

Intercept[4] 0.83 0.15 0.51 1.13 1.00 411 503

undocumented -0.08 0.10 -0.29 0.12 1.00 581 703

Family Specific Parameters:

Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS

disc 1.00 0.00 1.00 1.00 NA NA NA

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS

and Tail_ESS are effective sample size measures, and Rhat is the potential

scale reduction factor on split chains (at convergence, Rhat = 1).

Is this not what the reviewer asked me to do? I thought that what question here is the equivalent of item and ppnt_response is the same as what hte reviewer called response and TurkID is the same as participant, but a Bayesian colleague seems to suggest i should have a different coefficient estimate for every item/question in my model and that is not what I see in my output. What I see from the output is not an estimate for each of the 9 items/questions but only a general estimate for the undocumented variable.If that is what I should expect from this model and everything is fine, then I could move ahead with providing you results and my results based on this minimal output.

if my code is not correct and I need to create other variables from the data by coding to what the reviewer suggests, I would like to know how I do that.

Thank you,

Jacob

datax.csv (73.8 KB)

Reproducible example of data prepration and analysis for BRMS.R (3.2 KB)