# Grouped latent factor and polynomial predictors

Hi brms community! I'm not sure if this is the right forum for a question like this, so if not please let me know. I've been stuck for a while and my brain is taxed! Any help appreciated.

I have a dataset of measurements from pairs of conversation partners.

• One of the measures is, “How much did you like the person you talked to?” (`i_like_you`, a 1-7 scale)
• Another measure is “How much do you think they liked you?” (`i_think_you_like_me`, also 1-7)
• There's a term, `you_actually_like_me`, which in the real data is simply your partner's `i_like_you`
• Finally, we have a measure of how much each person enjoyed the conversation (`enjoyment`)

I’ve made a toy dataset in which every `speaker_id` has two conversations, each with a different partner.

```r
library(tidyverse)

set.seed(1L)
n <- 40

data <- tibble(
  convo_id = 1:n,
  speaker_id = rep(LETTERS[1:20], 2),
  partner_id = c(LETTERS[11:20], LETTERS[1:10], LETTERS[20:11], LETTERS[10:1]),
  i_like_you = sample(1:7, n, replace = TRUE),
  i_think_you_like_me = sample(1:7, n, replace = TRUE),
  you_actually_like_me = sample(1:7, n, replace = TRUE),
  enjoyment = sample(1:7, n, replace = TRUE)
) |>
  arrange(speaker_id)
```

The data look like this:

```r
data |> head()
# A tibble: 6 × 7
  convo_id speaker_id partner_id i_like_you i_think_you_like_me you_actually_like_me enjoyment
     <int> <chr>      <chr>           <int>               <int>                <int>     <int>
1        1 A          K                   4                   5                    4         2
2       21 A          T                   7                   3                    3         7
3        2 B          L                   7                   7                    1         6
4       22 B          S                   7                   3                    7         4
5        3 C          M                   5                   2                    7         5
6       23 C          R                   6                   1                    1         6
```

*Note: this toy example is unrealistic in one respect: `i_like_you` and `you_actually_like_me` are not mirrored between conversation partners. But for the sake of the modeling question I don't think this matters.*

My goal is to model the impact of the gap between `i_think_you_like_me` and `you_actually_like_me` (let's call this `liking_gap`) on `enjoyment`.

The challenge I’m running into is that I want to take into account the fact that each person has some latent overall bias towards liking people in general (let’s call this `liking_bias`), as well as some latent bias related to how much they think people like them in general (`perception_bias`).

In other words, for a given conversation, I want to estimate the degree to which `liking_gap` predicts `enjoyment`, above and beyond the degree to which the latent `liking_bias` and `perception_bias` influence `enjoyment`.

To complicate things further, it’s likely that the relationship between these predictors and `enjoyment` is curvilinear - there’s some kind of sweet spot (probably quadratic) that corresponds with maximum enjoyment.

I think I should be able to model this as a Bayesian hierarchical model grouped by individual: each individual gets parameters for `liking_bias` and `perception_bias`, and the model includes linear and quadratic terms for `i_think_you_like_me` and `you_actually_like_me`, plus their interaction (which I think would basically capture `liking_gap` without resorting to computing a difference score).
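To make that concrete, here is a minimal sketch of the kind of formula I have in mind (assumptions: `poly()` for the linear + quadratic terms, random intercepts and slopes by speaker standing in for the per-person biases, and `form` is just a name I made up; the `brm()` call is shown but not run):

```r
# Sketch only: quadratic main effects and their interaction, with
# speaker-level random effects absorbing each person's latent biases.
form <- enjoyment ~ poly(i_think_you_like_me, 2) * poly(you_actually_like_me, 2) +
  (1 + i_think_you_like_me + you_actually_like_me | speaker_id)

# With brms this might be fit as something like (not run):
# fit <- brm(form, data = data, family = cumulative("probit"))
```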

I started following Scott Claessen’s tutorial on how to model latent variables with BRMS, but I got confused trying to figure out how to translate his syntax, which specifies latent variables with `mi()` and multiple columns, into my structure, which is thinking of multiple observations from the same column as being the emission states of the latent variable.

I think a problem like this would usually be solved with SEM, but after having a look at `blavaan` and reading through this `brms` Github thread on the gradual development of SEM-style support, I got a little overwhelmed and so thought I’d just ask for help.

Thank you again for any help anyone can offer to this `brms` newbie!

Update: I can't find a way to delete this post, but after thinking about it some more I revised my question, corrected the way I was generating example data, and posted it on Cross Validated instead: "How to model the true effect of a difference score, minus rater bias". If you have any thoughts, please consider sharing there!

Have any exposure to Item Response Theory? This seems pretty straightforward in that context; it can indeed be expressed as an SEM, but I don't think that's necessary in order to think in that framework. Pseudo-code:

```
// r = rater, p = partner
latent_I_like_partner[r, p] = likability[p] + like_bias[r]
I_like_my_partner[r, p] ~ binomial(latent_I_like_partner[r, p])

latent_partner_actually_likes_me[r, p] = likability[r] + like_bias[p]
partner_actually_likes_me[r, p] ~ binomial(latent_partner_actually_likes_me[r, p])

latent_think_partner_likes_me[r, p] = likability[r] + like_bias[p] + think_like_bias[r]
think_partner_likes_me[r, p] ~ binomial(latent_think_partner_likes_me[r, p])

enjoy[r, p] ~ ordinal(
  latent_I_like_partner[r, p] * w1[r]
  + latent_think_partner_likes_me[r, p] * w2[r]
  + ...  // interactions go here; recommend building up from simpler models to larger
  + enjoy_bias[r],
  cut_points[r]
)
```

where the parameters `w1, w2, …` reflect the weight of a given latent in the enjoyment outcome. You should probably use partial pooling for each of those weight vectors.

Note that in my formulation above, you would not include the difference as you express it, because I don't think it makes sense to include information that is explicitly outside the respondent's awareness (whether the partner actually likes them). It does make sense to model the partner's actual liking as an outcome, since it informs other latent parameters that are pertinent to the enjoy equation.
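For what it's worth, one hedged sketch of how the measurement part might translate into brms random-effects syntax: crossed random effects stand in for the latent person parameters (the formula names are mine, and this is untested on real data):

```r
# Sketch only: crossed random effects as stand-ins for the latent parameters.
# On i_like_you, (1 | partner_id) plays the role of likability[p] and
# (1 | speaker_id) the rater's like_bias[r]; the roles flip for the
# "think partner likes me" item, which is about the rater's own likability.
f_like  <- i_like_you          ~ (1 | partner_id) + (1 | speaker_id)
f_think <- i_think_you_like_me ~ (1 | speaker_id) + (1 | partner_id)

# In brms each would become an ordinal submodel, something like (not run):
# fit <- brm(
#   bf(f_like, family = cumulative("probit")) +
#     bf(f_think, family = cumulative("probit")),
#   data = data
# )
```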

Thank you Mike! I hadn’t thought about IRT here. Will give it a shot, much appreciated.

Oh, note that I hadn't seen your data snippet before writing my pseudo-code, so I missed that all the items are ordinal. Replace all instances of binomial with ordinal, and add cut points for each respondent (and different cut points for different items).
