Modeling experimental privacy valuation data with brms

Dear community/all,

A colleague and I are currently working on our dissertation in the area of privacy research. In a 2x2x2 full factorial online experiment, we set up an e-commerce shop and manipulated the degree of convenience (high/low), personalization (high/low), and data sensitivity (high/low). As the dependent variable, we measured privacy valuation in € (via a bidding mechanism), specifically the amount of money participants would demand to disclose the requested data. The value was capped at 20€ since earlier focus groups indicated that this is a sensitive amount for disclosing some data types. Therefore the dependent variable ranges from 0 to 20 and is truncated. Since the participants had to state their privacy valuation themselves, the data distribution also looks a little odd (heaping on values like 5, 10, and 15, and a high concentration at 20).


We want to use the brms package to fit a model that accounts for the three manipulations (degree of convenience, personalization, and data sensitivity) and the group assignment in the experiment. The manipulated variables are handled as index variables: 2 was assigned to the high condition and 1 to the low condition. We have already tried several models. The latest was:

 Modelnew <- brm(WTA | trunc(ub = 20) ~ Conven_index * Personaliz_index * Sensit_index + (1 | group),
                 data = PriValuation,
                 family = hurdle_lognormal(),
                 control = list(adapt_delta = 0.99, max_treedepth = 13),
                 iter = 3000, warmup = 1000, chains = 2, cores = 4)

The fit was rather poor. Since we are new to Stan/brms, we were wondering how to improve our modeling and were hoping for some good pointers here. If you need more info on the data, please contact us.

Kind regards,
Maik & Jan

I suspect that you’ll need to model the rounding process that your participants are performing to convert their ‘true’ privacy valuation into the monetary amount that they report. Unfortunately, I don’t think rounding like this is possible in brms, so you’re probably going to have to implement it in pure Stan.

The Stan User’s Guide has a section on rounding that explores two approaches: marginalization and latent variables. The toy example, however, applies the same rounding rule to all observations. Your case appears to be trickier because the response distribution suggests that participants may be using different rounding rules, both for direction (ceiling/floor/nearest) and precision of the reported value (e.g., 0.01, 0.10, 1.00, or 10.00?).
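
For reference, here is roughly what the marginalization approach looks like in the simplest case, where everyone rounds to the nearest whole euro. The notation is mine, not taken from your model: F is the CDF of the latent valuation under whatever distribution you pick, and θ collects its parameters. I am also assuming that a report of 20 means "19.5 or more", i.e. that the cap acts as right-censoring, which you would need to check against how the bidding mechanism actually worked.

```latex
\Pr(y_n = k \mid \theta) = F(k + 0.5 \mid \theta) - F(k - 0.5 \mid \theta), \qquad k < 20,
\qquad
\Pr(y_n = 20 \mid \theta) = 1 - F(19.5 \mid \theta)
```

A different rounding rule only changes the interval endpoints. Flooring to the euro would give F(k + 1) - F(k), and rounding to the nearest 5 would give F(k + 2.5) - F(k - 2.5) for k a multiple of 5. If participants mix rules, one option would be a weighted sum of such terms with the weights as extra parameters, but I have not tried that on data like yours.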

Hopefully others with domain expertise on pricing decisions can weigh in on the best strategy for these data. If not, the usual advice applies: start with as simple a model as possible, then slowly build it up to include more nuance/mechanism.
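
To make "start simple" concrete, here is a minimal base-R sketch that you could run before writing any Stan. It simulates valuations, rounds them to the nearest euro, caps them at 20, and then recovers the parameters by treating each report as an interval. Everything in it is an assumption for illustration: the data are simulated rather than yours, the lognormal shape and its parameter values are made up, and the names `latent`, `wta`, and `neg_loglik` are hypothetical. It deliberately ignores the zeros (your hurdle part), the three manipulations, the group effect, and mixed rounding rules.

```r
# Simulated stand-in data; all values and names here are illustrative assumptions.
set.seed(1)
n      <- 500
latent <- rlnorm(n, meanlog = 2, sdlog = 0.8)  # unobserved "true" valuation in EUR
wta    <- pmin(round(latent), 20)              # reported: nearest euro, capped at 20

# Negative log-likelihood: a report y stands for the interval [y - 0.5, y + 0.5);
# a report at the cap stands for "19.5 or more".
neg_loglik <- function(par, y, cap = 20) {
  mu    <- par[1]
  sigma <- exp(par[2])                         # optimise log(sigma) so sigma stays positive
  lower <- pmax(y - 0.5, 0)
  upper <- ifelse(y >= cap, Inf, y + 0.5)
  p <- plnorm(upper, mu, sigma) - plnorm(lower, mu, sigma)
  -sum(log(pmax(p, 1e-300)))                   # guard against log(0)
}

fit <- optim(c(1, 0), neg_loglik, y = wta)
est <- c(meanlog = fit$par[1], sdlog = exp(fit$par[2]))
print(round(est, 2))
```

If the estimates land close to the values used in the simulation, the interval likelihood is doing its job, and you can add one piece at a time (predictors, the zero hurdle, a second rounding rule), re-checking on simulated data after each step before moving to the real data.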
