How to simulate confidence that was measured by participants' self-reports?

Hi everyone, I have a question. In my experiment, we collected participants' trial-by-trial confidence through subjective reports and included it in the model as a variable affecting option value. However, during simulation I ran into a problem: it seems that without specific assumptions I can't simulate this subjectively reported data at all. Does this mean that I can only simulate new choice data by feeding in the original confidence values, and that I can't simulate the confidence data itself?


Hi, @Gideon, and welcome to the Stan forums.

The more general point is that you can never generate data without specific assumptions. This often rears its head in modeling when there are covariates that aren't themselves modeled. For instance, when we set up a linear regression of y on x, we often don't directly model the x values (because the parameter estimates are conditionally independent of those values, we don't need to model them to estimate the regression). As a result, we can't generate new x values.

You can only generate new y data for fixed x values. To generate new x data, you need some statistical model of the covariates x on which to base the simulations. Here, it's not unusual to adopt a simple GLM for the covariate data you want to simulate. For example, if it's univariate, model the mean and variance and use those to simulate new instances; if it's multivariate, use a multivariate normal. If it's constrained to be positive, model it on the log scale with a normal (or multivariate normal) and transform back with an exponential; if it's constrained to be a probability, do the same on the logit scale and transform back with the inverse logit.
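To make that concrete, here is a minimal Python sketch (not from the original post). It assumes the confidence reports lie strictly in (0, 1) and uses an illustrative logit-normal model for them; with simulated confidence in hand, you could then simulate choices from your value model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the observed trial-by-trial confidence reports in (0, 1);
# in practice, replace this with your participants' data.
conf_obs = rng.beta(5.0, 2.0, size=200)

# Fit a simple descriptive model on the unconstrained scale:
# logit(confidence) ~ Normal(mu, sigma).
z = np.log(conf_obs / (1.0 - conf_obs))
mu, sigma = z.mean(), z.std(ddof=1)

# Simulate brand-new confidence data from that fitted model,
# then map it back to (0, 1) with the inverse logit.
z_sim = rng.normal(mu, sigma, size=conf_obs.size)
conf_sim = 1.0 / (1.0 + np.exp(-z_sim))

print(conf_sim[:5])
```

If your confidence reports can hit the boundaries exactly (0 or 1), or come from a discrete Likert-style scale, a beta or ordinal model would be a more natural choice than the logit-normal used above.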


Thank you for your reply! I am a graduate student who is just starting out in computational cognitive neuroscience. Could you recommend any resources for learning about modeling systematically and in depth?


Just to quickly add to Bob’s good answer - there are a couple of common options for picking values of unmodelled covariates (like confidence in your example) when those covariates affect the inferences you are making (e.g. when you model an interaction between the unmodelled covariate and the effect of interest):

  1. Pick the average/median value in your data and report inferences for a hypothetical “average” subject.
  2. Pick a theoretically appealing value (like high confidence).

For both cases, it is often useful to actually choose multiple values, e.g. some “low”, “mid” and “high” values, and show how your inferences differ across those (a small code sketch of this idea follows the example below).

Here’s an example I wrote recently (mildly edited for clarity):

Under standard conditions, the median success rate was 66%. For a subject with this success rate, the model estimates a 95% CI for the success rate under the treatment condition of (57.07%, 58.76%), representing an approximately 8-percentage-point decrease, or a ratio of 0.88. Contrast that with a subject in the lower decile of success rate (53%), where the expected success rate under treatment is 49%, i.e. a decrease of roughly 4 percentage points (ratio = 0.93). For a subject in the upper decile of success rate (78%), the expected success rate under treatment is 67%, i.e. a decrease of roughly 11 percentage points (ratio = 0.86).
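For what it's worth, here is a small Python sketch of that reporting pattern. The "posterior draws" and the logistic model with a confidence-by-value interaction are invented purely for illustration; in a real analysis the draws would come from your fitted model:

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented "posterior draws" for a hypothetical choice model in which
# confidence interacts with the value difference between options.
beta_value = rng.normal(1.0, 0.2, size=4000)         # effect of value difference
beta_interaction = rng.normal(0.5, 0.2, size=4000)   # confidence x value interaction

# Stand-in for the observed confidence reports.
conf_obs = rng.beta(5.0, 2.0, size=300)

# "Low", "mid", and "high" covariate values taken from the data.
conf_levels = np.quantile(conf_obs, [0.1, 0.5, 0.9])

value_diff = 0.5  # a fixed value difference of interest
for conf in conf_levels:
    # Choice probability implied by each posterior draw at this confidence level.
    p = 1.0 / (1.0 + np.exp(-(beta_value + beta_interaction * conf) * value_diff))
    lo, hi = np.percentile(p, [2.5, 97.5])
    print(f"confidence = {conf:.2f}: P(choose higher-valued option) "
          f"95% CI = ({lo:.3f}, {hi:.3f})")
```

The same loop is where you would compute percentage-point differences or ratios between conditions, as in the quoted example above.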

If you are into a really detailed (and thus lengthy) treatment, Mike Betancourt’s case study is IMO a good start: (Co)variations On A Theme

Hope that helps at least a little bit

The place we usually recommend people start is Richard McElreath’s book, Statistical Rethinking. It’s less math-heavy and a bit more up to date than Gelman et al.’s Bayesian Data Analysis. Personally, I got a lot out of Gelman and Hill’s original multilevel regression book, though I haven’t read their newer book, Regression and Other Stories.

We like our general Workflow paper, which is on arXiv. Gelman and Vehtari are working hard right now to try to turn that into a book.

We have a lot of case studies around Stan and the User’s Guide has Stan-based introductions to a lot of different modeling techniques.

Here’s a link to similar info (without the books) on the web site:

This is very helpful. Thank you for your advice!

Thanks for your suggestions!