Robustness (stability) in the Bayesian sense

I have implemented two models for signal detection theory; namely, I defined two likelihoods f_1(y|\theta) and f_2(y|\theta), where y denotes the data and \theta a model parameter.

The data y consists of six non-negative random variables, i.e., y = (H_1, H_2, H_3, F_1, F_2, F_3), which can be displayed in the following tables.

Consider two datasets y_1 and y_2, one of which is obtained as a perturbation of the other.

Data y_1

| Confidence Level | Number of True Positives | Number of False Positives |
|---|---|---|
| 3 = definitely present | H_{3}=97 | F_{3}=0 |
| 2 = equivocal | H_{2}=32 | F_{2}=\color{red}{0} |
| 1 = questionable | H_{1}=31 | F_{1}=74 |

Data y_2

| Confidence Level | Number of True Positives | Number of False Positives |
|---|---|---|
| 3 = definitely present | H_{3}=97 | F_{3}=0 |
| 2 = equivocal | H_{2}=32 | F_{2}=\color{red}{1} |
| 1 = questionable | H_{1}=31 | F_{1}=74 |

Because these two datasets are similar, I expect the posterior estimates to be similar as well. However, in one model, f_1(y|\theta), the posterior distribution is unstable and the estimates (such as posterior means and posterior variances) change dramatically, as shown below.

Estimates for data y_1

| Param name | mean | se_mean | sd |
|---|---|---|---|
| z[1] | -5.924416e-01 | 1.185005e-02 | 0.12219574 |
| z[2] | 8.760078e+305 | NaN | Inf |
| z[3] | 1.860283e+306 | NaN | Inf |

Estimates for data y_2

| Param name | mean | se_mean | sd |
|---|---|---|---|
| z[1] | -6.197717e-01 | 6.744095e-03 | 1.213770e-01 |
| z[2] | 2.440500e+00 | 4.749197e-02 | 5.032560e-01 |
| z[3] | 6.120099e+00 | 1.652269e-01 | 1.383218e+00 |

So I would like to quantify this robustness.

My question is:

Is there a well-known way to demonstrate this kind of robustness, or any standard characterizations of robustness, or books on the topic?
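
To make the question concrete, one candidate I have in mind (this is only my own sketch, and the notation below is mine) is a divergence between the two posteriors, for example the Kullback–Leibler divergence:

$$
D(y_1, y_2) = \mathrm{KL}\bigl(\pi(\theta \mid y_1)\,\|\,\pi(\theta \mid y_2)\bigr) = \int \pi(\theta \mid y_1)\,\log\frac{\pi(\theta \mid y_1)}{\pi(\theta \mid y_2)}\,d\theta,
$$

where \pi(\theta \mid y) denotes the posterior density. A robust model should keep D(y_1, y_2) small whenever y_2 is a small perturbation of y_1, but I do not know whether this (or something like it) is a standard approach.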

The code is publicly available; if necessary, I can post it.

To tell the truth, my new model is robust, and I would like to demonstrate this robustness as a benefit of the new model.

I also have another question about the Bayesian analysis.
In the fit above, some posterior chains are constant, so R-hat is NaN; however, when I visualize the estimates that include the constant chains, they look reasonable, because the fitted curves are close to the data points. How, then, should we regard these constant chains?
So, another question is:

Are fitted estimates unreliable if they include constant chains?

Do you have the raw data? This would consist of many individual "trials", where on each trial you have the stimulus state (present vs. absent), the response (present vs. absent), and the confidence response (1, 2, or 3). If you do have this data, you might achieve more stable inference by modelling it as (in a brms/lm-style formula) response ~ stimulus*confidence with a probit or logit link function. (A paper demonstrating a non-Bayesian but hierarchical version of this is here.)

With that set up, and if stimulus has been coded with sum contrasts, the intercept reflects overall response bias, while the effect of stimulus reflects the discriminability (literally d’ if you use a probit link). Depending on what assumptions you feel you’d be warranted to make, you could either leave confidence as a numeric predictor or maybe an ordinal with latent distribution and cut points. In either case, the main effects of confidence will reflect how confidence influences response bias while the interactions with stimulus will reflect how confidence influences discriminability.
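
To spell out the probit case (assuming the usual equal-variance Gaussian model and stimulus coded as s = \pm 1/2; \beta_0 and \beta_1 here are just my labels for the intercept and the stimulus coefficient):

$$
\Phi^{-1}\bigl(P(\text{"present"} \mid s)\bigr) = \beta_0 + \beta_1 s
\quad\Rightarrow\quad
z(H) = \beta_0 + \tfrac{\beta_1}{2},\;\; z(F) = \beta_0 - \tfrac{\beta_1}{2},
$$

so d' = z(H) - z(F) = \beta_1 and the usual criterion measure c = -\tfrac{1}{2}\bigl(z(H) + z(F)\bigr) = -\beta_0.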

Frankly, now that I express it that way, I wonder if a more useful/sensible model of this data would be to not consider confidence as affecting discrimination performance (which seems backwards), but use the discrimination coefficient to predict confidence as an ordinal outcome. So (switching to Stan style expression):

```stan
response ~ bernoulli_logit(alpha * stimulus);
confidence ~ ordered_logistic(beta * alpha, cutpoints);
```
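
Spelled out as a complete Stan program, this might look roughly like the following (the data block, variable names, and priors are just one illustrative way to set it up, not a definitive implementation):

```stan
data {
  int<lower=1> N;                              // number of trials
  array[N] int<lower=0, upper=1> stimulus;     // 1 = signal present, 0 = absent
  array[N] int<lower=0, upper=1> response;     // 1 = "present" response
  array[N] int<lower=1, upper=3> confidence;   // confidence rating (1, 2, 3)
}
parameters {
  real alpha;            // discriminability: effect of stimulus on response
  real beta;             // how discriminability maps onto confidence
  ordered[2] cutpoints;  // two cutpoints for three ordinal confidence levels
}
model {
  // weakly informative priors, purely illustrative
  alpha ~ normal(0, 2);
  beta ~ normal(0, 2);
  cutpoints ~ normal(0, 5);
  for (n in 1:N) {
    response[n] ~ bernoulli_logit(alpha * stimulus[n]);
    confidence[n] ~ ordered_logistic(beta * alpha, cutpoints);
  }
}
```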

Actually, my data can be decomposed into per-trial data as follows. Each per-trial dataset is much sparser.

1st trial

| Confidence | Hit | False Alarm |
|---|---|---|
| 3 | 2 | 0 |
| 2 | 0 | 1 |
| 1 | 0 | 1 |

2nd trial

| Confidence | Hit | False Alarm |
|---|---|---|
| 3 | 0 | 0 |
| 2 | 1 | 0 |
| 1 | 0 | 1 |

…

and so forth.

The datasets shown in my previous post are obtained by summing the data over all trials. This summation can be regarded as an undesirable reduction of information, which is not reasonable, and avoiding this reduction is one of my biggest concerns. If I apply the scheme in the paper you suggested, the data should be decomposed into individual trials, and new heterogeneity should be introduced among the trials so that, e.g., the decision thresholds can depend on the trial. A rough sketch of what I mean is given below.
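
For concreteness, here is the kind of trial-level heterogeneity I have in mind, building on your Stan-style expression (the trial index, the per-trial shift, and the priors below are my own, purely illustrative choices):

```stan
data {
  int<lower=1> N;                            // number of observations (cases)
  int<lower=1> T;                            // number of trials
  array[N] int<lower=1, upper=T> trial;      // trial index of each observation
  array[N] int<lower=0, upper=1> stimulus;   // 1 = signal present, 0 = absent
  array[N] int<lower=0, upper=1> response;   // 1 = "present" response
  array[N] int<lower=1, upper=3> confidence; // confidence rating (1, 2, 3)
}
parameters {
  real alpha;
  real beta;
  ordered[2] cutpoints;          // population-level cutpoints
  vector[T] shift;               // per-trial shift of the latent confidence scale
  real<lower=0> sigma_shift;
}
model {
  alpha ~ normal(0, 2);
  beta ~ normal(0, 2);
  cutpoints ~ normal(0, 5);
  sigma_shift ~ normal(0, 1);
  shift ~ normal(0, sigma_shift);
  for (n in 1:N) {
    response[n] ~ bernoulli_logit(alpha * stimulus[n]);
    // shifting the latent scale per trial is equivalent to shifting both cutpoints
    confidence[n] ~ ordered_logistic(beta * alpha + shift[trial[n]], cutpoints);
  }
}
```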

Your suggested model looks like a generalized linear model, with bernoulli_logit and ordered_logistic as the link functions. Also, stimulus is an indicator variable for whether each trial includes a signal (stimulus) or not, alpha and beta denote model parameters (slope and intercept) of the generalized linear model, and cutpoints is also a model parameter. The ordered_logistic link function is quite new to me and very interesting. In your modelling, the confidence levels are treated as random variables from a distribution whose parameters include the cutpoints; this makes sense to me and is the part I find most interesting. I suppose confidence could also be treated as an explanatory variable, but that would be less interesting.

In my signal detection paradigm, non-linear models are traditionally used. So, if I use a linear model, I also have to develop visualization methods, such as an analytic expression for the curves fitted to the data as an alternative to the ROC curve, and so on.
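
For reference, in the traditional equal-variance Gaussian model the fitted ROC curve does have a closed form,

$$
\mathrm{TPF}(f) = \Phi\bigl(d' + \Phi^{-1}(f)\bigr), \qquad 0 < f < 1,
$$

where f is the false positive fraction; I would need an analogous expression (or at least a plotting recipe) for the linear-model version.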

At this time, I do not know whether this will be a model that converges well and for which I can define visualization methods such as an alternative ROC curve.