# I want to explain my new Bayesian model

So please allow me to post several times over the next few days.

My model belongs to signal detection theory, specifically the so-called FROC (free-response receiver operating characteristic) analysis.

I want to explain four things: (1) the trial from which (2) the data arise, (3) the model for these data, and (4) issues I cannot overcome, such as divergent transitions and the choice of suitable priors to get uniform rank statistics in SBC.

Please always provide a minimal working example with code. Also make sure to format your post’s code and math properly.

Thank you for the reply. I will show the Stan code later.

In this post, I explain how the following dataset arises and what it is.

| Confidence level | No. of hits | No. of false alarms |
|---|---|---|
| 3 = definitely present | $H_3$ | $F_3$ |
| 2 = probably present | $H_2$ | $F_2$ |
| 1 = questionable | $H_1$ | $F_1$ |

where $H_c, F_c \in \mathbb{N}$, $N_I$ is the number of images, and $N_L$ is the number of lesions.

Suppose there are a reader (a doctor or radiologist), a researcher (who holds the gold standard), and $N_I$ radiographs containing $N_L$ lesions in total.

Each radiograph contains shadows; a shadow may be a lesion (signal, target) or not a lesion (noise).
The researcher knows the true lesion locations in all images, but the reader does not.

The reader tries to find lesions in each radiograph. If the reader suspects a lesion in a radiograph, the reader marks the suspicious location and assigns it a confidence level, which is a number meaning

3 = “definitely present”,

2 = “probably present”,

1 = “questionable”.

For example, consider the following image. The two red circles mark the true lesion locations, the yellow triangles mark the locations the reader found suspicious, and the number inside each triangle is the reader’s confidence level. The yellow triangle labeled 3 is close to a true lesion (red circle), so it counts as a hit: the number of hits with confidence level 3 is one, $H_3=1$.

The yellow triangle labeled 2 is far from both true lesions, so it counts as a false alarm: the number of false alarms with confidence level 2 is one, $F_2=1$.

The two yellow triangles labeled 1 are also far from the true lesions, so they count as false alarms: the number of false alarms with confidence level 1 is two, $F_1=2$. Consequently, we obtain the following table for this single radiograph.

| Confidence level | No. of hits | No. of false alarms |
|---|---|---|
| 3 = definitely present | $H_3=1$ | $F_3=0$ |
| 2 = probably present | $H_2=0$ | $F_2=1$ |
| 1 = questionable | $H_1=0$ | $F_1=2$ |

Counting these hits and false alarms over all radiographs, we obtain, e.g., the following table.

| Confidence level | No. of hits | No. of false alarms |
|---|---|---|
| 3 = definitely present | $H_3=97$ | $F_3=1$ |
| 2 = probably present | $H_2=32$ | $F_2=14$ |
| 1 = questionable | $H_1=31$ | $F_1=74$ |
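The counting step can be sketched in a few lines of Python. This is only an illustration, not code from my package: the function name `tally_marks` and the `(confidence, is_hit)` representation of a mark are assumptions made for this sketch, and the hit/false-alarm decision is taken as already made by the researcher.

```python
from collections import Counter

def tally_marks(marks, levels=(3, 2, 1)):
    """Count hits and false alarms per confidence level.

    `marks` is a list of (confidence, is_hit) pairs, one per location the
    reader marked. `is_hit` is the researcher's judgement against the gold
    standard. Both names are hypothetical, chosen for this sketch.
    """
    hits = Counter(c for c, is_hit in marks if is_hit)
    false_alarms = Counter(c for c, is_hit in marks if not is_hit)
    # Counter returns 0 for levels that never occur.
    return {c: (hits[c], false_alarms[c]) for c in levels}

# The single radiograph from the example: one hit at level 3,
# one false alarm at level 2, two false alarms at level 1.
one_image = [(3, True), (2, False), (1, False), (1, False)]
table = tally_marks(one_image)
for level, (h, f) in table.items():
    print(f"confidence {level}: H = {h}, F = {f}")
```

To get the table for the whole trial, the marks of all radiographs would simply be concatenated into one list before tallying.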

Questions are welcome.

In the next post, I will describe the model for these data.

Hi, only one question: If I look at the figure I would say we should get $F_1=3$ for questionable, and for probably present I would set $F_2 = 0$, i.e., I don’t see why you put the triangle labeled 2 in probably present and not in questionable, when reading your description above. Could you explain that?

For $c=1,2,3$, the notation $F_c$ denotes the number of false alarms (false positives, FPs) with confidence level $c$. That is, the subscript $c$ of $F_c$ is the reader’s confidence level.

In the above example, $F_1$ is the number of FPs with confidence level 1 = questionable. The reader marked 2 (not 3) locations with confidence level 1, and both are FPs. Thus $F_1 = 2$ (not 3).

Similarly, $F_2$ is the number of FPs with confidence level 2 = probably present. The reader marked 1 (not 0) location with confidence level 2, and it is an FP. Thus $F_2 = 1$ (not 0).

In the above example $F_3=0$, because none of the reader’s marks with confidence level 3 is a false alarm. Generally speaking, a doctor rarely makes mistakes at a higher confidence level. So this kind of dataset tends to satisfy

$F_3 < F_2 < F_1$ (this need not hold exactly).

So lower confidence levels generate more false alarms (FPs).

Questions are welcome. In the next post, I will explain the model.

Is the reader providing a confidence rating of each detection themselves? Or is the confidence computed by the data analyst after the fact based on the detection point distance from a known lesion?

Yes, the rating is done by the reader.

For each image, the reader provides the suspicious locations together with a confidence level for each of them.

After the reader’s marking trial, the data analyst counts the number of true positives (hits) and the number of false positives (false alarms).

Questions are welcome.

Ah, so an image receives a single confidence rating from the reader even if there are multiple detections signaled by the reader?

No, a single image may receive multiple confidence ratings, because a rating is given for each location the reader marks.

For example, if the reader marks four locations in a single image, then the reader also gives four confidence ratings, one for each marked location.

Questions are welcome.

Ok, gotcha, and I see that now from the image.

Feel free to ask anything ;)

Seems like you’d want to model it as a multivariate outcome then, with response (hit/FA) as a binomial outcome and confidence as an ordered logistic or ordered probit outcome, and the two outcomes related by a latent multivariate normal. If there are conditions in the experiment, you’d then model their influence on the latent multivariate normal. But sounds like you already gave some thought to the model and are planning on posting your approach soon, so I’ll stay tuned.

In this post, I give a supplementary explanation of the data, namely how to visualize it.

In the previous posts, I explained the following table.

| Confidence level | No. of hits | No. of false alarms |
|---|---|---|
| 3 = definitely present | $H_3$ | $F_3$ |
| 2 = probably present | $H_2$ | $F_2$ |
| 1 = questionable | $H_1$ | $F_1$ |

where $H_c, F_c \in \mathbb{N}$, $N_I$ is the number of images, and $N_L$ is the number of lesions.

To visualize this dataset, we introduce the so-called False Positive Fraction (FPF) per image and True Positive Fraction (TPF) per lesion as follows:

$$
(\text{FPF}_1, \text{TPF}_1) = \biggl( \frac{F_3+F_2+F_1}{N_I}, \frac{H_3+H_2+H_1}{N_L} \biggr),
$$

$$
(\text{FPF}_2, \text{TPF}_2) = \biggl( \frac{F_3+F_2}{N_I}, \frac{H_3+H_2}{N_L} \biggr),
$$

$$
(\text{FPF}_3, \text{TPF}_3) = \biggl( \frac{F_3}{N_I}, \frac{H_3}{N_L} \biggr).
$$
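As a rough numerical illustration, the three points can be computed from the example counts ($H = 97, 32, 31$ and $F = 1, 14, 74$). The values $N_I = 57$ and $N_L = 259$ are borrowed from a later post in this thread, where they accompany a table with the same hit counts, so treating them as belonging to this table is an assumption. The function name `froc_points` is hypothetical.

```python
def froc_points(hits, false_alarms, n_images, n_lesions):
    """Return [(FPF_1, TPF_1), (FPF_2, TPF_2), (FPF_3, TPF_3)].

    `hits` and `false_alarms` are ordered by confidence level 1, 2, 3,
    so the point for level c sums the counts of levels c, c+1, ..., 3.
    """
    points = []
    for c in range(len(hits)):
        fpf = sum(false_alarms[c:]) / n_images  # false positives per image
        tpf = sum(hits[c:]) / n_lesions         # true positives per lesion
        points.append((fpf, tpf))
    return points

hits = [31, 32, 97]          # H_1, H_2, H_3
false_alarms = [74, 14, 1]   # F_1, F_2, F_3
points = froc_points(hits, false_alarms, n_images=57, n_lesions=259)
for c, (fpf, tpf) in enumerate(points, start=1):
    print(f"(FPF_{c}, TPF_{c}) = ({fpf:.3f}, {tpf:.3f})")
```

Note that FPF is a count per image and can exceed 1, while TPF is a fraction of lesions and stays in $[0, 1]$.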

These points can be plotted as follows:

In the figure, the curve is an estimated curve, which will be defined in a later post.
I built the model so that

1. the model defines such a curve (defined later) without changing the classical FROC theory,
2. the model can generate a dataset of the above form; that is, a generated dataset satisfies the constraint $H_1+H_2+H_3 \leq N_L$, where $H_c$ is the number of hits (true positives) with the $c$-th confidence level and $N_L$ is the number of lesions (signals, targets).

Visualization gives me the roughest but most intuitive check that the model fits the data.

In the next post, I will define the model.

Questions are welcome.

I have just one question. Is there a reason you do not use a standard classification approach, i.e., a confusion matrix, where you classify TP, FP, TN, and FN?

Yes, because TN and FN do not appear explicitly in my model. To focus on TP and FP only, I use more intuitive names, namely,

TP = hit and FP = false alarm,

which are also used on page 158 of the following book:

Lee, M. D., and Wagenmakers, E.-J., Bayesian Cognitive Modeling: A Practical Course.

In this post, I define a model whose parameter is denoted by $\theta = (\theta_1, \theta_2, \theta_3; \theta', \theta'')$, where $\theta_1, \theta_2, \theta_3 \in \mathbb{R}$, $\theta' \in \mathbb{R}^n$, and $\theta'' \in \mathbb{R}^m$ for some $n, m$.

Recall that our dataset has the following form.

| Confidence | Number of hits (TP) | Number of false alarms (FP) |
|---|---|---|
| 3 = definitely present | $H_3$ | $F_3$ |
| 2 = probably present | $H_2$ | $F_2$ |
| 1 = questionable | $H_1$ | $F_1$ |

where $H_c, F_c \in \mathbb{N}$, $N_I$ is the number of images, and $N_L$ is the number of lesions.

For these data, I defined the following model.

$$
\begin{aligned}
H_3 &\sim \text{Binomial}\bigl(p_3(\theta), N_L\bigr), \\
H_2 &\sim \text{Binomial}\Bigl(\frac{p_2(\theta)}{1-p_3(\theta)}, N_L-H_3\Bigr), \\
H_1 &\sim \text{Binomial}\Bigl(\frac{p_1(\theta)}{1-p_3(\theta)-p_2(\theta)}, N_L-H_3-H_2\Bigr), \\
F_3 &\sim \text{Poisson}\bigl(q_3(\theta)N_I\bigr), \\
F_2 &\sim \text{Poisson}\bigl(q_2(\theta)N_I\bigr), \\
F_1 &\sim \text{Poisson}\bigl(q_1(\theta)N_I\bigr),
\end{aligned}
$$

where

$$
p_c(\theta) := \int_{\theta_c}^{\theta_{c+1}} P(x|\theta')\,dx,
\qquad
q_c(\theta) := \int_{\theta_c}^{\theta_{c+1}} Q(x|\theta'')\,dx,
$$

with the convention $\theta_4 := +\infty$. Here $P(x|\theta')$ denotes a probability density function and $Q(x|\theta'')$ is a positive function.
For example, we use

$$
P(x|\theta') = \text{Gaussian}(x|\mu,\sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Bigl(-\frac{(x-\mu)^2}{2\sigma^2}\Bigr),
$$

$$
Q(x|\theta'') = Q(x) = \frac{d \log \Phi(x)}{dx},
$$

where $\Phi$ denotes the cumulative distribution function of the standard normal distribution and $\sigma$ is the standard deviation. In this case $\theta' = (\mu,\sigma)$ and $\theta''$ is empty. Hence the model parameter is $(\theta_1, \theta_2, \theta_3; \mu, \sigma)$.
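For this Gaussian example the integrals have closed forms, $p_c = \Phi\bigl((\theta_{c+1}-\mu)/\sigma\bigr) - \Phi\bigl((\theta_c-\mu)/\sigma\bigr)$ and $q_c = \log\Phi(\theta_{c+1}) - \log\Phi(\theta_c)$. The Python sketch below evaluates them. It assumes that $\sigma$ is a standard deviation and that the upper limit for $c = 3$ is $+\infty$. The function names and the parameter values are hypothetical, and this is not the Stan code of my package.

```python
import math

def std_normal_cdf(x):
    """Phi(x), the standard normal CDF, written with erfc."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def cell_rates(thresholds, mu, sigma):
    """Return (p, q) with p[c-1] = p_c and q[c-1] = q_c for c = 1, 2, 3.

    Assumes thresholds are increasing and that the last cell extends
    to +infinity (the theta_4 convention assumed in this sketch).
    """
    upper_limits = list(thresholds[1:]) + [math.inf]
    p, q = [], []
    for lo, hi in zip(thresholds, upper_limits):
        # Hit probability: Gaussian mass between the two thresholds.
        p.append(std_normal_cdf((hi - mu) / sigma) - std_normal_cdf((lo - mu) / sigma))
        # False alarm rate per image: integral of d log Phi(x) / dx.
        q.append(math.log(std_normal_cdf(hi)) - math.log(std_normal_cdf(lo)))
    return p, q

# Illustrative parameter values only.
p, q = cell_rates([0.5, 1.0, 1.5], mu=1.0, sigma=1.0)
print("p_1, p_2, p_3 =", [round(v, 4) for v in p])
print("q_1, q_2, q_3 =", [round(v, 4) for v in q])
```

The plain `log(Phi(x))` is fine for moderate thresholds, but it fails when $\Phi(x)$ underflows to zero. In Stan one would more likely use a log-CDF function for the same quantity.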

The reason I defined the model as above is to obtain an alternative notion of the ROC curve (the so-called FROC curve) without changing the classical FROC theory. I think the motivation for these definitions will become clear from the derivation of the FROC curve in the next post.

Questions are welcome.

In this post, I visualize the model.

## Recall that our data is given as follows.

| Confidence | Hits | False alarms |
|---|---|---|
| 3 = definitely present | $H_3$ | $F_3$ |
| 2 = probably present | $H_2$ | $F_2$ |
| 1 = questionable | $H_1$ | $F_1$ |

where $H_c, F_c \in \mathbb{N}$, $N_I$ is the number of images, and $N_L$ is the number of lesions.

## To visualize this dataset, we introduced the so-called False Positive Fraction (FPF) per image and True Positive Fraction (TPF) per lesion as follows:

$$
(\text{FPF}_1, \text{TPF}_1) = \biggl( \frac{F_3+F_2+F_1}{N_I}, \frac{H_3+H_2+H_1}{N_L} \biggr),
$$

$$
(\text{FPF}_2, \text{TPF}_2) = \biggl( \frac{F_3+F_2}{N_I}, \frac{H_3+H_2}{N_L} \biggr),
$$

$$
(\text{FPF}_3, \text{TPF}_3) = \biggl( \frac{F_3}{N_I}, \frac{H_3}{N_L} \biggr).
$$

We plot these three points for visualization of data.

## For these data, I defined the following model.

$$
\begin{aligned}
H_3 &\sim \text{Binomial}\bigl(p_3(\theta), N_L\bigr), \\
H_2 &\sim \text{Binomial}\Bigl(\frac{p_2(\theta)}{1-p_3(\theta)}, N_L-H_3\Bigr), \\
H_1 &\sim \text{Binomial}\Bigl(\frac{p_1(\theta)}{1-p_3(\theta)-p_2(\theta)}, N_L-H_3-H_2\Bigr), \\
F_3 &\sim \text{Poisson}\bigl(q_3(\theta)N_I\bigr), \\
F_2 &\sim \text{Poisson}\bigl(q_2(\theta)N_I\bigr), \\
F_1 &\sim \text{Poisson}\bigl(q_1(\theta)N_I\bigr),
\end{aligned}
$$

where

$$
p_c(\theta) := \int_{\theta_c}^{\theta_{c+1}} P(x|\theta')\,dx,
\qquad
q_c(\theta) := \int_{\theta_c}^{\theta_{c+1}} Q(x|\theta'')\,dx,
$$

with the convention $\theta_4 := +\infty$. Here $P(x|\theta')$ denotes a probability density function and $Q(x|\theta'')$ is a positive function.
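To see how the sequential binomials keep a generated dataset within $H_1+H_2+H_3 \leq N_L$, here is a small simulation sketch in Python. The samplers are hand-written so that only the standard library is needed. The rates `p` and `q` are made-up illustrative values, not estimates, and all function names are hypothetical.

```python
import math
import random

def draw_binomial(rng, n, prob):
    """Binomial draw as a sum of n Bernoulli trials (fine for small n)."""
    return sum(rng.random() < prob for _ in range(n))

def draw_poisson(rng, rate):
    """Poisson draw by Knuth's multiplication method (moderate rates only)."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_dataset(p, q, n_images, n_lesions, rng):
    """Draw (hits, false_alarms), both ordered by confidence 1, 2, 3."""
    hits = [0] * len(p)
    remaining_lesions = n_lesions
    remaining_prob = 1.0
    # Start from the highest confidence level, as in the model:
    # each binomial only uses the lesions not yet hit.
    for c in reversed(range(len(p))):
        hits[c] = draw_binomial(rng, remaining_lesions, p[c] / remaining_prob)
        remaining_lesions -= hits[c]
        remaining_prob -= p[c]
    false_alarms = [draw_poisson(rng, rate * n_images) for rate in q]
    return hits, false_alarms

rng = random.Random(1)
p = [0.12, 0.12, 0.37]   # p_1, p_2, p_3 (illustrative)
q = [1.30, 0.25, 0.02]   # q_1, q_2, q_3 (illustrative)
hits, false_alarms = simulate_dataset(p, q, n_images=57, n_lesions=259, rng=rng)
print("hits:", hits, "false alarms:", false_alarms)
```

Because each binomial is drawn from the lesions that remain, the total number of hits can never exceed `n_lesions`, whatever the parameter values are.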

## Visualization of Model

Because the model gives the probability law of the random variables $H_c, F_c$, $c=1,2,3$, we can calculate their expectations as follows.

$$
\mathbb{E}[H_c/N_L] = p_c(\theta),
\qquad
\mathbb{E}[F_c/N_I] = q_c(\theta).
$$
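The identity for the hits is not completely obvious because $H_2$ and $H_1$ are defined conditionally. A short derivation, using only the tower property of conditional expectation and the mean of the binomial and Poisson distributions (and writing $p_c$ for $p_c(\theta)$), is:

$$
\begin{aligned}
\mathbb{E}[H_3] &= p_3 N_L, \\
\mathbb{E}[H_2] &= \mathbb{E}\bigl[\mathbb{E}[H_2 \mid H_3]\bigr]
  = \frac{p_2}{1-p_3}\,\mathbb{E}[N_L - H_3]
  = \frac{p_2}{1-p_3}\,(1-p_3)\,N_L = p_2 N_L, \\
\mathbb{E}[H_1] &= \mathbb{E}\bigl[\mathbb{E}[H_1 \mid H_3, H_2]\bigr]
  = \frac{p_1}{1-p_3-p_2}\,\mathbb{E}[N_L - H_3 - H_2]
  = \frac{p_1}{1-p_3-p_2}\,(1-p_3-p_2)\,N_L = p_1 N_L, \\
\mathbb{E}[F_c] &= q_c N_I .
\end{aligned}
$$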

Hence, the expectations of FPF and TPF (defined in the previous posts) are

$$
x_c := \mathbb{E}[\text{FPF}_c] = \Psi_Q(\theta_c),
\qquad
y_c := \mathbb{E}[\text{TPF}_c] = \Psi_P(\theta_c),
$$

for $c=1,2,3$, where $\Psi_P(t) := \int_t^{\infty} P(x|\theta')\,dx$ and $\Psi_Q(t) := \int_t^{\infty} Q(x|\theta'')\,dx$ denote the upper-tail integrals (complementary cumulative functions) of $P(x|\theta')$ and $Q(x|\theta'')$. The upper tail appears because $\text{FPF}_c$ and $\text{TPF}_c$ sum the cells $c, c+1, \dots, 3$.
Thus, the expectations of the FPF and TPF lie in the following set.

$$
\{ (x,y) \in \mathbb{R}_{>0} \times \mathbb{R} \,;\, x = \Psi_Q(t),\ y = \Psi_P(t),\ t \in \mathbb{R} \}.
$$

Regarding this set as a curve, we obtain the so-called FROC curve.
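For the Gaussian example from the earlier post, the curve can be traced numerically. The sketch below reads $\Psi_P$ and $\Psi_Q$ as integrals from $t$ to $+\infty$, which is what summing the cells $c, c+1, \dots, 3$ gives, so $x(t) = -\log\Phi(t)$ and $y(t) = 1 - \Phi\bigl((t-\mu)/\sigma\bigr)$. The parameter values and the name `froc_curve` are assumptions made for illustration.

```python
import math

def std_normal_cdf(x):
    """Phi(x), the standard normal CDF, written with erfc."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def froc_curve(mu, sigma, t_values):
    """Return the points (x(t), y(t)) of the FROC curve for each t.

    x(t) = -log Phi(t)               expected false alarms per image
    y(t) = 1 - Phi((t - mu)/sigma)   expected hits per lesion
    """
    curve = []
    for t in t_values:
        x = -math.log(std_normal_cdf(t))
        # erfc gives the upper tail directly, avoiding 1 - Phi(...).
        y = 0.5 * math.erfc((t - mu) / (sigma * math.sqrt(2.0)))
        curve.append((x, y))
    return curve

t_values = [-2.0 + 0.5 * i for i in range(13)]   # t from -2 to 4
curve = froc_curve(mu=1.0, sigma=1.0, t_values=t_values)
for t, (x, y) in zip(t_values, curve):
    print(f"t = {t:4.1f}: x = {x:.4f}, y = {y:.4f}")
```

As $t$ decreases, both coordinates grow: $y$ approaches 1 while $x$ is unbounded, which is why the curve lives in $\mathbb{R}_{>0} \times \mathbb{R}$ rather than in the unit square.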

Because the expectations of the FPFs and TPFs lie on the FROC curve, we also expect the observed pairs of FPF and TPF to lie near the FROC curve. In the figure, the three points (circles) are pairs of FPF and TPF, and the curve is the FROC curve.

In the next post, I will give the priors.
In later posts, I want to explain how to fit the model to data. I am the author of the package BayesianFROC, which is now on CRAN. I made a graphical user interface (GUI), so fitting is very easy. I will explain this GUI and the issues with the priors.

With the following R script, we can launch the GUI.

install.packages("BayesianFROC")
library(BayesianFROC)
fit_GUI_Shiny()

GUI.

By entering data in the red frame in the figure, we can fit the model to the data.

| Confidence | Number of hits (TP) | Number of false alarms (FP) |
|---|---|---|
| 3 = definitely present | $H_3$ | $F_3$ |
| 2 = probably present | $H_2$ | $F_2$ |
| 1 = questionable | $H_1$ | $F_1$ |

where $H_c, F_c \in \mathbb{N}$, $N_I$ is the number of images, and $N_L$ is the number of lesions.

For these data, I defined the following model.

$$
\begin{aligned}
H_3 &\sim \text{Binomial}\bigl(p_3(\theta), N_L\bigr), \\
H_2 &\sim \text{Binomial}\Bigl(\frac{p_2(\theta)}{1-p_3(\theta)}, N_L-H_3\Bigr), \\
H_1 &\sim \text{Binomial}\Bigl(\frac{p_1(\theta)}{1-p_3(\theta)-p_2(\theta)}, N_L-H_3-H_2\Bigr), \\
F_3 &\sim \text{Poisson}\bigl(q_3(\theta)N_I\bigr), \\
F_2 &\sim \text{Poisson}\bigl(q_2(\theta)N_I\bigr), \\
F_1 &\sim \text{Poisson}\bigl(q_1(\theta)N_I\bigr),
\end{aligned}
$$

where

$$
p_c(\theta) := \int_{\theta_c}^{\theta_{c+1}} P(x|\theta')\,dx,
\qquad
q_c(\theta) := \int_{\theta_c}^{\theta_{c+1}} Q(x|\theta'')\,dx,
$$

with the convention $\theta_4 := +\infty$. Here $P(x|\theta')$ denotes a probability density function and $Q(x|\theta'')$ is a positive function.
For example, we use

$$
P(x|\theta') = \text{Gaussian}(x|\mu,\sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Bigl(-\frac{(x-\mu)^2}{2\sigma^2}\Bigr),
$$

$$
Q(x|\theta'') = Q(x) = \frac{d \log \Phi(x)}{dx},
$$

where $\Phi$ denotes the cumulative distribution function of the standard normal distribution and $\sigma$ is the standard deviation. In this case $\theta' = (\mu,\sigma)$ and $\theta''$ is empty. Hence the model parameter is $(\theta_1, \theta_2, \theta_3; \mu, \sigma)$.

I implemented the following non-informative prior.

$$
\theta_1, \mu \sim \text{Normal}(0, 111),
$$

$$
\sigma,\ d\theta_c = \theta_{c+1} - \theta_c \sim \text{Uniform}(0, 111), \quad \text{for } c = 1, 2.
$$

This prior is attractive in that the model can then fit a wide variety of datasets.
However, the prior does not generate suitable model parameters. For example, my model uses Poisson and binomial distributions whose rate and probability parameters are computed from draws of the prior, and these are sometimes zero. I need to find a prior that generates suitable model parameters, but I have not succeeded yet.
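The following Python sketch gives a rough feel for how often this happens. It draws parameters from the prior above and counts the draws for which at least one hit probability $p_c$ is numerically zero in double precision. It assumes the Gaussian example for $P$, reads 111 as a standard deviation (as in Stan's `normal`), and is not the simulation code of my package. The function names are hypothetical.

```python
import math
import random

def std_normal_cdf(x):
    """Phi(x), the standard normal CDF, written with erfc."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def draw_hit_probabilities(rng):
    """Draw (theta_1, mu, sigma, d theta_1, d theta_2) from the prior
    and return the implied hit probabilities [p_1, p_2, p_3]."""
    theta_1 = rng.gauss(0.0, 111.0)
    mu = rng.gauss(0.0, 111.0)
    sigma = max(rng.uniform(0.0, 111.0), 1e-12)  # guard against an exact 0
    d_theta_1 = rng.uniform(0.0, 111.0)
    d_theta_2 = rng.uniform(0.0, 111.0)
    thresholds = [theta_1,
                  theta_1 + d_theta_1,
                  theta_1 + d_theta_1 + d_theta_2]
    # CDF at each threshold, plus 1.0 for the upper limit +infinity.
    cdf = [std_normal_cdf((t - mu) / sigma) for t in thresholds] + [1.0]
    return [cdf[c + 1] - cdf[c] for c in range(3)]

rng = random.Random(2024)
n_draws = 10_000
n_degenerate = sum(min(draw_hit_probabilities(rng)) <= 0.0
                   for _ in range(n_draws))
fraction_degenerate = n_degenerate / n_draws
print(f"fraction of prior draws with some p_c == 0: {fraction_degenerate:.3f}")
```

With scales as large as 111, the thresholds often fall many standard deviations away from $\mu$, where the Gaussian mass between them is below machine precision. A binomial with probability exactly 0 (or a Poisson with rate 0) then generates only zeros, which may be part of why the simulated datasets look unsuitable.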

For example, I want a prior such that the generated model parameter $\theta$ has the following property:

$$
p_3(\theta) > \frac{p_2(\theta)}{1-p_3(\theta)} > \frac{p_1(\theta)}{1-p_3(\theta)-p_2(\theta)},
$$

which means that the reader finds more lesions when the confidence level is higher.

Similarly,

$$
q_1(\theta) > q_2(\theta) > q_3(\theta),
$$

which means that the reader makes more false alarms when the confidence level is lower.
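Both conditions are easy to check for a given set of rates, for example to filter or diagnose prior draws. The helper below is a hypothetical sketch in Python, and the numbers are illustrative only.

```python
def ordering_holds(p, q):
    """Check both monotonicity conditions.

    `p` = (p_1, p_2, p_3) and `q` = (q_1, q_2, q_3), ordered by
    confidence level 1, 2, 3. Requires p_3 + p_2 < 1.
    """
    p1, p2, p3 = p
    q1, q2, q3 = q
    # Conditional hit rates used by the three binomials of the model.
    hit_rates = (p3, p2 / (1 - p3), p1 / (1 - p3 - p2))
    hits_ordered = hit_rates[0] > hit_rates[1] > hit_rates[2]
    false_alarms_ordered = q1 > q2 > q3
    return hits_ordered and false_alarms_ordered

# Conditional hit rates 0.5 > 0.3 > 0.143 and false alarm rates 1.0 > 0.3 > 0.05.
print(ordering_holds(p=(0.05, 0.15, 0.50), q=(1.0, 0.3, 0.05)))
# Conditional hit rates 0.1 < 0.222, so the first condition fails.
print(ordering_holds(p=(0.30, 0.20, 0.10), q=(1.0, 0.3, 0.05)))
```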

My model has bad SBC rank statistics under the above prior. I guess the reason is that the prior does not generate suitable model parameters.

If anyone has any ideas, please let me know.

In this post, I explain when divergent transitions occur in the above model.

The following data have two zero cells, namely $F_2=F_3=0$.
For such data, divergent transitions occurred.

I think such unusual data are also a cause of the divergent transitions.

| Confidence | Number of hits (TP) | Number of false alarms (FP) |
|---|---|---|
| 3 = definitely present | $H_3=97$ | $F_3=0$ |
| 2 = probably present | $H_2=32$ | $F_2=0$ |
| 1 = questionable | $H_1=31$ | $F_1=74$ |

Number of images $N_I=57$.

Number of lesions $N_L=259$.
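To look at what a zero cell does to the likelihood, the Poisson term can be evaluated directly. For $F_c = 0$ the log-probability reduces to $-q_c N_I$, which is finite for every positive rate but keeps increasing as $q_c \to 0$. The Python sketch below shows this with a hand-written log-pmf. Whether this behaviour is what triggers the divergent transitions is only a guess on my part.

```python
import math

def poisson_log_pmf(count, rate):
    """log Poisson(count | rate) for rate > 0."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)

n_images = 57
# Log-likelihood contribution of a zero cell for shrinking rates q_c.
for q_c in (1.0, 0.1, 0.001, 1e-9):
    value = poisson_log_pmf(0, q_c * n_images)
    print(f"q_c = {q_c:g}: log P(F_c = 0) = {value:.6g}")
```

So a zero cell by itself never pulls the rate away from zero. Under the Gaussian example, $q_3 = -\log\Phi(\theta_3)$, so $q_3 \to 0$ corresponds to $\theta_3 \to \infty$, and only the prior and the hit counts constrain the upper thresholds in that direction.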

To fit the model to the above data, please use the following code, which launches a GUI. After the data are entered, rstan runs and the result appears.

library(BayesianFROC);
fit_GUI_Shiny()


So, my package has two issues.

1. Finding a good prior in the sense of SBC.
2. Removing the divergent transitions for data with many zero cells.

Until now, we have assumed a single reader and have not considered imaging methods (MRI, CT, PET, etc.), which are called modalities.
The aim of signal detection theory here is the comparison of modalities.
For example, if multiple readers find more lesions in images taken by MRI than in images taken by CT, then we can say that MRI is better than CT.

In the next post, I want to explain how to compare observer performance and how to model multiple readers and multiple modalities.
To account for the heterogeneity of readers and modalities, we use a hierarchical Bayesian model.
However, this hierarchical model samples very poorly in MCMC, showing, for example, divergent transitions or non-convergent chains. To quantify observer performance, we use the area under the curve (AUC). However, the area under the FROC curve is not bounded, so we introduce a new curve, called the alternative FROC (AFROC) curve, which is bounded, so its AUC is well defined.

Please let me hear any opinions on the performance of this hierarchical model.

Your model is far too complex for me to pretend to understand it, but here is a question: is your likelihood well-defined when some cell counts are zero?