How to validate a model containing a parameter that does not relate to the data?

Jean_Billie · February 15, 2019, 7:32am

Consider the following generative model, which explains how the data arise.
In the following figure, \theta_1 and \theta_2 denote the model parameters.
In the model, \theta_2 does not affect the data.

To validate the model, one method is to replicate data from a known parameter and show that the mean of the estimates over the replications does not differ from the true parameter.
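The replication check described above can be sketched as follows. This is only a toy illustration, not the model in this thread: it assumes data y_i ~ Normal(mu, 1) with a known mu and uses the sample mean as the estimator. The name `recovery_check` and all default values are hypothetical.

```python
import random
import statistics

def recovery_check(true_mu=2.0, n_obs=50, n_reps=500, seed=1):
    """Replicate data from a known parameter and average the estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):
        # Generate one replicated data set from the known parameter.
        data = [rng.gauss(true_mu, 1.0) for _ in range(n_obs)]
        # Toy estimator (assumption): the sample mean.
        estimates.append(statistics.fmean(data))
    mean_estimate = statistics.fmean(estimates)
    # Monte Carlo standard error of the mean of the estimates.
    mc_se = statistics.stdev(estimates) / n_reps ** 0.5
    return mean_estimate, mc_se

mean_estimate, mc_se = recovery_check()
bias = mean_estimate - 2.0
print(f"mean estimate = {mean_estimate:.4f}, bias = {bias:.4f}, MC s.e. = {mc_se:.4f}")
```

If the bias is small relative to the Monte Carlo standard error, the estimator recovers the known parameter on average. For a parameter that never enters the data-generating step, this check has no true value to compare against, which is the difficulty raised here.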

In the following model, the second parameter \theta_2 does not directly affect the generative process of the data. So we do not need to assume a true value of \theta_2 to generate the replicated data.
Does the lack of a true value for \theta_2 then make it impossible to compare the estimates with the true parameter?

bgoodri · February 15, 2019, 4:48pm

Is it possible to consider \theta_2 and f as unknown parameters and solve for \theta_1 as a transformed parameter?

Jean_Billie · February 16, 2019, 6:56am

Thanks for letting me know the idea. But I think it is difficult to obtain a function \varphi() such that \theta_2 = \varphi(\theta_1) (= \varphi_f(\theta_1)).

I built a hierarchical Bayesian model with rstan, and the reviewers required validation of my model. I showed that my proposed method is compatible with existing methods, but a reviewer does not trust those existing methods. So I have to show compatibility with the truth!

avehtari · February 16, 2019, 8:57am

I did not understand your model, but

see Talts et al. (2018), Validating Bayesian inference algorithms with simulation-based calibration, for discussion of this topic.

Jean_Billie · February 16, 2019, 10:26am

Thank you for letting me know about the paper.
I will try to validate my model following the paper.
I did not know about such validation methods, so it may help me.

Thank you !!

Jean_Billie · February 23, 2019, 10:56am

I read it.

Roughly speaking, the paper constructs a test statistic that is uniformly distributed under the null hypothesis that the MCMC sampling is correct. If the MCMC sampling is not correct, the histogram of the test statistic becomes non-uniform (for example, skewed), and this deviation from uniformity tells us that the MCMC is biased. I want to implement it, but that requires calculating the above quantities.
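A minimal sketch of the rank-statistic idea, under strong simplifying assumptions: a conjugate toy model (theta ~ Normal(0, 1), y_i ~ Normal(theta, 1)) whose posterior is known in closed form, so exact posterior draws stand in for MCMC output. The function names and the `shift` argument, which deliberately biases the "sampler", are hypothetical and not taken from the paper.

```python
import random

def sbc_ranks(n_sims=1000, n_obs=10, n_draws=99, shift=0.0, seed=0):
    """Rank of the prior draw among posterior draws, repeated n_sims times."""
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_sims):
        theta = rng.gauss(0.0, 1.0)                        # draw from the prior
        y = [rng.gauss(theta, 1.0) for _ in range(n_obs)]  # simulate data
        # Exact conjugate posterior: Normal(sum(y)/(n+1), 1/(n+1)).
        post_mean = sum(y) / (n_obs + 1)
        post_sd = (1.0 / (n_obs + 1)) ** 0.5
        # shift != 0 mimics a biased sampler (assumption for illustration).
        draws = [rng.gauss(post_mean + shift, post_sd) for _ in range(n_draws)]
        ranks.append(sum(d < theta for d in draws))        # rank in 0..n_draws
    return ranks

def histogram(ranks, n_draws=99, n_bins=10):
    """Count ranks in n_bins equal-width bins over 0..n_draws."""
    width = (n_draws + 1) // n_bins
    counts = [0] * n_bins
    for r in ranks:
        counts[min(r // width, n_bins - 1)] += 1
    return counts

calibrated = histogram(sbc_ranks())
biased = histogram(sbc_ranks(shift=0.3))
print("correct sampler:", calibrated)
print("biased sampler: ", biased)
```

With the correct posterior the bin counts should be roughly equal. With the shifted posterior the ranks pile up at one end, which is the kind of skewed histogram described above.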

I have some questions:

Is this method available for improper priors or improper posteriors?
How should I choose the function f for the rank statistics?

I am not sure, but:

To focus on only one parameter, I think f = f_i(\theta_1,\theta_2,\cdots,\theta_n) = \theta_i?
To pool all parameters, the Euclidean norm f(\theta_1,\theta_2,\cdots,\theta_n) = \sqrt{ \sum \theta_i^2 }?

avehtari · February 24, 2019, 9:16am

No.

What is your indexing here? You can focus on one parameter or any scalar quantity computed from all parameters.
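To illustrate that the rank statistic can be computed for any scalar function f of the parameters, here is a small sketch. It assumes a hypothetical two-parameter conjugate model (each \theta_i ~ Normal(0, 1), with its own Normal(\theta_i, 1) observations) and exact posterior draws in place of MCMC. The helper names are made up for this example.

```python
import random

def rank_statistic(f, theta_true, posterior_draws):
    """Number of posterior draws whose f-value is below f(theta_true)."""
    target = f(theta_true)
    return sum(f(draw) < target for draw in posterior_draws)

def first_component(theta):
    return theta[0]                    # focus on a single parameter

def sum_of_squares(theta):
    return sum(t * t for t in theta)   # pool all parameters into one scalar

def simulate_ranks(f, n_sims=1000, n_obs=5, n_draws=99, dim=2, seed=0):
    rng = random.Random(seed)
    post_sd = (1.0 / (n_obs + 1)) ** 0.5
    ranks = []
    for _ in range(n_sims):
        theta = [rng.gauss(0.0, 1.0) for _ in range(dim)]      # prior draw
        post_means = []
        for t in theta:
            y = [rng.gauss(t, 1.0) for _ in range(n_obs)]      # data for this component
            post_means.append(sum(y) / (n_obs + 1))            # conjugate posterior mean
        draws = [[rng.gauss(m, post_sd) for m in post_means]   # exact posterior draws
                 for _ in range(n_draws)]
        ranks.append(rank_statistic(f, theta, draws))
    return ranks

for f in (first_component, sum_of_squares):
    ranks = simulate_ranks(f)
    print(f.__name__, "mean rank:", sum(ranks) / len(ranks))
```

When the posterior draws are correct, the ranks should be uniform on 0..n_draws for either choice of f, so the mean rank should be close to n_draws / 2. Different choices of f are sensitive to different kinds of sampling error.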

Jean_Billie · February 24, 2019, 10:36am

Thank you for the reply!!

Currently, my model uses improper priors for the standard deviations and means of Gaussians. So to apply SBC (simulation-based calibration), I must use stronger priors.

In the above, I use (\theta_1,\theta_2,...,\theta_n) for the parameters of the model (not MCMC samples). That is, (\theta_1,\theta_2,...,\theta_n) \in \Theta. I am not sure how the rank statistics are affected by the definition of f.

avehtari · February 24, 2019, 12:48pm

We highly recommend using proper priors in all cases.

I get this

But I don’t understand this

Jean_Billie · March 1, 2019, 11:59am

I implemented simulation-based calibration (SBC) using f:\Theta \to \mathbb{R};(\theta_1,...,\theta_d) \mapsto \sum\theta_i^2 for the rank statistics described in the following paper.

The histogram of the rank statistics is as follows:

It shows that the histogram is far from uniform.
I think the reason is the misspecified priors described in Section 6.1 of the paper. If the data are plausible, then according to rstan::check_hmc_diagnostics() the MCMC procedure is correct. However, if the data drawn from the likelihood (model) with parameters drawn from the priors are not plausible, then the MCMC sampling contains divergent transitions in my model.

Generally speaking, frequentist methods do not use priors (except lasso, Firth, ridge, …), and my model also does not need priors when I fit it to plausible data. But to implement SBC, I have to choose appropriate priors for drawing plausible data.

In my case, SBC gives me a validation of the priors more than a validation of the MCMC procedure.
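One way to see how the prior controls the plausibility of the simulated data is a quick prior predictive comparison. The sketch below is only an illustration under assumed choices: a toy model with mu ~ Normal(0, s), sigma ~ half-Normal(s), and y_i ~ Normal(mu, sigma), comparing a vague scale (s = 100) with a tighter one (s = 1). The function name and the scales are hypothetical.

```python
import random

def prior_predictive_median_max(prior_scale, n_sims=2000, n_obs=20, seed=0):
    """Median over simulations of max |y_i| in data drawn from the prior predictive."""
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_sims):
        mu = rng.gauss(0.0, prior_scale)           # prior draw for the mean
        sigma = abs(rng.gauss(0.0, prior_scale))   # half-normal prior draw for the sd
        y = [rng.gauss(mu, sigma) for _ in range(n_obs)]
        maxima.append(max(abs(v) for v in y))
    maxima.sort()
    return maxima[n_sims // 2]

vague = prior_predictive_median_max(100.0)
tight = prior_predictive_median_max(1.0)
print("typical max |y| under vague prior:", round(vague, 2))
print("typical max |y| under tighter prior:", round(tight, 2))
```

If the data sets generated under the prior are far outside the range the model is meant for, a non-uniform SBC histogram or divergent transitions may say more about the prior than about the sampler, which matches the observation above.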
