Missing data

Munir · October 6, 2018, 7:47am

Hi,

I have observations y at different stations over time, so y[s, t] with s = 1 to 426 and t = 1 to 65.
But some stations have missing data at particular times, say y[10, 12] = NA.
How do I represent such data in Stan in the clearest way?
Thanks in advance.

Best
Munir

Guido_Biele · October 6, 2018, 12:44pm

The simplest way is to:

1. Create vectors with the row and column indices of the missing data.
2. Replace the missing values in y with some number before including it in the standata list (assuming you are using R).

In Stan:
3. Declare a parameter variable “imputed_data” with as many elements as you have missing data.
4. “Sample” these parameters from an appropriate prior (you won’t be able to generate integer-valued missing data with this simple approach).
5. In the model block, define a new variable y_imputed in which all non-missing values are set to the original values and the missing values are filled in from the imputed_data variable.
6. Evaluate the log posterior using y_imputed.
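
Steps 3 to 6 above can be sketched in Stan roughly as follows. This is only an illustration, not a model for the station data: the normal likelihood, the parameters mu and sigma, their priors, and the names S, T, N_miss, row_miss and col_miss are all assumptions made for the example. It uses the `array[...]` syntax of recent Stan versions. In this sketch the sampling statement on y_imputed also gives the imputed values their distribution, so no separate prior on imputed_data is added.

```stan
data {
  int<lower=1> S;                                // number of stations (426 in the question)
  int<lower=1> T;                                // number of time points (65 in the question)
  matrix[S, T] y;                                // observations, NAs already replaced by a placeholder
  int<lower=0> N_miss;                           // number of missing cells
  array[N_miss] int<lower=1, upper=S> row_miss;  // station index of each missing cell
  array[N_miss] int<lower=1, upper=T> col_miss;  // time index of each missing cell
}
parameters {
  real mu;                      // assumed common mean (illustration only)
  real<lower=0> sigma;          // assumed observation sd (illustration only)
  vector[N_miss] imputed_data;  // one parameter per missing cell
}
model {
  // Start from the data, then overwrite the placeholders with parameters.
  matrix[S, T] y_imputed = y;
  for (n in 1:N_miss) {
    y_imputed[row_miss[n], col_miss[n]] = imputed_data[n];
  }

  // Assumed weakly informative priors.
  mu ~ normal(0, 10);
  sigma ~ normal(0, 5);

  // Likelihood for observed cells; for imputed cells this acts as their prior.
  to_vector(y_imputed) ~ normal(mu, sigma);
}
```

On the R side, `which(is.na(y), arr.ind = TRUE)` returns a matrix with the row and column index of every missing cell, which can supply row_miss and col_miss. The placeholder value chosen in step 2 does not affect the result, because every placeholder cell is overwritten before the likelihood is evaluated.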

This is the most basic approach. There are more sophisticated ways to generate the imputed values, for example using a multivariate normal distribution or a regression on other variables.
