Correcting for non‐ignorable missingness in smoking trends


Data missing not at random (MNAR) are a major challenge in survey sampling. We propose an approach based on registry data to deal with non‐ignorable missingness in health examination surveys. The approach relies on follow‐up data available from administrative registers several years after the survey. For illustration, we use data on smoking prevalence from the Finnish National FINRISK study conducted in 1972–97. The data consist of measured survey information including missingness indicators, register‐based background information and register‐based time‐to‐disease survival data. The parameters of the missingness mechanism are estimable with these data even though the original survey data are MNAR. The underlying data generation process is modelled with a Bayesian model. The results indicate that the estimated smoking prevalence rates in Finland may be significantly affected by missing data.
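
As a rough illustration of why non‐ignorable missingness matters for prevalence estimates, the sketch below computes the smoking prevalence one would expect to see among respondents when the probability of taking part depends on smoking status itself. It is not the Bayesian model of the paper; the function name and all numbers are hypothetical and are not FINRISK estimates.

```python
def complete_case_prevalence(true_prev, resp_smoker, resp_nonsmoker):
    """Expected smoking prevalence among respondents only.

    true_prev      -- true population smoking prevalence (assumed value)
    resp_smoker    -- probability that a smoker responds (assumed value)
    resp_nonsmoker -- probability that a non-smoker responds (assumed value)
    """
    responding_smokers = true_prev * resp_smoker
    responding_nonsmokers = (1.0 - true_prev) * resp_nonsmoker
    return responding_smokers / (responding_smokers + responding_nonsmokers)


# Hypothetical scenario: smokers are less likely to take part than non-smokers.
true_prev = 0.30
naive = complete_case_prevalence(true_prev, resp_smoker=0.60, resp_nonsmoker=0.80)

# With equal response rates the respondents are representative.
unbiased = complete_case_prevalence(true_prev, resp_smoker=0.70, resp_nonsmoker=0.70)

print(f"true prevalence:          {true_prev:.3f}")
print(f"respondents only (MNAR):  {naive:.3f}")
print(f"respondents only (equal): {unbiased:.3f}")
```

When smokers respond less often than non‐smokers, the respondents‐only estimate falls below the true prevalence. The survey data alone cannot reveal the size of this gap, which is why the approach brings in register‐based follow‐up information.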

Stat journal, 4(1)


Juho Kopra
University Lecturer of Statistics

My research interests include Bayesian statistical methods and applied statistics for problems with high societal impact.