Missing Data and Skip Patterns in Statistics

Table of Contents

Introduction: Importance of Handling Missing Data and Skip Patterns
How Missing Data Affects Data Analysis
Appropriate Handling of Missing Data, Explanation, and Example
Conclusion
References

Introduction: Importance of Handling Missing Data and Skip Patterns

In statistics, missing data occur when no value is stored for a variable in an observation, so that no information on that variable is available for the subject; the problem is common in quantitative research. Skip patterns are survey structures in which one or several questions depend on an earlier, conditioning response, so that respondents skip the questions that do not apply to them. To manage both kinds of missing values properly, the researcher should:

Complete the research protocol;
Design a data analysis plan;
Document the missing data analyses that were performed;
Log arguments and decisions for handling missing data (Heymans, 2015).
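
The difference between a question skipped by design and an answer that is truly missing can be made explicit when the data are coded. The sketch below is only an illustration: the two survey questions, the field names, and the labels are hypothetical and do not come from the cited sources.

```python
# Hypothetical survey: Q1 "Do you smoke?" and Q2 "How many cigarettes per day?"
# Q2 is asked only when Q1 is "yes" (the skip pattern).
SKIPPED = "not applicable"  # legitimately skipped by design
MISSING = "missing"         # expected but not provided

responses = [
    {"id": 1, "smokes": "yes", "cigs_per_day": 10},
    {"id": 2, "smokes": "no", "cigs_per_day": None},
    {"id": 3, "smokes": "yes", "cigs_per_day": None},
    {"id": 4, "smokes": None, "cigs_per_day": None},
]

def classify_q2(row):
    """Label the follow-up answer as observed, skipped by design, or missing."""
    if row["cigs_per_day"] is not None:
        return "observed"
    if row["smokes"] == "no":
        return SKIPPED
    # Q1 was "yes" or unanswered, so the gap in Q2 cannot be explained by the skip.
    return MISSING

labels = {row["id"]: classify_q2(row) for row in responses}
print(labels)
```

Coding the two cases separately keeps legitimately skipped questions from being counted as missing data in later analyses.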

Handling missing data and skip patterns matters because cases with missing values are typically excluded from the analysis (IDRE, 2012). Deleting cases or results shrinks the sample size and can increase the error in the research findings.
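
The loss of cases can be illustrated with a small complete-case (listwise deletion) filter. This is a minimal sketch with made-up records; the variable names and values are assumptions chosen only for illustration.

```python
# Made-up records; None marks a missing value.
records = [
    {"age": 34, "income": 52000, "score": 7.1},
    {"age": 51, "income": None, "score": 6.4},
    {"age": None, "income": 48000, "score": 8.0},
    {"age": 29, "income": 61000, "score": 5.9},
    {"age": 45, "income": 57000, "score": None},
]

def complete_cases(rows):
    """Keep only the rows in which every variable has a value."""
    return [r for r in rows if all(v is not None for v in r.values())]

kept = complete_cases(records)
n_total, n_kept = len(records), len(kept)
print(f"{n_kept} of {n_total} cases remain after listwise deletion")
```

Each of the three dropped records lacks only one value, yet all of its other values are lost to the analysis as well.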

How Missing Data Affects Data Analysis

During data analysis, missing data can lead to biased parameter estimates, loss of information, decreased statistical power, weaker generalizability of findings, and increased standard errors (Dong & Peng, 2013). Missing data therefore reduce the power of a research trial. Because some amount of missing data is expected in the course of a study, researchers tend to increase the target sample size to allow for it (Kang, 2013). Even so, a larger sample cannot eliminate the bias that missing data introduce, so researchers should pay close attention to missing data both when designing a study and when analyzing its data.
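
Two of these points can be shown with simple arithmetic: the standard error of a mean grows as the number of usable cases falls, and the enrolment target can be raised to offset an expected loss. The sketch below assumes a fixed standard deviation and an expected missing rate, both made-up values, and the function names are hypothetical. It addresses precision only, not bias.

```python
import math

def standard_error(sd, n):
    """Standard error of a sample mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

def inflated_target(n_required, expected_missing_pct):
    """Smallest enrolment whose expected number of complete cases reaches n_required."""
    retained_pct = 100 - expected_missing_pct
    return -(-n_required * 100 // retained_pct)  # integer ceiling division

# Assumed values: sd = 10, 100 planned cases, 36 of them lost to missing data.
se_full = standard_error(10.0, 100)
se_reduced = standard_error(10.0, 64)
print(se_full, se_reduced)

# Assumed 20% expected missingness with 100 complete cases required.
print(inflated_target(100, 20))
```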

Appropriate Handling of Missing Data, Explanation, and Example

Appropriate management and handling of missing data begins with effective planning and continues with accurate data collection (Kang, 2013). A useful technique is to rely on data analysis methods “which are robust to the problems caused by missing data” (Kang, 2013, p. 405). When such methods are not accessible to researchers, simpler alternatives such as mean substitution can be used. In this approach, the average value of a study variable replaces each missing value of the same variable, which allows the researcher to use all the data that have been collected (Kang, 2013). The rationale is that the mean is a reasonable estimate for an observation drawn from a normal distribution, although replacing every gap with the same value understates the variability of the variable, so the method should be applied with caution. For example, a missing outcome for a young patient can be replaced with the mean outcome observed among the other young patients (de Goeij et al., 2013).
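
Mean substitution can be sketched in a few lines. The values below are made up for illustration; the example shows that the mean of the variable is preserved while its spread shrinks after substitution.

```python
import statistics

# Made-up outcome values; None marks a missing observation.
values = [4.0, None, 6.0, 8.0, None, 10.0]

observed = [v for v in values if v is not None]
mean_observed = statistics.mean(observed)

# Replace every missing value with the mean of the observed values.
imputed = [mean_observed if v is None else v for v in values]

print("mean before:", mean_observed, "after:", statistics.mean(imputed))
print("sd before:", statistics.stdev(observed), "after:", statistics.stdev(imputed))
```

All six cases can now enter the analysis, but the smaller standard deviation after substitution shows why the method can make the data look more precise than they are.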

Conclusion

Missing data present major challenges for research results. When relevant data are missing, researchers have less power to find significant results and a smaller chance of correctly detecting a true effect. While missing data can be managed in many ways (Allison, 2012), no single approach eliminates all of the issues, so techniques such as mean substitution offer researchers one practical option. Avoiding missing data begins with careful research planning and continues with appropriate analysis; researchers should therefore pay particular attention to the planning stage in order to prevent missing data from occurring.

References

Allison, P. (2012). Missing data. In R. Millsap & A. Maydeu-Olivares (Eds.), The SAGE handbook of quantitative methods in psychology (pp. 72-89). Washington, DC: Sage Publications.

de Goeij, M., van Diepen, M., Jager, K., Tripepi, G., Zoccali, C., & Dekker, F. (2013). Multiple imputation: Dealing with missing data. Nephrology, Dialysis, Transplantation, 28(10), 2415-2420.

Dong, Y., & Peng, C.-Y. (2013). Principled missing data methods for researchers. SpringerPlus, 2, 222-230.

Heymans, M. (2015). Handling missing data. Web.

IDRE. (2012). Multiple imputation in Stata. Web.

Kang, H. (2013). The prevention and handling of the missing data. Korean Journal of Anesthesiology, 64(5), 402-406.