Missing data are common in all types of data sets, but most statistical procedures assume that all data are observed. As a result, most standard statistical software packages automatically drop from the analysis every observation that has any missing data. This approach can lead to very small sample sizes and biased results. This two-part workshop will discuss the mechanisms that cause missing data, the advantages and disadvantages of commonly used approaches to dealing with missing data, some newer and better approaches, and how to implement them with available statistical software.
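To make the sample-size point concrete, here is a minimal sketch of how dropping every observation with any missing value shrinks a data set. It is an illustration under stated assumptions, not workshop material: the number of rows, the number of variables, and the 10% missingness rate are hypothetical, and each value is assumed to be missing independently of the others.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical settings chosen only for illustration.
N_ROWS, N_VARS, P_MISSING = 1000, 5, 0.10

# Build a toy data set; None marks a missing value.
rows = [
    [None if random.random() < P_MISSING else random.gauss(0, 1)
     for _ in range(N_VARS)]
    for _ in range(N_ROWS)
]

# Dropping incomplete observations: keep only rows with no missing values.
complete_rows = [row for row in rows if all(v is not None for v in row)]

share_kept = len(complete_rows) / N_ROWS
# With independent missingness, a row is complete with
# probability (1 - p)^k, here 0.9 ** 5, roughly 0.59.
expected_share = (1 - P_MISSING) ** N_VARS

print(f"rows kept: {len(complete_rows)} of {N_ROWS} ({share_kept:.0%})")
print(f"expected share of complete rows: {expected_share:.0%}")
```

Even with only 10% of values missing per variable, roughly four in ten observations are discarded in this setup. Because the values here are missing completely at random, the sketch shows only the loss of sample size; the bias mentioned above arises when missingness depends on the data themselves.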
The workshop is intended for participants who have the equivalent of two semesters of statistics and some previous experience doing data analysis. It is appropriate for faculty, research staff, and graduate students.
Topics covered:
- Missing data mechanisms and why they are important
- An overview of imputation techniques
- Multiple Imputation
- Maximum Likelihood and the EM Algorithm
- Software for implementing these solutions
Participants in the workshop will:
- Understand the processes that lead to missing data
- Learn approaches for handling missing data and their advantages and disadvantages
- Be able to put into practice some of these approaches
Fee: None for members of the Cornell community, but registration is required. Because space is limited, early registration is encouraged.
For times and locations of upcoming workshops, please see the Workshop Schedule.