Page 38 - PATIENT REGISTRY DATA FOR RESEARCH: A Basic Practical Guide

3.5 Proper Handling of Missing Data in a Registry Database

Missing data occur when no value is recorded for a variable in the data set (see
Figure 3.1). They commonly arise during the data collection process, especially in a
patient registry database, and can leave a registry study underpowered because the usable
sample size becomes inadequate (Kim & Curry, 1977). Missing data can also render some
common statistical analyses invalid or infeasible, and can bias the estimates derived
from a statistical model (Rubin, 1987; Becker & Walstad, 1990). It is therefore necessary
to handle missing data with appropriate analytical strategies when analysing the
incomplete data set.
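
As a minimal sketch of how missing values shrink the usable sample, the Python snippet below counts the missing values per variable and the number of fully observed records. The records, the variable names (`age`, `hba1c`, `income`) and the helper functions are hypothetical and serve only as an illustration; they are not taken from any actual registry.

```python
# Hypothetical toy registry (an assumption for illustration only).
# None marks a value that was not recorded.
records = [
    {"age": 54, "hba1c": 7.1, "income": 3200},
    {"age": 61, "hba1c": None, "income": 2800},
    {"age": None, "hba1c": 8.4, "income": None},
    {"age": 47, "hba1c": 6.9, "income": 4100},
    {"age": 70, "hba1c": 7.7, "income": None},
]


def missing_summary(rows):
    """Return the number of missing (None) values for each variable."""
    variables = rows[0].keys()
    return {v: sum(1 for r in rows if r[v] is None) for v in variables}


def complete_cases(rows):
    """Keep only the records in which every variable is observed."""
    return [r for r in rows if all(value is not None for value in r.values())]


summary = missing_summary(records)
n_complete = len(complete_cases(records))

print("Missing values per variable:", summary)
print(f"Complete records: {n_complete} of {len(records)}")
```

In this toy example only a few values are missing in total, yet more than half of the records are incomplete, which illustrates how quickly the sample available for a multivariable analysis can fall.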

Missing data are categorized as either 'missing completely at random' (MCAR) or
'missing not at random' (MNAR) (Rubin, 1976). Missing data are considered MCAR when their
occurrence is not influenced by any other variable. For example, MCAR can happen when
some questionnaires have been lost by accident, when respondents have unintentionally
overlooked some questions, or when a specimen container has been damaged by accident (so
that results are lost through attrition of the samples collected for the investigation).
In these scenarios, the simplest way to handle missing data is to use ad hoc methods such
as complete case analysis and available case analysis (pairwise deletion), which give
unbiased results when the data are MCAR (Greenland & Finkle, 1995).
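
The difference between the two ad hoc methods can be sketched in a few lines of Python. Complete case analysis drops every record that has any missing value, whereas available case analysis uses, for each variable or pair of variables, all the records in which those particular values are observed. The data, the variable names and the function names below are hypothetical, chosen only to illustrate the idea.

```python
from itertools import combinations
from statistics import mean

# Hypothetical toy data (an assumption for illustration only).
# Columns follow the order given in NAMES; None marks a missing value.
NAMES = ("age", "hba1c", "income")
rows = [
    (54, 7.1, 3200),
    (61, None, 2800),
    (None, 8.4, None),
    (47, 6.9, 4100),
    (70, 7.7, None),
]


def complete_case_means(data):
    """Complete case analysis: use only records with no missing value."""
    complete = [r for r in data if None not in r]
    means = {name: mean(r[i] for r in complete) for i, name in enumerate(NAMES)}
    return means, len(complete)


def available_case_means(data):
    """Available case analysis: use every observed value of each variable."""
    result = {}
    for i, name in enumerate(NAMES):
        observed = [r[i] for r in data if r[i] is not None]
        result[name] = (mean(observed), len(observed))
    return result


def pairwise_counts(data):
    """Number of records usable for each pair of variables (pairwise deletion)."""
    counts = {}
    for i, j in combinations(range(len(NAMES)), 2):
        counts[(NAMES[i], NAMES[j])] = sum(
            1 for r in data if r[i] is not None and r[j] is not None
        )
    return counts


cc_means, cc_n = complete_case_means(rows)
print("Complete case means:", cc_means, "based on", cc_n, "records")
print("Available case (mean, n):", available_case_means(rows))
print("Records usable per pair:", pairwise_counts(rows))
```

Available case analysis retains more observations, but each estimate is then based on a different subset of records, so the sample size varies from one statistic to the next.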


Unlike MCAR data, MNAR data are influenced by certain factors. For example, when
patients are asked for their income level, the answer may be more likely to be missing
when the income is extremely high. Here the missingness depends on the unreported income
itself rather than on any observed patient characteristic. MNAR can also occur when
resources are not available for a particular specimen collection. In these