Page 41 - PATIENT REGISTRY DATA FOR RESEARCH: A Basic Practical Guide

The easiest way to handle missing data is simply to declare them as missing. A unique code is introduced to represent missing values across all variables, and this code must not coincide with any valid response code. For example, if the researcher uses the code "0" to define missing while the same code "0" also represents the answer "No" in another variable (for example: "1" for yes and "0" for no), then a distinctly different code such as "9999" is a better option to represent the missing value. In some instances, several different codes can represent missing values, each with its own definition. For example, code "9999" can represent a true missing value, whereas code "8888" can indicate that the variable is not relevant or not applicable to a particular patient (such as 'pregnant' status for a male patient). Because these two types of missing values arise for different reasons, each requires a different approach in the analysis.
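
The coding scheme above can be sketched in a few lines of Python. This is a minimal illustration only: the codes "9999" and "8888" come from the example in the text, while the patient records, field names, and the `classify` helper are hypothetical. The treatment of not-applicable values shown here (setting them aside before looking for values to impute) is one possible approach, not a prescribed one.

```python
# Codes taken from the example in the text.
MISSING = 9999         # value is truly missing
NOT_APPLICABLE = 8888  # variable does not apply to this patient

# Made-up records for illustration; 'pregnant' uses 1 = yes, 0 = no.
records = [
    {"patient_id": "P001", "sex": "F", "pregnant": 1},
    {"patient_id": "P002", "sex": "M", "pregnant": NOT_APPLICABLE},
    {"patient_id": "P003", "sex": "F", "pregnant": MISSING},
    {"patient_id": "P004", "sex": "F", "pregnant": 0},
]


def classify(value):
    """Label a coded value as 'missing', 'not_applicable', or 'observed'."""
    if value == MISSING:
        return "missing"
    if value == NOT_APPLICABLE:
        return "not_applicable"
    return "observed"


# A valid "No" answer (0) is classified as observed, not as missing.
status = {r["patient_id"]: classify(r["pregnant"]) for r in records}

# Assumed handling: set not-applicable records aside, then flag only the
# truly missing values as candidates for imputation.
applicable = [r for r in records if r["pregnant"] != NOT_APPLICABLE]
to_impute = [r["patient_id"] for r in applicable if r["pregnant"] == MISSING]
```

Keeping the two codes distinct means a "No" answer coded "0" is never mistaken for a missing value, and the male patient's 'pregnant' status is never passed to an imputation step.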

Irrespective of how missing data will be dealt with, they can be readily detected during the data cleaning process. A simple descriptive analysis, such as the frequency and percentage (%) of each variable, will show the total number of missing values in a registry database. The researcher then needs to identify which individual patients the missing values belong to, using individual identifiers such as the patient identification number. Finally, the researchers need to reach a consensus on the most appropriate way to handle the missing values. Once this has been agreed, the missing values can be replaced using an appropriate imputation technique. In addition, the study report or manuscript should fully describe how the missing data were replaced, so that the choice of imputation technique is justified by the researchers and is clear and transparent to the reader.
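
The three steps described above (detect, trace to the patient, replace and document) might look as follows in Python. This is a sketch under stated assumptions: the records and variable names are made up, "9999" is the missing-value code from the text, and median imputation is used purely as a placeholder for whatever technique the researchers agree on.

```python
from statistics import median

MISSING = 9999  # missing-value code, as in the example in the text

# Made-up registry extract for illustration only.
records = [
    {"patient_id": "P001", "age": 54, "hba1c": 7.2},
    {"patient_id": "P002", "age": MISSING, "hba1c": 8.1},
    {"patient_id": "P003", "age": 61, "hba1c": MISSING},
    {"patient_id": "P004", "age": 47, "hba1c": 6.9},
]
variables = ["age", "hba1c"]

# Step 1: percent frequency of missing values for each variable.
percent_missing = {
    v: 100 * sum(r[v] == MISSING for r in records) / len(records)
    for v in variables
}

# Step 2: trace each missing value back to its patient identifier.
missing_ids = {
    v: [r["patient_id"] for r in records if r[v] == MISSING]
    for v in variables
}

# Step 3: replace missing values and keep a log for the study report.
# Median imputation is an assumption here, not a recommendation.
imputation_log = []
for v in variables:
    fill = median(r[v] for r in records if r[v] != MISSING)
    for r in records:
        if r[v] == MISSING:
            r[v] = fill
            imputation_log.append((r["patient_id"], v, fill))
```

The `imputation_log` list records which patient, which variable, and which value was filled in, which is the kind of detail the study report or manuscript needs in order to describe the replacement transparently.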