Page 33 - PATIENT REGISTRY DATA FOR RESEARCH: A Basic Practical Guide

3.2 Match and Combine Data from Different Datasets

Registry data are often stored in separate sections within a registry database, so these sections must be combined into a single data set before analysis. For example, the National Cardiovascular Disease Registry (NCVD) has two main forms: the notification form and the follow-up form. Data from the two forms are stored in two separate sub-databases, which have to be merged during analysis because some of the results require input from both (Wan-Ahmad & Liew, 2016).
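
As a rough illustration of merging two sub-databases, the following minimal Python sketch joins follow-up records onto notification records using a shared patient identifier. The field names (`patient_id`, `admission_date`, `status`, and so on) and the helper `merge_forms` are hypothetical and do not reflect the actual NCVD schema; real registries would typically do this step in their statistical software.

```python
# Hypothetical records: field names are illustrative, not the real NCVD schema.
notifications = [
    {"patient_id": "P001", "admission_date": "2015-01-10", "diagnosis": "STEMI"},
    {"patient_id": "P002", "admission_date": "2015-02-03", "diagnosis": "NSTEMI"},
]
followups = [
    {"patient_id": "P001", "followup_date": "2015-02-10", "status": "alive"},
]


def merge_forms(notifications, followups, key="patient_id"):
    """Attach follow-up fields to each notification record sharing the same key.

    Notification records with no follow-up are kept (a left join), so
    patients without a follow-up form are not silently dropped.
    """
    # Index follow-up rows by identifier; a patient may have several.
    followups_by_id = {}
    for row in followups:
        followups_by_id.setdefault(row[key], []).append(row)

    merged = []
    for note in notifications:
        # One output row per follow-up, or a single row if none exists.
        for fu in followups_by_id.get(note[key], [None]):
            combined = dict(note)
            if fu is not None:
                combined.update(fu)
            merged.append(combined)
    return merged


merged = merge_forms(notifications, followups)
```

Keeping unmatched notification records is a design choice made here for illustration; whether to keep or exclude them depends on the analysis being performed.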

Some study objectives also require linking the registry data to an external data set, such as matching the registry data against the National Death Record to determine the survival rate of the patients (Wong & Goh, 2016). To link the data from two different data sets, both data sets need to have unique identifiers that can be matched by either deterministic or probabilistic matching, which are two different strategies for record linkage (data matching). Deterministic matching links records whose identifiers agree exactly, whereas probabilistic matching links records according to how likely they are to refer to the same person. When investigators deal with sensitive information such as patient identifiers, they should ensure that prior consent has been obtained from the patients and that all the necessary approvals have been granted by the respective authorities.
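
The difference between the two linkage strategies can be sketched in a few lines of Python. This is a simplified illustration only: the records, the field names, the weights in `WEIGHTS` and the 0.85 threshold are all assumptions made for the example, and the score-based function is a toy stand-in for probabilistic linkage rather than a full statistical linkage model.

```python
from difflib import SequenceMatcher

# Hypothetical records; the second death record has a mistyped identifier.
registry = [
    {"national_id": "A1", "name": "Tan Mei Ling", "dob": "1950-03-02", "sex": "F"},
    {"national_id": "A2", "name": "Ahmad bin Ali", "dob": "1948-11-20", "sex": "M"},
]
deaths = [
    {"national_id": "A1", "name": "Tan Mei Lin", "dob": "1950-03-02", "sex": "F"},
    {"national_id": "A9", "name": "Ahmad b. Ali", "dob": "1948-11-20", "sex": "M"},
]


def deterministic_match(registry, deaths, key="national_id"):
    """Link records only when the identifier agrees exactly."""
    deaths_by_id = {d[key]: d for d in deaths}
    return [(r, deaths_by_id[r[key]]) for r in registry if r[key] in deaths_by_id]


# Assumed weights for illustration; real weights are estimated from the data.
WEIGHTS = {"name": 0.5, "dob": 0.4, "sex": 0.1}


def match_score(a, b):
    """Weighted agreement score between 0 and 1 across several fields."""
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    dob_agree = 1.0 if a["dob"] == b["dob"] else 0.0
    sex_agree = 1.0 if a["sex"] == b["sex"] else 0.0
    return (WEIGHTS["name"] * name_sim
            + WEIGHTS["dob"] * dob_agree
            + WEIGHTS["sex"] * sex_agree)


def probabilistic_match(registry, deaths, threshold=0.85):
    """Link each registry record to its best-scoring candidate above a threshold."""
    pairs = []
    for r in registry:
        best = max(deaths, key=lambda d: match_score(r, d), default=None)
        if best is not None and match_score(r, best) >= threshold:
            pairs.append((r, best))
    return pairs


exact_pairs = deterministic_match(registry, deaths)
scored_pairs = probabilistic_match(registry, deaths)
```

In this made-up data the deterministic approach misses the record with the mistyped identifier, while the score-based approach can still link it because the name, date of birth and sex largely agree.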


Before the matching process is performed, the identifiers must be unique within each data set. Although matching can be performed with statistical software, a validation step is still necessary to confirm that the records have been matched correctly. This can be achieved by drawing a random sample of matched records and comparing them against the source data from the original data sets.
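
The two checks described above, identifier uniqueness before matching and random spot checks afterwards, might look like the following minimal Python sketch. The helper names (`duplicate_ids`, `sample_for_review`, `mismatched_fields`), the field names and the sample data are hypothetical; the comparison against source data is only approximated here by comparing fields within each matched pair.

```python
import random
from collections import Counter


def duplicate_ids(records, key="patient_id"):
    """Return identifiers that occur more than once in a data set."""
    counts = Counter(r[key] for r in records)
    return sorted(i for i, n in counts.items() if n > 1)


def sample_for_review(matched_pairs, k=5, seed=0):
    """Draw a reproducible random sample of matched pairs for manual checking."""
    rng = random.Random(seed)
    return rng.sample(matched_pairs, min(k, len(matched_pairs)))


def mismatched_fields(pair, fields=("name", "dob")):
    """List the fields on which the two records of a matched pair disagree."""
    a, b = pair
    return [f for f in fields if a.get(f) != b.get(f)]


# Hypothetical data: P002 appears twice, so it must be resolved before matching.
registry = [
    {"patient_id": "P001", "name": "Tan Mei Ling", "dob": "1950-03-02"},
    {"patient_id": "P002", "name": "Ahmad bin Ali", "dob": "1948-11-20"},
    {"patient_id": "P002", "name": "Ahmad bin Ali", "dob": "1948-11-20"},
]
duplicates = duplicate_ids(registry)

# Hypothetical matched pairs (registry record, external record).
matched_pairs = [
    (registry[0], {"patient_id": "P001", "name": "Tan Mei Ling", "dob": "1950-03-02"}),
    (registry[1], {"patient_id": "P002", "name": "Ahmad bin Ali", "dob": "1984-11-20"}),
]
review = sample_for_review(matched_pairs, k=2)
flagged = [pair for pair in review if mismatched_fields(pair)]
```

Any pair that ends up in `flagged` would then be checked by hand against the original source records.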