Description
Simple matching to identify duplicates in patient records produces numerous errors for various reasons. To improve the identification of duplicates, we built an incremental model on top of an existing machine learning based Python package. We made the model updatable and scalable to accommodate an ever increasing patient record file.
Objective:
To produce an improved identification of a continuously increasing patient records database.
Problem:
Proper identification of duplicated patient information remains an arduous problem for hospitals, pharmacies and service providers. Simple matching of these records does not result in the correct identification of existing duplicates for various reasons such as noisy and incomplete records.