Modified Apriori Algorithm for the Diagnosis of Tuberculosis
Keywords:
Tuberculosis diagnosis, Classification, Association Rule Mining, Apriori Algorithm
Abstract
Data mining is the discovery and analysis of patterns in large databases to extract information that would otherwise be difficult to identify. It is an interdisciplinary field of computer science and statistics that seeks to extract information from a dataset and convert it into a form suitable for further use. Data mining is widely applied in the healthcare sector to analyse potential outcomes and relationships among the variables in healthcare datasets, enabling professionals to predict patterns in patients' medical conditions and behaviours. In recent decades, data mining has been used extensively in the prediction and diagnosis of diseases. Data mining algorithms have the potential to offer a distinct approach to aid in the diagnosis of several critical illnesses, including tuberculosis (TB). Association Rule Mining (ARM) is a commonly used data mining methodology for uncovering interesting and unexpected rules in large datasets. It generates a substantial number of rules, some of which are interesting while others are redundant, and it evaluates rules using only two metrics: support and confidence. The Apriori algorithm is one of the most widely used and influential ARM algorithms. This research aimed to create a predictive model for diagnosing pulmonary tuberculosis using a modified Apriori algorithm. The preliminary diagnosis was based exclusively on patient demographic data, medical history, and physical examination findings.
Experiments were conducted to assess the performance of individual classifiers, namely Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), the C4.5 Decision Tree Classifier, the K-Nearest Neighbour (KNN) algorithm, Binary Logistic Regression (BLR), k-means, Apriori, and the proposed modified Apriori, on accuracy, precision, sensitivity, specificity, recall, and F-measure. The data for the experiments were obtained from the medical records of TB patients at several hospitals in the Chennai area, Tamil Nadu, India. The results showed that the modified Apriori approach outperformed the other individual classifiers across all assessment metrics.
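
As an illustration of the two metrics that ARM uses to evaluate rules, the following minimal Python sketch computes support and confidence for a single rule. The records, the item names (cough, fever, weight_loss, tb), and the helper functions support and confidence are hypothetical and chosen only for illustration; they are not the study's data or the modified Apriori algorithm itself.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy records: each set lists the findings noted for one patient.
# They are illustrative only and are not drawn from the study's data.
records: List[FrozenSet[str]] = [
    frozenset({"cough", "fever", "weight_loss", "tb"}),
    frozenset({"cough", "fever", "tb"}),
    frozenset({"cough", "weight_loss"}),
    frozenset({"fever"}),
    frozenset({"cough", "fever", "weight_loss", "tb"}),
]


def support(itemset: Iterable[str], data: List[FrozenSet[str]]) -> float:
    """Fraction of records that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(items <= record for record in data) / len(data)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               data: List[FrozenSet[str]]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    lhs = frozenset(antecedent)
    lhs_support = support(lhs, data)
    if lhs_support == 0:
        return 0.0  # the rule never applies, so report zero confidence
    return support(lhs | frozenset(consequent), data) / lhs_support


# Evaluate the hypothetical rule {cough, weight_loss} -> {tb}.
rule_support = support({"cough", "weight_loss", "tb"}, records)
rule_confidence = confidence({"cough", "weight_loss"}, {"tb"}, records)
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

An Apriori-style miner would keep only the rules whose support and confidence both exceed user-chosen thresholds, which is why a large share of the generated rules can turn out to be redundant.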