Adaptive Data Enrichment Pre-Processing System (ADEPS) for Duplicate Detection, Outlier Handling, Imputation, and Encoding

Authors

  • R. Nisha, Research Scholar, Hindusthan College of Arts and Science, Coimbatore, Tamilnadu, India.
  • G. Dalin, Professor, Hindusthan College of Arts and Science, Coimbatore, Tamilnadu, India.

Keywords:

Machine learning, duplicate detection, outlier handling, imputation, and categorical encoding

Abstract

The Adaptive Data Enrichment Pre-Processing System (ADEPS) is a modular framework for improving data quality ahead of analytical and machine learning tasks. ADEPS integrates four preprocessing functions: duplicate detection, outlier handling, imputation, and categorical encoding. Each component addresses a common data quality issue that can degrade model accuracy and reliability. Duplicate detection uses similarity algorithms to identify redundant entries and preserve dataset integrity. Outlier handling uses clustering and normalization techniques to identify and treat anomalies. For missing values, an enhanced imputation method based on MICE (Multiple Imputation by Chained Equations) fills gaps using adaptive modeling with error terms. Categorical encoding techniques, such as Target Encoding, transform high-cardinality categorical data into numeric features that models can consume. Together, these steps yield a cleaner, enriched dataset intended to improve downstream analysis and predictive modeling. The modular design also allows individual components to be adjusted to the data type, resource requirements, and analysis needs, making the framework suitable for a wide range of applications.
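The abstract does not give the algorithms behind each stage, so the following is only a minimal sketch of what the four stages could look like, not the authors' implementation. All function names are hypothetical. Several simplifications are assumed: duplicates are matched on a single text field with `difflib`, outliers are clipped with Tukey (IQR) fences rather than the clustering-based approach the abstract describes, imputation is a single regression pass without the chained iterations or error terms of MICE, and target encoding uses simple smoothing toward the global mean.

```python
from difflib import SequenceMatcher
from statistics import mean, quantiles


def drop_near_duplicates(rows, key, threshold=0.9):
    """Keep the first row of each group whose `key` strings are near-identical."""
    kept = []
    for row in rows:
        text = row[key].strip().lower()
        is_dup = any(
            SequenceMatcher(None, text, k[key].strip().lower()).ratio() >= threshold
            for k in kept
        )
        if not is_dup:
            kept.append(row)
    return kept


def clip_outliers(values, k=1.5):
    """Clip observed values to the Tukey fences; None entries pass through."""
    q1, _, q3 = quantiles([v for v in values if v is not None], n=4)
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v if v is None else min(max(v, lo), hi) for v in values]


def regression_impute(xs, ys):
    """Fill missing ys from a least-squares line fitted on complete (x, y) pairs."""
    pairs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    mx = mean(x for x, _ in pairs)
    my = mean(y for _, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    slope = sum((x - mx) * (y - my) for x, y in pairs) / sxx
    return [my + slope * (x - mx) if y is None else y for x, y in zip(xs, ys)]


def target_encode(categories, targets, smoothing=1.0):
    """Replace each category with its target mean, shrunk toward the global mean."""
    global_mean = mean(targets)
    sums, counts = {}, {}
    for c, t in zip(categories, targets):
        sums[c] = sums.get(c, 0.0) + t
        counts[c] = counts.get(c, 0) + 1
    return [
        (sums[c] + smoothing * global_mean) / (counts[c] + smoothing)
        for c in categories
    ]


# Toy usage: each stage is independent, mirroring the modular design.
rows = [{"name": "Acme Corp"}, {"name": "acme corp "}, {"name": "Globex"}]
unique_rows = drop_near_duplicates(rows, "name")
clipped = clip_outliers([10, 12, 11, 13, 12, 100, None])
imputed = regression_impute([1, 2, 3, 4], [2, 4, None, 8])
encoded = target_encode(["a", "a", "b"], [1, 0, 1])
```

Because each function takes and returns plain lists, a stage can be swapped for a stronger method without touching the others. In practice, target encoding would normally be fitted on training folds only, since encoding with the same rows' targets leaks label information.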

Published

2024-05-24

How to Cite

R. Nisha, & G. Dalin. (2024). Adaptive Data Enrichment Pre-Processing System (Adeps) For Duplicate Detection, Outlier Handling, Imputation, And Encoding. Journal of Computational Analysis and Applications (JoCAAA), 33(2), 902–911. Retrieved from https://eudoxuspress.com/index.php/pub/article/view/1510

Section

Articles
