A Novel Framework for Recognizing Characters in Historical Gurmukhi Manuscripts

Authors

  • Harpal Singh, Research Scholar, Department of Computer Science, Punjabi University, Patiala, India
  • Simpel Rani, Professor, YCOE, Punjabi University Guru Kashi Campus, Talwandi Sabo, Punjab, India
  • Gurpreet Singh Lehal, Senior Project Consultant, IIIT Hyderabad, Telangana, India

Keywords:

Historical Gurmukhi Manuscripts, Feature Extraction, Classifiers, Feature Fusion, Machine Learning

Abstract

Ancient manuscripts in the Gurmukhi script are invaluable repositories of cultural and historical knowledge, offering insight into the linguistic and artistic heritage of a bygone era. However, the development of efficient Handwritten Text Recognition (HTR) systems for these manuscripts has been hindered by significant challenges. Earlier work attempted to recognize isolated characters in historical Gurmukhi manuscripts, but this approach is insufficient for a comprehensive HTR system because these characters mostly appear alongside vowels. Previous research has not addressed the recognition of characters with vowels: accurate recognition of such compounds requires a large dataset, and such datasets have been scarce. To bridge this gap, we curated 33,223 samples of 156 frequently used character-vowel compounds. To address the complexities of recognition, we extracted various features from the compounds in the collected dataset. Principal Component Analysis (PCA) was applied to the feature set to reduce training time and enhance accuracy. A range of machine learning classifiers, such as Support Vector Machine (SVM), Random Forest, k-NN, and Naive Boosting, was used for recognition. Additionally, a feature fusion method was used to improve accuracy: features from all extraction methods except Open Endpoint features were combined, and PCA was then applied to the fused set. The SVM (RBF) classifier achieved a recognition accuracy of 85.24% on the fused features, marking a significant advancement in the recognition of historical Gurmukhi manuscripts and providing a foundation for future work on the digital preservation and analysis of the literary heritage of India.
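
The fusion-then-PCA-then-classify pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that fusion means simple concatenation of per-method feature vectors, uses synthetic data in place of the manuscript samples, and uses a plain k-NN classifier (one of the classifier families named above) instead of the best-performing SVM (RBF) so that the example depends only on NumPy. The names block_a, block_b, fuse, fit_pca, apply_pca, and knn_predict are hypothetical.

```python
import numpy as np


def fuse(blocks):
    """Feature fusion by concatenation (an assumption): stack blocks side by side."""
    return np.hstack([np.asarray(b, dtype=float) for b in blocks])


def fit_pca(X, n_components):
    """Fit PCA on X via SVD; return the training mean and the top components."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def apply_pca(X, mean, components):
    """Project X onto the principal components learned from the training set."""
    return (X - mean) @ components.T


def knn_predict(X_train, y_train, X_test, k=3):
    """Classify each test row by majority vote among its k nearest training rows."""
    # Squared Euclidean distances, shape (n_test, n_train).
    dists = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = y_train[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])


def run_demo(seed=0):
    rng = np.random.default_rng(seed)
    # Synthetic stand-in data: 3 classes, 40 samples each (not manuscript data).
    labels = np.repeat(np.arange(3), 40)
    # Two hypothetical feature blocks, as if produced by two extraction methods.
    block_a = rng.normal(size=(120, 16)) + 3.0 * labels[:, None]
    block_b = rng.normal(size=(120, 8)) - 3.0 * labels[:, None]

    X = fuse([block_a, block_b])  # fused feature matrix, shape (120, 24)

    # Shuffle, then hold out 30 samples for testing.
    order = rng.permutation(len(labels))
    train, test = order[:90], order[90:]

    # Fit PCA on the training split only, then project both splits.
    mean, components = fit_pca(X[train], n_components=5)
    X_train = apply_pca(X[train], mean, components)
    X_test = apply_pca(X[test], mean, components)

    predictions = knn_predict(X_train, labels[train], X_test, k=3)
    return float((predictions == labels[test]).mean())


accuracy = run_demo()
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

Fitting PCA on the training split alone keeps information from the held-out samples out of the projection. The accuracy printed here reflects only the easy synthetic data and says nothing about the 85.24% reported for the real dataset.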

Published

2024-09-22

How to Cite

Harpal Singh, Simpel Rani, & Gurpreet Singh Lehal. (2024). A Novel Framework for Recognizing Characters in Historical Gurmukhi Manuscripts. Journal of Computational Analysis and Applications (JoCAAA), 33(08), 58–71. Retrieved from https://eudoxuspress.com/index.php/pub/article/view/1236

Section

Articles
