Evaluating the Amarkosha to Generate Computational Model for Sanskrit Vocabulary and Sanskrit Word Bank

Authors

  • Kaustav Sanyal, Department of Computer Science & Engineering, Sarala Birla University, Ranchi, India
  • Pankaj K. Goswami, Department of Computer Science & Engineering, Sarala Birla University, Ranchi, India
  • Neelima Pathak, Department of Humanities & Linguistics, Sarala Birla University, Ranchi, India

Keywords:

Amarkosha, Natural Language Processing, Word Bank, Clustering, K-Means Clustering, Louvain Community Detection, Sanskrit.

Abstract

The Amarkosha is regarded as one of the most complete word banks ever compiled for the Sanskrit language. It lists almost 10,000 words along with their morphological construction, their paryayavachi words (synonyms), and their gender, or linganushasanam. The text is divided into three sections that together list 27 clusters of words; the last cluster of the last section is devoted entirely to defining the genders of the words. The text is composed in such a way that a computational model of the Sanskrit vocabulary can readily be generated from it. Because natural language processing (NLP) for any language needs a good word bank that records the characteristics and behavior of its words, this paper clusters the Sanskrit vocabulary and constructs a computational model for the Sanskrit word bank. The words are clustered by two standard methods, k-means clustering and Louvain community detection. In a comparative study of the two methods, we observed that the Louvain method is more effective at clustering the Sanskrit vocabulary, as its output aligns more closely with the original structure and clusters of the Amarkosha itself. The Louvain method yields 24 distinct communities of words, whereas k-means clustering yields 36 clusters. This corresponds to 88% accuracy for the Louvain community detection method and 67% accuracy for k-means clustering.
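
The abstract does not state how the two accuracy figures are derived. One reading that comes close to them, offered here only as an assumption and not as the authors' stated method, is to score each method by how far its cluster count departs from the 27 clusters of the Amarkosha. The function name `cluster_count_agreement` is hypothetical and introduced purely for illustration.

```python
# Assumption: accuracy is read as 1 minus the relative error in cluster count.
# The abstract does not give a formula; this is one interpretation only.

REFERENCE_CLUSTERS = 27  # clusters the abstract attributes to the Amarkosha


def cluster_count_agreement(found: int, reference: int = REFERENCE_CLUSTERS) -> float:
    """Hypothetical score: 1 - |found - reference| / reference."""
    return 1.0 - abs(found - reference) / reference


louvain_score = cluster_count_agreement(24)  # Louvain: 24 communities
kmeans_score = cluster_count_agreement(36)   # k-means: 36 clusters

print(f"Louvain: {louvain_score:.3f}")  # prints "Louvain: 0.889"
print(f"k-means: {kmeans_score:.3f}")   # prints "k-means: 0.667"
```

Under this reading the scores are roughly 0.889 and 0.667, which are close to the reported 88% and 67%. A score based only on cluster counts says nothing about whether individual words land in the right cluster, so the paper itself may well use a different measure.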

Published

2024-09-24

How to Cite

Kaustav Sanyal, Pankaj K. Goswami, & Neelima Pathak. (2024). Evaluating the Amarkosha to Generate Computational Model for Sanskrit Vocabulary and Sanskrit Word Bank. Journal of Computational Analysis and Applications (JoCAAA), 33(07), 1138–1144. Retrieved from https://eudoxuspress.com/index.php/pub/article/view/1183

Section

Articles
