Automated Development of a Grammatical Dictionary for Georgian Dialects

Authors

  • Liana L Lortkipanidze, Ivane Javakhishvili Tbilisi State University, Georgia
  • Anna R Chutkerashvili, Archil Eliashvili Institute of Control Systems of the Georgian Technical University, Georgia

DOI:

https://doi.org/10.33422/ejest.v8i1.1553

Keywords:

Acquisition of Lexicon, Agglutinative Languages, Language Modelling, Lemmatization Rules, Morphological Analysis

Abstract

This paper presents an automated system for compiling grammatical dictionaries of the Georgian language and its dialects. Unlike traditional dictionaries, grammatical dictionaries include not only base word forms but also complete paradigms, with detailed morphological and syntactic information. This is particularly important for agglutinative-inflectional languages such as Georgian, where word forms vary significantly depending on context. The system applies a dictionary-based approach that expands lexical resources by identifying words with shared grammatical markers, and it integrates a lemmatization algorithm that processes unknown words by automatically generating their base forms and paradigms. The methodology builds on prior research in dialectal lexicography and syntactic annotation of Georgian corpora, and draws comparative insights from similar technologies applied to other agglutinative languages. In tests on Georgian literary corpora, only 2% of non-dictionary word forms required manual correction after lemmatization. The affix-based algorithm significantly outperformed traditional suffix-only methods, particularly on complex morphological structures. These results confirm the system's effectiveness in expanding lexical resources and indicate that it can be adapted to other Kartvelian languages. The study emphasizes the value of combining linguistic theory with computational approaches to morphological processing and lexicon development, and offers both theoretical contributions and practical applications in language technology.
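
The contrast the abstract draws between affix-based and suffix-only lemmatization can be illustrated with a minimal sketch. It is not the authors' algorithm: the rule table, the function names (`candidate_lemmas`, `lemmatize`, `paradigm`), the Latin transliteration, and the tiny lexicon are all assumptions made for illustration. The suffix rules follow the standard case endings of consonant-stem Georgian nouns such as katsi ("man"), and the single prefix-plus-suffix rule models the me-...-e pattern seen in mesame ("third") from sami ("three").

```python
from typing import NamedTuple


class AffixRule(NamedTuple):
    """Hypothetical rule: strip prefix and suffix, then append lemma_suffix."""
    prefix: str
    suffix: str
    lemma_suffix: str


# Illustrative rules only (transliterated); a real rule set would be far larger
# and would also cover vowel-stem and syncopating nouns, which behave differently.
RULES = [
    AffixRule("me", "e", "i"),  # circumfix pattern: me-sam-e -> sam-i
    AffixRule("", "ma", "i"),   # ergative
    AffixRule("", "is", "i"),   # genitive
    AffixRule("", "it", "i"),   # instrumental
    AffixRule("", "ad", "i"),   # adverbial
    AffixRule("", "s", "i"),    # dative
    AffixRule("", "i", "i"),    # nominative
]

# Case endings used to generate a paradigm from a consonant-stem lemma in -i.
CASE_SUFFIXES = {
    "nominative": "i",
    "ergative": "ma",
    "dative": "s",
    "genitive": "is",
    "instrumental": "it",
    "adverbial": "ad",
}


def candidate_lemmas(form, rules=RULES):
    """Return every lemma the rules can derive from form, longest affixes first."""
    ordered = sorted(rules, key=lambda r: len(r.prefix) + len(r.suffix), reverse=True)
    lemmas = []
    for rule in ordered:
        affix_len = len(rule.prefix) + len(rule.suffix)
        if (len(form) > affix_len
                and form.startswith(rule.prefix)
                and form.endswith(rule.suffix)):
            stem = form[len(rule.prefix):len(form) - len(rule.suffix)]
            lemma = stem + rule.lemma_suffix
            if lemma not in lemmas:
                lemmas.append(lemma)
    return lemmas


def lemmatize(form, lexicon, rules=RULES):
    """Prefer candidates found in the lexicon; otherwise return all guesses."""
    candidates = candidate_lemmas(form, rules)
    known = [lemma for lemma in candidates if lemma in lexicon]
    return known or candidates


def paradigm(lemma):
    """Generate case forms for a consonant-stem noun whose lemma ends in -i."""
    stem = lemma[:-1]
    return {case: stem + suffix for case, suffix in CASE_SUFFIXES.items()}


lexicon = {"katsi", "sami"}
suffix_only = [rule for rule in RULES if not rule.prefix]

print(lemmatize("katsis", lexicon))               # lexicon filters the guesses
print(candidate_lemmas("mesame"))                 # prefix+suffix rule applies
print(candidate_lemmas("mesame", suffix_only))    # suffix-only rules find nothing
print(paradigm("katsi")["dative"])
```

In this toy setting, the lexicon lookup disambiguates between competing suffix analyses of a known word, while an unknown word simply returns every candidate for later review. The suffix-only rule subset cannot analyze mesame at all, which is the kind of gap a rule that addresses both ends of the word form is meant to close.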

Published

2025-07-01

How to Cite

Lortkipanidze, L. L., & Chutkerashvili, A. R. (2025). Automated Development of a Grammatical Dictionary for Georgian Dialects. European Journal of Engineering Science and Technology, 8(1), 13–25. https://doi.org/10.33422/ejest.v8i1.1553

Section

Articles