An innovative Arabic light stemmer developed using a hybrid approach
International Journal of Electrical and Computer Engineering

Abstract
Our study introduces an innovative light stemming tool tailored for Arabic morphology challenges. In conformance with the templatic and concatenative structures, our stemmer utilizes a combination of clitic stripping, lexicon-based, and statistical disambiguation techniques to ensure accurate stemming. To accomplish this, we rely on our clitic rules lexicon to detect all potential combinations of clitics for each input entry. Subsequently, we depend on an extensive lexicon of over 7 million stems to verify the potential stems. Lastly, we employ a statistical model to ascertain the most likely stem based on the sentence's context. Experimental results demonstrate the effectiveness of the proposed stemmer in comparison with existing ones. Using different datasets, our stemmer achieves higher accuracy and F1 scores, highlighting its efficiency in Arabic stemming tasks.
Discover Our Library
Embark on a journey through our expansive collection of articles and let curiosity lead your path to innovation.
