TELKOMNIKA Telecommunication, Computing, Electronics and Control
Vol. 24, No. 3, June 2026, pp. 840–851
ISSN: 1693-6930, DOI: 10.12928/TELKOMNIKA.v24i3.27447

Computational methodologies for sanad-based hadith analysis: a review

Abdelilah Mhamedi, Mohammed Mghari, Abdelaaziz El Hibaoui
Department of Computer Science, Faculty of Science, Abdelmalek Essaâdi University, Tétouan, Morocco

Article Info
Article history: Received Aug 2, 2025; Revised Dec 11, 2025; Accepted Jan 30, 2026
Keywords: Authenticity; Hadith; Machine learning; Natural language processing; Network analysis; Ontology; Sanad

ABSTRACT
Hadith literature, a cornerstone of Islamic tradition, critically depends on the sanad (chain of narrators) for authentication, a process traditionally requiring profound scholarly expertise. This paper presents a systematic review of computational methodologies designed to enhance and automate sanad analysis, bridging Islamic studies with advanced artificial intelligence (AI). We categorize progress across four key domains: automated authenticity classification, sophisticated narrator network analysis, textual information extraction (e.g., named entity recognition), and the development of specialized datasets and ontologies. Our findings reveal a significant paradigm shift from rule-based systems to advanced machine learning (ML) and deep learning (DL) techniques. This review synthesizes contributions from over 50 studies, highlighting critical challenges including data scarcity, narrator disambiguation, and cross-linguistic resource limitations. We emphasize the novelty of this cross-domain synthesis and discuss how these intelligent systems can be integrated into digital Islamic archives, low-resource mobile hadith applications, and embedded natural language processing (NLP) engines.
This work charts a course for future research to develop more robust, scalable, and ethically grounded computational tools, complementing traditional hadith scholarship with advanced engineering solutions.

This is an open access article under the CC BY-SA license.

Corresponding Author:
Mohammed Mghari
Department of Computer Science, Faculty of Science, Abdelmalek Essaâdi University
P.O. Box. 2121, M'Hannech II, Tétouan, 93030, Morocco
Email: mohammed.mghari@uae.ac.ma

Journal homepage: http://journal.uad.ac.id/index.php/TELKOMNIKA

1. INTRODUCTION
Hadith literature, which comprises narrations of the sayings, actions, and approvals of the Prophet Muhammad peace be upon him (PBUH), is a foundational source of Islamic teachings and law, second only to the Qur'an [1], [2]. Every hadith is composed of two primary parts: the Matn, which is the text of the narration itself, and the sanad (also known as Isnad), which is the chain of narrators responsible for transmitting the Matn [3]. The authenticity of a hadith is of utmost importance, as it directly influences Islamic jurisprudence, theology, and daily practice. For centuries, hadith scholars (muhaddithin) have employed a rigorous science of hadith criticism (mustalah al-hadith) to verify the reliability of these narrations. A central pillar of this science is the meticulous analysis of the sanad [4], [5]. This process involves a deep evaluation of the biographical details of each narrator in the chain ('ilm al-rijal) to assess their integrity, memory, and reliability, as well as to ensure the chain is continuous and free from hidden defects [6], [7].
The traditional method of sanad analysis is an exceptionally complex and labor-intensive endeavor,
demanding years of specialized study and access to vast biographical and historical resources [8], [9]. The immense volume of hadith literature, which includes hundreds of thousands of narrations across numerous collections, presents a formidable challenge for comprehensive manual analysis. This difficulty is compounded by the intricacies of narrator names, which often appear in multiple variations, and the complex, often branching chains of transmission [10]. In the digital era, while many hadith collections are now accessible online, the computational tools to analyze them are still developing and remain somewhat fragmented. There is a pressing need to harness modern technology to assist scholars, automate repetitive analytical tasks, and uncover new insights from the vast data embedded within the sanads [11], [12].
The convergence of computer science and Islamic studies has catalyzed a vibrant new field dedicated to applying computational techniques to hadith analysis [13]-[15]. This paper offers a systematic review of contemporary sanad-based hadith studies, providing a structured overview of the current state of the art. Recent research demonstrates a clear evolution from early rule-based systems to the adoption of machine learning (ML) and, more recently, advanced deep learning (DL) approaches [8], [16], [17]. By organizing recent contributions into four primary categories (automated classification, narrator network analysis, textual component extraction, and dataset/ontology construction), this review consolidates current knowledge. We highlight the progression of methodologies, from foundational expert systems to transformer models like bidirectional encoder representations from transformers (BERT) [18]-[20], and synthesize their reported effectiveness and limitations.
This review aims to answer the question: "What are the current computational methodologies applied to sanad-based hadith analysis, what are their reported performances and limitations, and what are the promising future directions?" Our primary focus is on synthesizing the existing tools and theoretical developments that leverage computational power to dissect sanad structures, assess narrator credibility, and enhance scholarly research. We emphasize the novelty of this cross-domain synthesis, showcasing how intelligent systems can revolutionize hadith studies. Charting a course for future interdisciplinary research will accelerate the development of robust and scalable computational solutions that complement, rather than replace, the invaluable work of traditional hadith scholarship.

2. METHOD
This systematic literature review was conducted to identify, evaluate, and synthesize recent research on computational, sanad-based hadith analysis. The process followed a structured protocol to ensure a comprehensive and unbiased overview of the current state of the field. The search was performed on major academic databases, including IEEE Xplore, ACM Digital Library, Scopus, Google Scholar, and the preprint repository arXiv. The search was conducted using a combination of keywords designed to capture the relevant literature across computer science and Islamic studies. The primary keywords included: "hadith sanad analysis", "hadith classification", "automated hadith authentication", "hadith narrator network", "social network analysis hadith" [13], [17], "sanad graph" [15], [21], "NLP hadith", "sanad extraction", "narrator name disambiguation", "hadith ontology" [16], [22], and "hadith dataset" [12], [23]-[25]. To ensure the inclusion of the latest advancements, terms such as "ML", "DL", and "BERT" [18] were combined with the primary keywords.
Studies were included if they met the following criteria: i) the primary focus was on the sanad (chain of narrators) of hadith; ii) the study applied computational methods, including but not limited to ML, data mining, natural language processing (NLP), or network analysis; iii) the paper was published in English in a peer-reviewed journal, conference proceeding, or as a publicly available technical report or preprint; and iv) the article was published between 2012 and early 2025 to capture a decade of advancements while prioritizing recent work. Papers focusing exclusively on the Matn (text) without sanad analysis, purely theological or historical studies without a computational component, and articles not available in full text were excluded. The initial search yielded numerous articles, which were then screened by title and abstract, followed by a full-text review to determine final eligibility. This process (Figure 1) ensured that the included studies were directly relevant to the scope of this review.
[Figure 1 content: Identification of new studies via databases and registers]
Identification
- Records identified from: IEEE Xplore (n = 51), Google Scholar (n = 28700), PubMed (n = 12), Scopus (n = 390), Others (n = 70)
- Records removed before screening: duplicate records (n = 72), records marked as ineligible by automation tools (n = 29042), records removed for other reasons (n = 0)
Screening
- Records screened (n = 109); records excluded (n = 3)
- Reports sought for retrieval (n = 106); reports not retrieved (n = 30)
- Reports assessed for eligibility (n = 76); reports excluded: Reason 1 (n = 15), Reason 2 (n = 4)
Included
- New studies included in review (n = 57)
- Reports of new included studies (n = 0)

Figure 1. PRISMA flow diagram for the systematic review on computational hadith studies

The PRISMA 2020 [26] flow diagram in Figure 1 maps the process of identifying, screening, and including studies for the systematic literature review on computational tools and techniques in hadith studies. It details the number of records identified from various databases, those removed before screening, and the subsequent stages of record screening, full-text review, and ultimate inclusion in the qualitative synthesis. The scope of the review was defined by a specific Boolean search query architecture (Figure 2).
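The record counts reported in Figure 1 can be cross-checked with simple arithmetic. The following sketch uses only the counts printed in the figure; the variable names are ours and serve purely as illustration.

```python
# Counts transcribed from Figure 1; variable names are illustrative.
identified = {
    "IEEE Xplore": 51,
    "Google Scholar": 28700,
    "PubMed": 12,
    "Scopus": 390,
    "Others": 70,
}
removed_before_screening = {"duplicates": 72, "automation": 29042, "other": 0}

total_identified = sum(identified.values())
screened = total_identified - sum(removed_before_screening.values())
sought = screened - 3            # records excluded at screening
assessed = sought - 30           # reports not retrieved
included = assessed - (15 + 4)   # reports excluded for Reason 1 and Reason 2

print(total_identified, screened, sought, assessed, included)
```

Each intermediate value matches the corresponding box in the diagram, so the reported flow is internally consistent.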
[Figure 2 content: tree diagram of the search query, in which keyword groups are combined with AND and OR operators. Terms within each group are joined by OR.]
- Islamic studies keywords (scope keywords):
  - Sanad keywords: "Hadith", "Sanad", "ISNAD"
  - Narrator keywords: "Hadith narrator", "narrator"
- Task keywords: "analysis", "classification", "authentication", "extraction", "disambiguation"
- Computer science keywords:
  - AI technologies keywords: "LMM", "AI", "ML", "NLP"
  - Networks keywords: "network analysis", "graph"
  - Semantics keywords: "ontology", "semantic"
  - Data keywords: "dataset", "corpus"

Figure 2. Search query design
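The grouping principle behind Figure 2 (OR within a keyword group, AND between groups) can be sketched in a few lines. The sketch below is illustrative only: the `or_group` and `build_query` helpers and the generic query syntax are assumptions, only three of the keyword groups are used as an example, and each database would need its own adaptation.

```python
def or_group(terms):
    """Join quoted terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(groups):
    """AND together several OR-groups of keywords (hypothetical helper)."""
    return " AND ".join(or_group(terms) for terms in groups)


# Keyword groups taken from Figure 2.
scope = ["Hadith", "Sanad", "ISNAD", "Hadith narrator", "narrator"]
task = ["analysis", "classification", "authentication",
        "extraction", "disambiguation"]
technology = ["LMM", "AI", "ML", "NLP"]

query = build_query([scope, task, technology])
print(query)
```

Keeping each group as a separate list makes it straightforward to narrow or broaden the query for a given search engine by adding or dropping a group.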
This architecture used primary keywords connected by the AND operator to ensure that the identified studies operated at the intersection of two core domains: Islamic studies and computer science. The architecture, which groups related terms with OR and enforces the connection between groups with AND, is detailed in Table 1.

Table 1. Keyword categories and Boolean connections
| Category (keywords) | Example terms and purpose | Boolean connection |
|---|---|---|
| Sanad/narrator keywords (Islamic studies focus) | Terms like "hadith", "sanad", "hadith narrator", or "narrator" were used to ensure focus on the chain of narrators | Connected by OR within the group |
| Computational/AI technologies keywords (computer science focus) | Terms such as "LMM", "AI", "ML", and "NLP" captured the required computational methodology | Connected by OR within the group |
| Task keywords | Terms like "analysis", "classification", "authentication", "extraction", or "disambiguation" targeted specific computational goals | Connected by OR within the group |
| Structure keywords | Terms like "graph", "networks", "ontology", "semantic", "dataset", and "corpus" captured research related to data structure and formal knowledge representation | Connected by OR within the group |

Further modifications and adaptations were implemented to align with the specific requirements of each search engine, thereby enabling the refinement of search results, either narrowing or broadening their scope as necessary. In this context, field-specific constraints were applied to ensure that key terms (e.g., sanad) were restricted to semantically relevant metadata fields such as title or abstract, while explicitly excluded from non-relevant fields such as author, thus enhancing precision and reducing noise in retrieval.

3. RESULTS AND DISCUSSION
Our review organizes the findings into four principal domains of sanad-based research. Each domain shows a clear methodological evolution from rule-based systems to sophisticated ML and DL approaches (Figure 3) [27]-[57]. This section provides a detailed discussion of the findings within each category, highlighting key techniques, their performance, and prevailing challenges, supplemented by summary tables for clarity.

Figure 3. Timeline of sanad-based research, illustrating the methodological evolution in each domain from early rule-based systems to advanced ML and DL approaches

3.1. Automated hadith classification for authenticity assessment
Automating the classification of hadith into traditional categories such as Sahih (authentic), Hasan (good), Da'if (weak), or Mawdu' (fabricated) is a primary objective of computational hadith studies [48], [6].
Early attempts in this area relied on fuzzy expert systems and rule-based models that sought to codify the principles of hadith criticism. For example, Ghazizadeh et al. [1] proposed a fuzzy expert system to determine hadith validity by modeling parameters like narrator character and sanad continuity, achieving 94% accuracy on a subset of the Shiite collection Al-Kafi. While innovative, these systems were often brittle, collection-specific, and difficult to scale.
The field has since shifted decisively towards ML. Studies have successfully employed supervised learning algorithms where features are derived from the sanad. Aldhlan et al. [8], [9], [11] presented a series of papers using decision trees (DT) and Naïve Bayes (NB), incorporating a missing data detector (MDD) to handle incomplete narrator information, which significantly boosted their classification accuracy from 50% to 97% on a dataset of 999 hadiths. Other researchers leveraged heuristic-based systems that assigned weights to narrators based on their rank in classical biographical dictionaries like Ibn Hajar's Taqrib al-Tahzib, reporting accuracies of over 94% on large samples from Sahih Bukhari and Sunan al-Tirmizi [13]. Later work explored vector space models (VSM) and learning vector quantization (LVQ) to consider the order of narrators, achieving 80% precision in distinguishing between Sahih and fabricated hadiths, though performance on Hasan and weak categories was lower [16].
More recently, DL models, particularly those based on the transformer architecture like BERT, are being explored for their ability to process raw text and learn complex representations without manual feature engineering [19], [29]. Abdelaal et al. [23] used n-gram techniques (trigrams) and TF-IDF weighting with classifiers like linear SVC, achieving up to 93.69% accuracy.
Sentiment analysis has also been applied, where narrator names in the sanad are treated as tokens to predict authenticity, reaching 80% accuracy with a linear SVC model [21]. A 2022 study exploring DL for binary hadith classification (authentic vs. rejected) found that an AraBERT model achieved an accuracy of 91.56% [19]. These results underscore the power of contextual embeddings for capturing the nuanced information within narrator chains. The progression of these methodologies is summarized in Table 2.

Table 2. Studies concentrated on the categorization of hadiths
| Ref. | Approach | Preprocessing | Classes | Language | Data source | Metric | Result |
|---|---|---|---|---|---|---|---|
| [1] | Fuzzy system | - | Valid and not valid | - | Al-Kafi | Accuracy | 94% |
| [8]-[11] | DT, NB | - | Valid and not valid | Arabic | 999 hadiths | Accuracy | 97% |
| [12] | SaaS, SOA | - | 24 classes | - | - | - | - |
| [13] | HR | Name normalization | Sahih, Hasan, weak | Arabic | Bukhari, Tirmidhi | Accuracy | 99% |
| [14], [15] | ANLP, ANN, SVM, DT, BC | - | Sound, weak | Arabic | - | - | - |
| [16] | SVM, LVQ | Remove Matn, standardize names | Sahih, Hasan, weak, fabricated | Arabic | 160 hadiths | Precision | 80% |
| [17], [23] | DT, NB, linear SVC, SGD, LR | Normalization and tokenization | Sahih, Hasan, weak, fabricated | Arabic | Bukhari and Muslim | Accuracy | up to 93.75% |
| [45] | Doc2Vec | Stop words and lemmatization | Hadith similarities | Arabic | 9 books | Accuracy | 80% |
| [21] | Sentiment analysis | Tokenization, vectorization | Sahih/Hasan, weak | English | Bukhari, Muslim, Tirmidhi | Accuracy | 86% |
| [57] | LR, SVM, RF, AraBERT | REGEX-based preprocessing | Genuine, fake | Arabic | Al-Bukhari, and fake hadiths | F1-score | 99.94% |
| [56] | ArabicBERT, NB, DL, CNN-LSTM | - | Sahih, Hasan, and Da'if and accepted/rejected | English | Full-text, sanad-only | F1-score | 94.90% |
| [54] | AraBERT, HistGradientBoostingClassifier | TF-IDF | Three questions with 0, 1 or 2 output | Arabic | 503 paragraphs | Accuracy | 93.16%, 96.55% |
| [55] | Sanad authentication fuzzy expert system | - | Sahih, Mudallas, Hasan, Matruk, ... | Arabic | 5,910 chain of narrators | Accuracy | 72.2% |
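To make the feature-based approaches in Table 2 more concrete, the sketch below computes TF-IDF weights over narrator trigrams for a toy set of chains. It is a minimal illustration in plain Python, not the pipeline of any cited study: the narrator labels are placeholders, treating each narrator as one token is an assumption, and the unsmoothed formula tf * log(N / df) is only one of several common TF-IDF variants.

```python
import math
from collections import Counter


def narrator_ngrams(chain, n=3):
    """Return consecutive narrator n-grams, preserving transmission order."""
    return [tuple(chain[i:i + n]) for i in range(len(chain) - n + 1)]


def tfidf(chains, n=3):
    """Compute one {ngram: weight} dict per chain using tf * log(N / df)."""
    docs = [Counter(narrator_ngrams(c, n)) for c in chains]
    # Document frequency: in how many chains does each n-gram occur?
    df = Counter(g for d in docs for g in d)
    total = len(docs)
    weights = []
    for d in docs:
        length = sum(d.values())
        weights.append(
            {g: (c / length) * math.log(total / df[g]) for g, c in d.items()}
        )
    return weights


# Placeholder chains; each letter stands for one (hypothetical) narrator.
chains = [
    ["A", "B", "C", "D"],
    ["A", "B", "C", "E"],
    ["F", "B", "C", "D"],
]
weights = tfidf(chains)
```

Trigrams that occur in a single chain receive a higher weight than trigrams shared across chains, which is the property a downstream classifier such as a linear SVC would exploit.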
3.2. Building and analysing hadith narrator networks
Viewing the entire corpus of hadith transmission as a large-scale social network has opened new avenues for analysis. In this paradigm, narrators are represented as nodes and the transmission of a hadith from one narrator to another is represented as a directed edge. This framework allows for the application of social network analysis (SNA) and graph theory to investigate the structural properties of hadith transmission networks [13], [46].
Early conceptual work proposed modeling narrator chains using graph-theoretic models like directed acyclic graphs (DAGs) to represent the flow of information [18]. Subsequent research focused on creating tools to automatically parse and visualize these chains. Azmi and Badia developed the "e-Narrator" and "iTree" systems, which used parsing techniques and domain-specific grammars (EBNF) to generate transmission trees from raw hadith text, achieving an 86.7% success rate on a dataset of 90 hadiths [27], [58], [24]. Other prototypes, like the chain of hadith narrators visualizer (CHN), provided graphical interfaces for students to explore narrator connections [19].
More advanced analyses have applied formal SNA metrics to these narrator graphs. Studies on Sahih Bukhari and Sahih Muslim have used centrality measures (degree, betweenness, PageRank) to identify the most influential narrators in the network, such as Abu Hurayra, Anas bin Malik, and Az-Zuhri, who acted as major hubs in the propagation of knowledge [22], [10], [49]. These findings quantitatively confirm knowledge previously established by classical scholars. Other researchers have used algorithms like SPADE to discover frequent sequential patterns in narrator chains, revealing dominant teacher-student relationships and common transmission pathways [48].
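The graph model described above can be sketched with the standard library alone. In the example below, the chains and narrator labels are placeholders, each chain is listed from the latest narrator back to the earliest source, and edges point from the receiving narrator to the source (the reverse of the transmission direction, an assumption made so that rank accumulates at sources, as in a citation network). The PageRank routine is a simplified power iteration with an assumed damping factor of 0.85, not the implementation used in the cited studies.

```python
from collections import defaultdict


def build_graph(chains):
    """Build a directed graph; an edge u -> v means u received from v."""
    edges = defaultdict(set)
    nodes = set()
    for chain in chains:
        nodes.update(chain)
        for receiver, source in zip(chain, chain[1:]):
            edges[receiver].add(source)
    return nodes, dict(edges)


def in_degree(nodes, edges):
    """Number of distinct narrators who received directly from each node."""
    deg = {v: 0 for v in nodes}
    for targets in edges.values():
        for v in targets:
            deg[v] += 1
    return deg


def pagerank(nodes, edges, damping=0.85, iterations=50):
    """Simplified PageRank; rank of dangling nodes is spread uniformly."""
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not edges.get(v))
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for u in nodes:
            targets = edges.get(u, ())
            for v in targets:
                new[v] += damping * rank[u] / len(targets)
        rank = new
    return rank


# Placeholder chains: latest narrator first, earliest source last.
chains = [["C1", "B1", "A"], ["C2", "B1", "A"], ["C3", "B2", "A"]]
nodes, edges = build_graph(chains)
degrees = in_degree(nodes, edges)
ranks = pagerank(nodes, edges)
```

In this toy network the shared source "A" obtains the highest PageRank, which mirrors, in miniature, how centrality measures surface hub narrators in the real collections.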
A significant recent trend is the use of graph embedding techniques. Mghari et al. [4] introduced Narrator2Vec, a method that learns vector representations (embeddings) of narrators based on their position in the network. These embeddings can be used for tasks such as predicting missing links in a sanad (link prediction), clustering narrators by generation or scholarly affiliation, and identifying narrator similarity. When tested on the All Hadith Corpus, Narrator2Vec achieved 94% accuracy in top-10 narrator prediction tasks. These graph-based approaches provide invaluable macroscopic views of the hadith transmission landscape, highlighting its scale-free nature and identifying key communities and authorities within it. A summary of these analytical approaches is shown in Table 3.

Table 3. Exploring analytical approaches for hadith using sanads
| Ref. | Approach | Preprocessing | Classes | Language | Data source | Metric | Result |
|---|---|---|---|---|---|---|---|
| [18] | DAG | - | Sanad representation | Arabic | - | - | - |
| [24]-[27] | EBNF | Parse hadith content | Graph, visualization | Arabic | Bukhari | Accuracy | 86.7% |
| [19] | Network graph | - | Graph, visualization | Malay | Nawawi's 40 hadiths | - | - |
| [22] | SNA | Extract narrators, remove duplicates | Narrative network analysis | Arabic | Bukhari 5 chapters | Centrality | - |
| [25], [35] | Network graph | Matn and stop words removal | Graph | Malay | 18/30 Hadiths from 9-books/Bukhari | Accuracy | 60% |
| [10] | SNA | - | Narrative network | Arabic | Bukhari | Centrality | - |
| [48] | SPADE | Transform, clean, format | Sanad analysis | Indonesian | Bukhari | - | - |
| [49] | SNA | - | Narrative network analysis | English | Muslim | SNA measures | - |
| [4] | Word embeddings | - | Narrators analysis | Arabic | All hadith corpus | Top-k accuracies | 68-94% |

3.3. Identifying and extracting key components from hadith text
The automated extraction of key information from hadith text, particularly separating the sanad from the Matn and identifying individual narrator names within the sanad, is a fundamental task for building structured datasets. This is a challenging NLP problem due to the linguistic characteristics of classical Arabic, the
lack of standard punctuation, and the high variability in how narrators' names are cited [50].
Early methods used rule-based approaches and finite state transducers (FST). Harrag et al. [2], [3] developed an FST-based entity extractor for Sahih Al-Bukhari, but it struggled with the sanad entity itself, achieving a low F1-score of 33%. Unsupervised tools like the SALAH Project used regular expressions to segment hadith texts, achieving a high effectiveness rate of 97.7% but with limitations on handling complex chains [5]. Other studies employed complex graph transformations and morphological analysis to extract narrator relationships, reporting high precision and recall above 97% for segmentation tasks [6], [7].
The field has seen significant improvement with the adoption of ML-based named entity recognition (NER). Siddiqui et al. [29] trained classifiers like Naïve Bayes, DT, and K-nearest neighbors (k-NN) on an annotated corpus to extract narrator names, achieving 90% precision. Najeeb [46], [50] introduced approaches using genetic algorithms (GA) and hidden Markov models (HMM) for sanad processing, reaching accuracies of 81.44% and 86% respectively. N-gram models combined with rule-based filtering have also been used to extract Arabic person names, yielding an F-measure of 70.76% [20].
The most significant recent advancements have come from transformer-based models [29]. These models leverage contextual embeddings to achieve state-of-the-art performance. For instance, a study using a semi-supervised BERT model with a feed-forward neural network for NER on Indonesian hadith texts reported an F1-score of 99.27% on a dataset from Bukhari [51].
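As a point of comparison with these learned models, a rule-based baseline of the kind discussed earlier in this subsection can be sketched with a regular expression over transmission trigger words. The example is illustrative only: the transliterated trigger list, the sample string with placeholder narrator names, and the `split_sanad` helper are all assumptions, not the rules of any cited system.

```python
import re

# Hypothetical, transliterated transmission formulas used as delimiters.
TRIGGERS = ["haddathana", "akhbarana", "an", "qala"]
TRIGGER_RE = re.compile(r"\b(?:" + "|".join(TRIGGERS) + r")\b")


def split_sanad(sanad):
    """Split a transliterated sanad into narrator names at trigger words."""
    parts = TRIGGER_RE.split(sanad)
    return [p.strip() for p in parts if p.strip()]


# Placeholder chain; the narrator names are not real data.
sample = ("haddathana Narrator A qala akhbarana Narrator B "
          "an Narrator C an Narrator D")
narrators = split_sanad(sample)
print(narrators)
```

A fixed trigger list of this kind breaks as soon as a formula is missing from the list or a name contains a trigger word, which is one reason the field moved towards statistical and transformer-based NER.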
Another line of research focused on name disambiguation using word sense disambiguation (WSD) techniques combined with a k-NN classifier, reporting an F1-score of 99% on a dataset from Sahih Bukhari, demonstrating a powerful method to resolve narrator ambiguity [59]. These DL techniques are proving indispensable for accurately parsing and structuring hadith texts at scale, a prerequisite for all other higher-level analyses. Table 4 provides an overview of various studies in this category.

Table 4. Studies exploring approaches and techniques for named entity extraction
| Ref. | Approach | Preprocessing | Extracted entities | Language | Data source | Metric | Result |
|---|---|---|---|---|---|---|---|
| [2], [3] | FST-based | - | Sanad extraction | Arabic | Bukhari | F1-score | 33% |
| [5] | RE | - | Sanad | Arabic/English | Bukhari | Accuracy | 97.7% |
| [6], [7] | AMF, FSM, GT | - | Sanad, narrators | Arabic | Ibn Hanbal | F-score | 98% |
| [29] | NER, NB, k-NN, DT | Normalization, stemming | Sanad extraction | Arabic | Bukhari, Musnad Ahmed | Precision | 90% |
| [46] | GA | Sanad/Matn manually divided | Sanad extraction | Arabic | Muslim | Accuracy | 81.44% |
| [50] | HMM, Gazetteer | Sanad/Matn manually divided | Sanad extraction | Arabic | Muslim | Accuracy | 86% |
| [20] | N-gram | - | Narrators extraction | Arabic | 6 books | F-measure | 65.11% |
| [31] | POS tags, rule-based | Punctuation removal | Sanad, narrators | Malay | 150/1000 from Bukhari | Accuracy | 95.3% |
| [36] | Rule-based, statistical | Tokenization, stemming | Narrators | Arabic | Bukhari (prayer) | F1-score | 76% |
| [37] | CRF, FST | - | - | Urdu | Bukhari | F1-score | 92.41% |
| [40] | NER, SVM | Symbol removal, tokenization | Narrators extraction | Indonesian | 200 from 9 books | F1-score | 90% |
| [41], [47] | Rule-based, NB | Diacritics/punctuation removal | Sanad extraction | Arabic/English | 6 books | Accuracy | 92.5% |
| [42] | HMM | Punctuation removal, tokenization | Names extraction | Indonesian | 9 books | F1-score | 86% |
| [51] | NER, BERT-based | Tokenization | Narrators extraction | Indonesian | 102 from Bukhari | F1-score | 99.63% |

3.4. Construction and development of sanad datasets and ontologies
The foundation for all computational hadith research is the availability of high-quality, structured digital resources. A significant area of work, therefore, involves the construction of comprehensive datasets and formal ontologies to support reproducible research and advanced applications.
Early work focused on creating structured databases from raw text. Bimba et al. [34] developed a
web-based tool with a relational MySQL database to compile authentic hadith in Malay, linking text to reporter information. Other researchers focused on building lexicons using formalisms like head-driven phrase structure grammar (HPSG) and creating XML-based databases of narrators and their biographical details, often drawing from classical sources like Ibn Hajar's work [32], [33], [60]. The text encoding initiative (TEI) standard was also adopted to normalize and encode hadith texts, with studies developing trigger-word dictionaries to segment Isnad from Matn, achieving a high F-measure of 96% [38], [43], [61].
A major contribution to the field has been the development of large-scale, public datasets. Mahmood et al. [39] created a multilingual repository of hadith content extracted from online sources using regular expressions, achieving 100% accuracy for some books. Other projects have focused on creating specialized corpora, such as a non-authentic hadith (NAH) corpus to train models to detect fabricated narrations [44]. Most significantly, recent efforts have produced large-scale datasets for narrator disambiguation, such as the AR-sanad 280K dataset, which contains 279,625 artificial sanads. Experiments using this dataset with an AraBERT model achieved a 92.9% micro F1 score, demonstrating the value of large-scale synthetic data for training robust models [52].
Alongside datasets, there is growing interest in developing formal ontologies for hadith. Ontologies provide a machine-readable representation of knowledge, defining concepts (e.g., narrator, hadith) and their relationships (e.g., 'narrates'). These semantic models facilitate advanced querying and logical inference. Dalloul and Baraka created an ontology-based Isnad judgment system that could automatically verify chain continuity based on narrator relationships, achieving 81% accuracy [30], [28].
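The idea of verifying chain continuity from narrator relationships can be illustrated without a full ontology. The sketch below stores hypothetical `narrates_from` relations as a plain set of (student, teacher) pairs and checks that every adjacent pair in a chain is attested. The relation name, the placeholder data, and the `first_break` helper are assumptions made for illustration, not the design of the cited system.

```python
def first_break(chain, narrates_from):
    """Return the index of the first unattested link, or None if continuous."""
    for i, (student, teacher) in enumerate(zip(chain, chain[1:])):
        if (student, teacher) not in narrates_from:
            return i
    return None


# Hypothetical relations: (student, teacher) pairs with attested transmission.
narrates_from = {
    ("Narrator A", "Narrator B"),
    ("Narrator B", "Narrator C"),
}

continuous = ["Narrator A", "Narrator B", "Narrator C"]
broken = ["Narrator A", "Narrator C"]

print(first_break(continuous, narrates_from))
print(first_break(broken, narrates_from))
```

An ontology adds typed classes and inference on top of such lookups, but the core continuity test remains a membership check on narrator-to-narrator relations.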
These resources are foundational for building the next generation of intelligent hadith analysis tools. An overview of these construction efforts is provided in Table 5.

Table 5. Studies on the construction and development of hadith-specific datasets and ontologies
| Ref. | Approach | Data type | Language | Data source | Metric | Result |
|---|---|---|---|---|---|---|
| [28], [30] | Sanad ontology | | Arabic | 6 books | Accuracy | 81% |
| [32], [33] | XML, HPSG, multi-agent | XML, dataset, Isnad tree | Arabic | Bukhari, Muslim | - | - |
| [34] | MySQL | Dataset | Malay | - | - | - |
| [38], [43] | TEI predefined XML tags | Dataset | Arabic | Bukhari | F-measure | 85-96% |
| [39] | Regular expression | Dataset, XML, CSV | Multilingual | Muslim and Bukhari | F-measure | 100% |
| [44] | - | Dataset, sanad | Arabic | 6 books | - | - |
| [52] | BERT-based (AraBERT) | Dataset, disambiguation | Arabic | 6 books | F1-score | 92.9% |
| [53] | RDF, knowledge graph, linked open data | SemanticHadith ontology, knowledge graph | Arabic, Urdu, English | Six hadith collections | - | - |

4. CONCLUSION
This comprehensive review has charted the significant progress in the application of computational methods to sanad-based hadith analysis. The field is rapidly maturing, moving from foundational rule-based systems to sophisticated DL and network science methodologies. The analysis of automated classification techniques shows a clear performance advantage for ML over static rules, with recent transformer-based models like BERT setting new benchmarks for authenticity assessment. The exploration of narrator networks through SNA has provided quantitative validation of classical hadith scholarship and offers powerful tools for visualizing and understanding the macro-structure of knowledge transmission in Islam. Concurrently, advancements in NLP have been critical in automating the foundational tasks of sanad segmentation and narrator entity recognition, making large-scale analysis feasible.
Finally, the development of large, publicly available datasets and formal ontologies is providing the essential infrastructure to fuel further research and ensure reproducibility. These advancements collectively move the body of scientific knowledge forward by demonstrating the immense potential of interdisciplinary collaboration between computer science and Islamic studies.
Despite this progress, challenges remain. Many developed datasets are still limited in scope, and the models trained on them may not generalize well across different hadith collections. The problem of narrator name disambiguation remains a significant hurdle, though recent graph-based and BERT-powered approaches
show great promise. The heavy reliance on regular expressions in some data extraction tasks can be brittle, and the creation of robust, adaptable NLP pipelines is an ongoing area of research. Looking forward, future work should focus on creating larger, more diverse, and standardized benchmark datasets that cover a wider range of hadith literature, including less-canonical works. There is great promise in exploring more advanced graph neural network (GNN) architectures for narrator network analysis and leveraging multimodal models that can analyze both sanad and Matn concurrently to create a more holistic authenticity assessment. The continued partnership between computer scientists and traditional hadith scholars is essential to ensure that these technological advancements are developed responsibly, rigorously, and in a way that genuinely supports and enhances our understanding of the rich heritage of Islamic traditions.

FUNDING INFORMATION
Authors state no funding involved.

AUTHOR CONTRIBUTIONS STATEMENT
This journal uses the Contributor Roles Taxonomy (CRediT) to recognize individual author contributions, reduce authorship disputes, and facilitate collaboration.

| Name of Author | C | M | So | Va | Fo | I | R | D | O | E | Vi | Su | P | Fu |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Abdelilah Mhamedi | | | | | | | | | | | | | | |
| Mohammed Mghari | | | | | | | | | | | | | | |
| Abdelaaziz El Hibaoui | | | | | | | | | | | | | | |

C: Conceptualization; M: Methodology; So: Software; Va: Validation; Fo: Formal Analysis; I: Investigation; R: Resources; D: Data Curation; O: Writing - Original Draft; E: Writing - Review & Editing; Vi: Visualization; Su: Supervision; P: Project Administration; Fu: Funding Acquisition

CONFLICT OF INTEREST STATEMENT
Authors state no conflict of interest.

DATA AVAILABILITY
Data availability is not applicable to this paper as no new data were created or analyzed in this study.

REFERENCES
[1] M. Ghazizadeh, M. H. Zahedi, M. Kahani, and B.
Minaei Bidgoli, “Fuzzy expert system in determining hadith validity,” in Advances in Computer and Information Sciences and Engineering, T. Sobh, Ed., Springer Netherlands, 2008, pp. 354–359, doi: 10.1007/978-1-4020-8741-7_64.
[2] F. Harrag, E. El-Qawasmeh, and A. M. Salman Al-Salman, “Extracting named entities from prophetic narration texts (Hadith),” in Communications in Computer and Information Science, vol. 180 CCIS, no. PART 2, J. M. Zain, W. M. B. Wan Mohd, and E. El-Qawasmeh, Eds., Springer Berlin Heidelberg, 2011, pp. 289–297, doi: 10.1007/978-3-642-22191-0_26.
[3] F. Harrag, “Text mining approach for knowledge extraction in Sahîh Al-Bukhari,” Computers in Human Behavior, vol. 30, pp. 558–566, 2014, doi: 10.1016/j.chb.2013.06.035.
[4] M. Mghari, O. Bouras, and A. El Hibaoui, “Narrator2Vec: An Efficient Narrator Representation in Hadith Literature Using Word Embedding,” Arabian Journal for Science and Engineering, vol. 49, no. 3, pp. 4479–4494, 2024, doi: 10.1007/s13369-023-08224-7.
[5] M. Boella, F. R. Romani, A. Al-Raies, C. Solimando, and G. Lancioni, “The SALAH project: Segmentation and linguistic analysis of hadīt Arabic texts,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 7097 LNCS, M. V. M. Salem, K. Shaalan, F. Oroumchian, A. Shakery, and H. Khelalfa, Eds., Springer Berlin Heidelberg, 2011, pp. 538–549, doi: 10.1007/978-3-642-25631-8_49.
[6] J. Makhlouta, F. Zaraket, and H. Harkous, “Arabic entity graph extraction using morphology, finite state machines, and graph transformations,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 7181 LNCS, no. PART 1, A. Gelbukh, Ed., Springer Berlin Heidelberg, 2012, pp. 297–310, doi: 10.1007/978-3-642-28604-9_25.
[7] F. Zaraket and J. Makhlouta, “Arabic cross-document NLP for the hadith and biography literature,” Proceedings of the 25th International Florida Artificial Intelligence Research Society Conference, FLAIRS-25, pp. 256–261, 2012.
[8] K. A. Aldhlan, A. M. Zeki, and A. M. Zeki, “Datamining and Islamic knowledge extraction: Alhadith as a knowledge resource,” in Proceeding of the 3rd International Conference on Information and Communication Technology for the Moslem World: ICT Connecting Cultures, ICT4M 2010, IEEE, 2010, pp. H-21–H-25, doi: 10.1109/ICT4M.2010.5971934.
[9] K. Aldhaln, A. Zeki, A. Zeki, and H. Alreshidi, “Improving knowledge extraction of Hadith classifier using decision tree algorithm,” in Proceedings - 2012 International Conference on Information Retrieval and Knowledge Management, CAMP'12, IEEE, 2012, pp. 148–152, doi: 10.1109/InfRKM.2012.6205024.
[10] T. Alam and J. Schneider, “Social Network Analysis of Hadith Narrators from Sahih Bukhari,” Proceedings of 2020 7th IEEE International Conference on Behavioural and Social Computing, BESC 2020, vol. abs/2102.02009, 2020, doi: 10.1109/BESC51023.2020.9348299.
[11] K. A. Aldhlan, A. M. Zeki, A. M. Zeki, and H. A. Alreshidi, “Novel mechanism to improve hadith classifier performance,” in Proceedings - 2012 International Conference on Advanced Computer Science Applications and Technologies, ACSAT 2012, IEEE, 2013, pp. 512–517, doi: 10.1109/ACSAT.2012.93.
[12] K. Bilal and S. Mohsin, “Muhadith: A cloud based distributed expert system for classification of Ahadith,” in Proceedings - 10th International Conference on Frontiers of Information Technology, FIT 2012, IEEE, 2012, pp. 73–78, doi: 10.1109/FIT.2012.22.
[13] A. M. Azmi and A. M. Alofaidly, “A novel method to automatically pass hukm on Hadith,” 5th International Conference on Arabic Language Processing (CITALA '14), 2014.
[14] M. M.
Najeeb, “Towards Innovative System for Hadith Isnad Processing,” International Journal of Computer Trends and Technology, vol. 18, no. 6, pp. 257–259, 2014, doi: 10.14445/22312803/ijctt-v18p154.
[15] M. M. A. Najeeb, “Towards a Deep Leaning-based Approach for Hadith Classification,” European Journal of Engineering and Technology Research, vol. 6, no. 3, pp. 9–15, 2021, doi: 10.24018/ejeng.2021.6.3.2378.
[16] M. Ghanem, A. Mouloudi, and M. Mourchid, “Classification of Hadiths using LVQ based on VSM Considering Words Order,” International Journal of Computer Applications, vol. 148, no. 4, pp. 25–28, 2016, doi: 10.5120/ijca2016911077.
[17] H. M. Abdelaal and H. A. Youness, “Hadith Classification using Machine Learning Techniques According to its Reliability,” Romanian Journal of Information Science and Technology, vol. 22, no. 3–4, pp. 259–271, 2019.
[18] U. Relational and S. I. Hyder, “Towards a Database Oriented Hadith Research Using Relational, Algorithmic and Data-Warehousing Techniques,” The Islamic Culture, Quarterly Journal of Shaikh Zayed Islamic Center for Islamic and Arabic Studies, vol. 19, no. March, p. 14, 2008.
[19] Z. Shukur, N. Fabil, J. Salim, and S. A. Noah, “Visualization of the hadith chain of narrators,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), H. B. Zaman, P. Robinson, M. Petrou, P. Olivier, T. K. Shih, S. Velastin, and I. Nyström, Eds., Springer, 2011, pp. 340–347, doi: 10.1007/978-3-642-25200-6_32.
[20] M. Alhawarat, “A domain-based approach to extract Arabic person names using N-grams and simple rules,” Asian Journal of Information Technology, vol. 14, no. 8, pp. 287–293, 2015, doi: 10.3923/ajit.2015.287.293.
[21] F. Haque, A. H. Orthy, and S.
Siddique, “Hadith Authenticity Prediction using Sentiment Analysis and Machine Learning,” in 14th IEEE International Conference on Application of Information and Communication Technologies, AICT 2020 - Proceedings, Institute of Electrical and Electronics Engineers Inc., 2020, pp. 1–6, doi: 10.1109/AICT50176.2020.9368569.
[22] M. A. Ahmad, “Towards the Analysis of Narrative Networks,” 2013. Accessed: Oct. 25, 2025. [Online]. Available: http://www.aurumahmad.com/assets/pdf/AhmadNarrative.pdf
[23] H. M. Abdelaal, A. M. Ahmed, W. Ghribi, and H. A. Y. Alansary, “Knowledge Discovery in the Hadith According to the Reliability and Memory of the Reporters Using Machine Learning Techniques,” IEEE Access, vol. 7, pp. 157741–157755, 2019, doi: 10.1109/ACCESS.2019.2944118.
[24] A. Azmi and N. AlBadia, “Mining and visualizing the narration tree of hadiths (Prophetic traditions),” Cross-Disciplinary Advances in Applied Natural Language Processing: Issues and Approaches, pp. 239–257, 2011, doi: 10.4018/978-1-61350-447-5.ch016.
[25] N. Alias, N. A. Rahman, N. K. Ismail, Z. M. Nor, and M. N. Alias, Searching Algorithm of Authentic Chain of Narrators’ in Shahih Bukhari Book, no. May. 2016.
[26] M. J. Page et al., “The PRISMA 2020 statement: An updated guideline for reporting systematic reviews,” BMJ, vol. 372, p. n71, 2021, doi: 10.1136/bmj.n71.
[27] A. M. Azmi and N. Bin Badia, “e-Narrator - an application for creating an ontology of Hadiths narration tree semantically and graphically,” Arabian Journal for Science and Engineering, vol. 35, no. 2 C, pp. 51–68, 2010.
[28] Y. M. Dalloul, “An Ontology-Based Approach to Support the Process of Judging Hadith Isnad,” in 2012 International Conference on Advanced Computer Science Applications and Technologies, 2013, pp. 1–108.
[29] M. A. Siddiqui, M. E. Saleh, and A. A.
Bagais, “Extraction and Visualization of the Chain of Narrators from Hadiths using Named Entity Recognition and Classification,” International Journal of Computational Linguistics Research, vol. 5, no. 1, pp. 14–25, 2014.
[30] R. S. Baraka and Y. M. Dalloul, “Building Hadith Ontology to Support the Authenticity of Isnad,” International Journal on Islamic Applications in Computer Science And Technology, vol. 2, no. 1, pp. 25–39, 2014.
[31] N. Abd Rahman, N. Alias, N. K. Ismail, Z. Bin M. Nor, and M. N. B. Alias, “An identification of authentic narrator’s name features in Malay hadith texts,” in ICOS 2015 - 2015 IEEE Conference on Open Systems, IEEE, 2016, pp. 79–84, doi: 10.1109/ICOS.2015.7377282.
[32] M. Najeeb, A. Abdelkader, M. Al-Zghoul, and A. Osman, “A Lexicon for Hadith Science Based on a Corpus,” International Journal of Computer Science and Information Technologies, vol. 6, no. 2, pp. 1336–1340, 2015.
[33] M. M. Najeeb, “Multi-agent system for hadith processing,” International Journal of Software Engineering and its Applications, vol. 9, no. 9, pp. 153–166, 2015, doi: 10.14257/ijseia.2015.9.9.13.
[34] A. Bimba, M. A. Ismail, N. Idris, S. Jaafar, and R. Mahmud, “Towards Enhancing the Compilation of Al-Hadith Text in Malay,” International Proceedings of Economics Development and Research, vol. 83, no. March. 2015.
[35] N. Alias, N. A. Rahman, N. K. Ismail, Z. M. Nor, and M. N. Alias, “Graph-based text representation for Malay translated hadith text,” in 2016 3rd International Conference on Information Retrieval and Knowledge Management, CAMP 2016 - Conference Proceedings, IEEE, 2017, pp. 60–66, doi: 10.1109/INFRKM.2016.7806336.