International Journal of Electrical and Computer Engineering (IJECE)
Vol. 11, No. 3, June 2021, pp. 2327-2334
ISSN: 2088-8708, DOI: 10.11591/ijece.v11i3.pp2327-2334

ATAR: Attention-based LSTM for Arabizi transliteration

Bashar Talafha 1, Analle Abuammar 2, Mahmoud Al-Ayyoub 3
1,3 Jordan University of Science and Technology, Jordan
2 University of Southampton, UK

Article Info
Article history:
Received Apr 4, 2020
Revised Sep 30, 2020
Accepted Oct 9, 2020

Keywords:
Arabizi transliteration
Attention
Benchmark dataset
LSTM
Seq2seq

ABSTRACT
A non-standard romanization of Arabic script, known as Arabizi, is widely used in Arabic online and SMS/chat communities. However, since state-of-the-art tools and applications for Arabic NLP expect Arabic to be written in Arabic script, handling content written in Arabizi requires special attention, either by building customized tools or by transliterating it into Arabic script. The latter approach is the more common one, and this work presents two significant contributions in this direction. The first is to collect and publicly release the first large-scale "Arabizi to Arabic script" parallel corpus focusing on the Jordanian dialect and consisting of more than 25k pairs carefully created and inspected by native speakers to ensure the highest quality. Second, we present ATAR, an ATtention-based LSTM model for ARabizi transliteration. Training and testing this model on our dataset yields impressive accuracy (79%) and BLEU score (88.49).

This is an open access article under the CC BY-SA license.

Corresponding Author:
Mahmoud Al-Ayyoub
Department of Computer Science
Jordan University of Science and Technology
Irbid, Jordan
Email: maalshbool@just.edu.jo

1. INTRODUCTION
As stated by many researchers [1-3], social media users express themselves in ways different from the standard format.
Social media content exhibits frequent use of informal vocabulary, non-standard abbreviations, typos, and many idiosyncrasies such as repeating letters for emphasis and writing out non-linguistic content like emojis and sound reactions [4-6]. For several reasons, these issues are more complicated for Arabic content. Examples of these reasons include the prevalent use of dialectal Arabic (DA) and its grave deviations from modern standard Arabic (MSA) [7]. Another reason is the common use of a non-standard romanized way of writing Arabic words known as Arabizi. There are many reasons for the widespread use of Arabizi, such as the lack of support for Arabic script on some devices/platforms, the existence of some difficulties in using Arabic script, and the relative ease of code-switching between Arabizi and English or French compared with Arabic script. Even though Arabizi is not known to all social media users, it is common enough to warrant studies focusing solely on it [8-17].

For most state-of-the-art tools and applications for natural language processing (NLP) and information retrieval (IR) of Arabic text, the expected input is Arabic words written in Arabic script [18-20]. Therefore, there is an obvious need for a system to automatically transliterate content written in Arabizi into Arabic orthography [2]. Previous studies [2, 9, 21-30] presented tools and resources for this problem. However, to the best of our knowledge, very few of them [27-30] followed deep learning approaches such as recurrent neural

Journal homepage: http://ijece.iaescore.com
networks (RNN) and its extensions such as long short-term memory (LSTM) [31]. The others mostly follow character-level rule-based approaches.

In this work, we address the problem of Arabizi transliteration by presenting ATAR, an ATtention-based encoder-decoder model for ARabizi transliteration. This novel neural network-based approach follows the celebrated attention-based encoder-decoder model of [32]. To evaluate ATAR, we present a "first of its kind" dataset consisting of 21.5K words from the Jordanian dialect.

The rest of this paper is organized as follows: the following section gives a high-level view of the related work, while section 3 presents our ATAR model and discusses its details. Section 4 discusses the dataset we create and section 5 presents our evaluation of the proposed model on the collected dataset. Finally, the paper is concluded in section 6.

2. RELATED WORK
Due to the importance of the Arabizi-Arabic script transliteration problem, several companies, such as Google and Microsoft, have invested money and effort into developing tools for this problem. Examples of such tools include: Google Ta3reeb (http://www.google.com/ta3reeb); Microsoft Maren (https://www.microsoft.com/en-us/download/details.aspx?id=20530); Facebook's automatic translation services (https://engineering.fb.com/ml-applications/expanding-automatic-machine-translation-to-more-languages/); Rosette Chat Translator (https://www.basistech.com/text-analytics/rosette/chat-translator/); and Yamli (https://www.yamli.com/). However, these tools are mostly closed-source, and very little is known about the approaches they follow or the resources they employ. On the other hand, the effort within the Arabic NLP research community to address the Arabizi-Arabic script transliteration problem has been rather shy.
The existing resources are limited and not publicly available, and the proposed approaches do not follow the new and exciting approaches in the field of sequence learning [33]. Existing work on Arabizi transliteration, such as [2, 22-24, 34, 35], followed basic approaches that used character-to-character mappings in order to generate lattices of multiple alternative words. A further selection from these words is done using language models. The approach proposed by [36] combines a rule-based model and a discriminative model based on conditional random fields (CRF) for transliterating Tunisian dialect Arabizi texts to standard Arabic. As for the datasets used, only that of [23] is reported to be publicly available [2]; however, it is very small, with only 2.2K word pairs. It was used in the development of [24]'s system in addition to 6,300 Arabic-English proper name pairs from [37]. The reported accuracy of [24]'s system is 69.4% and it was later used by [2].

Another interesting effort in creating useful resources for the Arabizi transliteration problem is the work of Bies et al. [2]. The authors discussed how the Linguistic Data Consortium (LDC) collected and annotated a huge parallel corpus of Arabizi content and its Arabic script counterpart as part of the DARPA Broad Operational Language Translation (BOLT) program (Phase 2). The corpus consisted of more than 408K words and mainly focused on the Egyptian dialect.

Few papers [27-30] discussed the use of deep learning for the problem of Arabizi transliteration. In [27, 28], the authors claimed to use a standard RNN encoder-decoder model for transliterating sentences written in the Algerian dialect, but they did not provide any details of the model. Moreover, the dataset they considered is rather small (1.3k sentences). In a more detailed work, Younes et al. [29] used a standard RNN encoder-decoder model for transliterating words in the Tunisian dialect.
Their dataset was relatively big, with 45.6k word pairs. In a follow-up work [30], they expanded their work and discussed how to adapt three well-known models in machine translation for the problem of transliterating the Tunisian dialect. The first one was a CRF, while the second one was a bidirectional RNN with long short-term memory cells (BLSTM). As for the third one, it was a BLSTM with a CRF decoder. The results show the superiority of the latter approach over the former two approaches.

Transliteration systems have been proposed for many languages other than Arabic. However, such systems are usually designed to transliterate between two closely related languages. Examples include the work of Musleh et al. [38] on transliterating Urdu to Hindi, the work of Nakov et al. [39] on transliterating Portuguese and Italian to look like Spanish, and the work of Nakov et al. [40] on transliterating Macedonian to Bulgarian.
3. ATAR: ATTENTION-BASED LSTM FOR ARABIZI TRANSLITERATION
Over the past decade, deep learning approaches have made a ground-breaking impact on many fields such as NLP, image processing, and computer vision [41-44]. A particularly interesting and challenging set of problems, known as sequence learning problems, has been heavily studied by deep learning researchers. A special kind of neural network, known as the recurrent neural network (RNN), has been shown to perform very well for many sequence learning problems in natural language understanding (NLU) and natural language generation (NLG). However, RNNs suffer from some issues like the vanishing gradient problem. To address this problem, Hochreiter and Schmidhuber [31] proposed to equip RNNs with memory cells, creating what they called LSTM networks.

For sequence-to-sequence problems (like the one we have at hand), a general approach known as the encoder-decoder approach was found to be very successful. The approach is based on the idea of learning efficient representations of the input using an RNN (or LSTM) as an "encoder network" and using another RNN (or LSTM) as a "decoder network" to take this feature representation as input, process it to make its decision, and produce an output (https://www.quora.com/What-is-an-Encoder-Decoder-in-Deep-Learning). In the rest of this section, we present the details of our attention-based LSTM model for Arabizi transliteration, which we call ATAR.

3.1. Model architecture
Our transliteration model is inspired by the attentional sequence-to-sequence (seq2seq) model proposed by [32], which is based on the encoder-decoder architecture, as shown in Figure 1.

Figure 1.
Illustration of the sequence-to-sequence architecture based on LSTM with the attention mechanism

The seq2seq architecture consists of an RNN encoder that learns representations of an input sequence X = {x_1, x_2, ..., x_n} of varying length and an RNN decoder that reads the hidden representation produced by the encoder and generates an output sequence Y = {y_1, y_2, ..., y_m} of varying length. The model takes its input from the embedding layer, which maps a one-hot encoding vector of vocabulary size (in our case, the number of letters: 47 different letters in Arabizi and 36 in Arabic) to a fixed-size dense vector that represents the semantic features of the input letter. It is worth mentioning that there is no <UNK> token in our case because each word (i.e., sequence) is a combination of a limited set of predefined letters. In our architecture, each unit in the encoder and decoder is an LSTM cell, which solves the problem of vanishing gradients with its memory cells [31].

Instead of relying on one thought vector from the encoder, many researchers [32, 45] proposed the encoder-decoder architecture with attention. The idea behind the attention mechanism is to link each time step
of the decoder with the most "convenient" time step(s) of the encoder input sequence. This is done by utilizing the idea of a global attentional model, which takes all the hidden states of the encoder h_s and the current target state h_t into consideration to calculate the attention score. In this paper, the dot product function is used to calculate the attention score:

    score(h_t, h_s) = h_t^T h_s

Following the previous step, the alignment vector a_{ts} is computed for each state by applying a softmax function to normalize all scores, producing a probability distribution based on the target state:

    a_{ts} = exp(score(h_t, h_s)) / Σ_{s'} exp(score(h_t, h_{s'}))

The decoder then computes a global context vector c_t as a weighted average, based on the alignment vector a_t, over all the source states:

    c_t = Σ_s a_{ts} h_s

Finally, the decoder takes the context vector as an additional input vector at the next time step.

4. DATASET
We use Arabizi-Arabic script parallel words in order to perform our Arabizi transliteration experiments. Due to the lack of such available parallel data, we crawled Arabizi data written in the Jordanian dialect from different resources, such as Twitter, Facebook, and ASK. These crawled words are regularly used on a daily basis. We were able to collect 21.5K unique Arabizi words, which were then transliterated into the Jordanian dialect using only Arabic letters. A group of native speakers validated the parallel data by correcting any spelling mistakes, removing redundant letters, and omitting any unneeded special characters. One of the contributions of this work is to make this "first of its kind" dataset publicly available at https://github.com/bashartalafha/Arabizi-Transliteration. Table 1 shows samples of our parallel data.
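The attention computation described in section 3.1 can be sketched in a few lines of NumPy (a minimal illustration for a single decoder step; the array shapes and variable names here are ours, not taken from the released implementation):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def dot_attention(h_t, H_s):
    """Luong-style dot attention for one decoder time step.

    h_t : (d,)   current decoder (target) hidden state
    H_s : (n, d) all encoder (source) hidden states
    Returns the alignment vector a_t (n,) and the context vector c_t (d,).
    """
    scores = H_s @ h_t     # score(h_t, h_s) = h_t^T h_s for every source state s
    a_t = softmax(scores)  # alignment vector: normalized scores over source states
    c_t = a_t @ H_s        # context vector: weighted average of encoder states
    return a_t, c_t

rng = np.random.default_rng(0)
H_s = rng.normal(size=(7, 16))  # e.g., 7 encoder time steps, hidden size 16
h_t = rng.normal(size=16)
a_t, c_t = dot_attention(h_t, H_s)
assert np.isclose(a_t.sum(), 1.0) and c_t.shape == (16,)
```

As described above, the resulting context vector c_t is then fed to the decoder together with its next input.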
The average length of the collected words is about 5 letters per word, with a maximum word length of 12 letters and a minimum of 2 letters.

Table 1. Examples of our parallel corpus

It is worth mentioning that the same word in Arabizi could have different representations in the Jordanian dialect, since not all people would write it in the same way, but still they are all correct. Table 2 shows a few such examples. This issue was faced by earlier work on Arabizi transliteration, such as [2, 9], and is discussed in detail therein. As stated by these researchers, such cases could penalize the model and give it a lower score, considering that some transliterations are right but the reference is different.

Table 2. Examples with different representations
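The scoring issue raised by Table 2 can be made concrete: with a single reference per word, an exact-match metric penalizes a prediction that is a perfectly valid alternative spelling, while scoring against the full set of valid alternatives does not. A toy sketch (the words and alternatives below are hypothetical placeholders, not items from our corpus):

```python
def exact_match_accuracy(predictions, references):
    """Accuracy against a single reference transliteration per word."""
    hits = sum(p == r for p, r in zip(predictions, references))
    return hits / len(predictions)

def multi_reference_accuracy(predictions, reference_sets):
    """Accuracy where each word may have several valid transliterations."""
    hits = sum(p in refs for p, refs in zip(predictions, reference_sets))
    return hits / len(predictions)

# Hypothetical example: the second prediction is a valid alternative
# spelling that the single-reference metric counts as wrong.
preds = ["kitab", "ktab"]
single_refs = ["kitab", "kitab"]
ref_sets = [{"kitab", "ktab"}, {"kitab", "ktab"}]

assert exact_match_accuracy(preds, single_refs) == 0.5
assert multi_reference_accuracy(preds, ref_sets) == 1.0
```

This is why a model can produce a correct transliteration and still be scored as wrong when the reference happens to use a different, equally valid spelling.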
5. EXPERIMENTS AND EVALUATION
To evaluate the performance of our proposed model, we implement it using TensorFlow (we select TensorFlow for its efficiency and ease of use; for a comparison of different deep learning frameworks, interested readers are directed to [46]), and perform several experiments using our dataset. After shuffling and lowercasing the data, we use the first 80% of the dataset as the training set, the next 10% as the validation set, and the remaining 10% as the testing set. As for the evaluation metrics, we use the two most common measures for the Arabizi transliteration task: accuracy and the bilingual evaluation understudy (BLEU) [47]. Finally, to aid the reproducibility of our results, both the dataset and the model are made publicly available at https://github.com/bashartalafha/Arabizi-Transliteration.

Using an attentional encoder-decoder sequence-to-sequence translation model, we have to worry about the many hyperparameters that can affect its performance. This issue is so important that complete studies have been dedicated to it, such as [48], which reported the use of more than 250K GPU hours for experimentation. For our work, we use the work of Britz et al. [48] as well as Ruder's blog (http://ruder.io/deep-learning-nlp-best-practices/) and Brownlee's blog (https://machinelearningmastery.com/configure-encoder-decoder-model-neural-machine-translation/) to guide our search for the best hyperparameter values. The ones that give the best performance are listed in Table 3. For this configuration, the accuracy is 79% and the BLEU score is 88.49.

Table 3.
The values of our model's hyperparameters that give the best performance

Hyperparameter      Value
LEARNING RATE       0.001
BATCH SIZE          64
HIDDEN NODES        256
NUMBER OF LAYERS    1
EMB SIZE            50
EPOCHS              30
LOSS                categorical crossentropy
OPT                 adam
BI                  No
DROPOUT             0.2

ATAR does achieve good results. However, it does have its limitations, such as the lack of support for the various Arabic dialects. To address this, one might benefit from existing multi-dialect parallel datasets [49-54] or build new ones (perhaps by benefiting from unsupervised approaches for dialect translation [7]). Another issue that can be addressed before adopting ATAR in real-life scenarios is trying to increase the model's accuracy. This can be done either by considering other sequence-to-sequence models, such as Facebook's convolutional sequence-to-sequence model [55] and Google's attention-only Transformer model [56], or by combining it with a neural diacritization model [57, 58].

6. CONCLUSION
In this paper, we addressed the Arabizi transliteration problem. This work has two significant contributions to this problem. The first is to collect and publicly distribute the first large-scale Arabizi-Arabic script parallel corpus focusing on the Jordanian dialect and consisting of more than 25k pairs carefully created and inspected by native speakers to ensure the highest quality. As the second contribution, we presented one of the first detailed and reproducible efforts to employ the celebrated attention-based seq2seq model for Arabizi transliteration. The presented model, which we called ATAR, performed very well in the experiments we conducted. It reached an impressive level with an accuracy of 79% and a BLEU score of 88.49. Future directions include experimenting with other sequence-to-sequence models, such as Facebook's convolutional sequence-to-sequence model and Google's attention-only Transformer model.
We are also thinking of ways to expand our work to other Arabic dialects. Finally, we will explore the generation of more accurate MSA text from the transliteration by looking into combining our model with a neural diacritization model.
ACKNOWLEDGEMENT
The authors would like to thank the Deanship of Research at the Jordan University of Science and Technology for supporting this work (through Grant #20190180). The authors would also like to thank Nesreen Al-Qasem and Areen Bany Salim for their efforts in creating the dataset.

REFERENCES
[1] N. Y. Habash, "Introduction to Arabic natural language processing," Synthesis Lectures on Human Language Technologies, vol. 3, no. 1, pp. 1-187, 2010.
[2] A. Bies, Z. Song, M. Maamouri, S. Grimes, H. Lee, J. Wright, S. Strassel, N. Habash, R. Eskander, and O. Rambow, "Transliteration of Arabizi into Arabic orthography: Developing a parallel annotated Arabizi-Arabic script SMS/chat corpus," Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing (ANLP), 2014, pp. 93-103.
[3] W. A. Hussien, Y. M. Tashtoush, M. Al-Ayyoub, and M. N. Al-Kabi, "Are emoticons good enough to train emotion classifiers of Arabic tweets?" 7th International Conference on Computer Science and Information Technology (CSIT), 2016, pp. 1-6.
[4] A. I. Alharbi and M. Lee, "Combining character and word embeddings for affect in Arabic informal social media microblogs," International Conference on Applications of Natural Language to Information Systems, Springer, 2020, pp. 213-224.
[5] W. Hussien, M. Al-Ayyoub, Y. Tashtoush, and M. Al-Kabi, "On the use of emojis to train emotion classifiers," arXiv preprint arXiv:1902.08906, 2019.
[6] K. A. Kwaik, S. Chatzikyriakidis, S. Dobnik, M. Saad, and R. Johansson, "An Arabic tweets sentiment analysis dataset (ATSAD) using distant supervision and self training," Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection, 2020, pp. 1-8.
[7] W. Farhan, B. Talafha, A. Abuammar, R. Jaikat, M. Al-Ayyoub, A. B. Tarakji, and A.
Toma, "Unsupervised dialectal neural machine translation," Information Processing and Management, vol. 57, no. 3, 2020.
[8] J. May, Y. Benjira, and A. Echihabi, "An Arabizi-English social media statistical machine translation system," Proceedings of the 11th Conference of the Association for Machine Translation in the Americas, 2014, pp. 329-341.
[9] M. van der Wees, A. Bisazza, and C. Monz, "A simple but effective approach to improve Arabizi-to-English statistical machine translation," Proceedings of the 2nd Workshop on Noisy User-generated Text (WNUT), 2016, pp. 43-50.
[10] R. M. Duwairi, M. Alfaqeh, M. Wardat, and A. Alrabadi, "Sentiment analysis for Arabizi text," 2016 7th International Conference on Information and Communication Systems (ICICS), 2016, pp. 127-132.
[11] A. M. Abd Al-Aziz, M. Gheith, and A. S. E. Ahmed, "Toward building Arabizi sentiment lexicon based on orthographic variants identification," The 2nd International Conference on Arabic Computational Linguistics (ACLing), 2016.
[12] I. Guellil, A. Adeel, F. Azouaou, F. Benali, A.-e. Hachani, and A. Hussain, "Arabizi sentiment analysis based on transliteration and automatic corpus annotation," Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, 2018, pp. 335-341.
[13] I. Guellil, F. Azouaou, F. Benali, A. E. Hachani, and M. Mendoza, "The role of transliteration in the process of Arabizi translation/sentiment analysis," Recent Advances in NLP: The Case of Arabic Language, Springer, 2020, pp. 101-128.
[14] T. Tobaili, M. Fernandez, H. Alani, S. Sharafeddine, H. Hajj, and G. Glavas, "Senzi: A sentiment analysis lexicon for the latinised Arabic (Arabizi)," Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), 2019, pp. 1203-1211.
[15] F. Aqlan, X. Fan, A. Alqwbani, and A.
Al-Mansoub, "Arabic-Chinese neural machine translation: Romanized Arabic as subword unit for Arabic-sourced translation," IEEE Access, vol. 7, pp. 133122-133135, 2019.
[16] E. Gugliotta and M. Dinarelli, "TArC: Incrementally and semi-automatically collecting a Tunisian Arabish corpus," arXiv preprint arXiv:2003.09520, 2020.
[17] M. Alkhatib and K. Shaalan, "Boosting Arabic named entity recognition transliteration with deep learning," The Thirty-Third International Flairs Conference, 2020.
[18] I. El Bazi and N. Laachfoubi, "Arabic named entity recognition using deep learning approach," International Journal of Electrical and Computer Engineering, vol. 9, no. 3, pp. 2025-2032, 2019.
[19] H. G. Hassan, H. M. A. Bakr, and B. E. Ziedan, "A framework for Arabic concept-level sentiment analysis using SenticNet," International Journal of Electrical and Computer Engineering, vol. 8, no. 5, pp. 4015-4022, 2018.
[20] M. A. Ahmed, R. A. Hasan, A. H. Ali, and M. A. Mohammed, "The classification of the modern Arabic poetry using machine learning," TELKOMNIKA Telecommunication, Computing, Electronics and Control, vol. 17, no. 5, pp. 2667-2674, 2019.
[21] K. Shaalan, H. Bakr, and I. Ziedan, "Transferring Egyptian colloquial dialect into modern standard Arabic," International Conference on Recent Advances in Natural Language Processing (RANLP-2007), Borovets, Bulgaria, 2007, pp. 525-529.
[22] A. Chalabi and H. Gerges, "Romanized Arabic transliteration," Proceedings of the Second Workshop on Advances in Text Input Methods, 2012, pp. 89-96.
[23] K. Darwish, "Arabizi detection and conversion to Arabic," arXiv preprint arXiv:1306.6755, 2013.
[24] M. Al-Badrashiny, R. Eskander, N. Habash, and O. Rambow, "Automatic transliteration of romanized dialectal Arabic," Proceedings of the Eighteenth Conference on Computational Natural Language Learning, 2014, pp. 30-38.
[25] R. Eskander, M. Al-Badrashiny, N. Habash, and O. Rambow, "Foreign words and the automatic processing of Arabic social media text written in Roman script," Proceedings of The First Workshop on Computational Approaches to Code Switching, 2014, pp. 1-12.
[26] N. Altrabsheh, M. El-Masri, and H. Mansour, "Proposed novel algorithm for transliterating Arabic terms into Arabizi," Research in Computer Science, 2017.
[27] I. Guellil, F. Azouaou, M. Abbas, and S.
Fatiha, "Arabizi transliteration of Algerian Arabic dialect into modern standard Arabic," Social MT 2017/First Workshop on Social Media and User Generated Content Machine Translation, 2017.
[28] I. Guellil, F. Azouaou, and M. Abbas, "Neural vs statistical translation of Algerian Arabic dialect written with Arabizi and Arabic letter," The 31st Pacific Asia Conference on Language, Information and Computation (PACLIC), vol. 31, 2017.
[29] J. Younes, E. Souissi, H. Achour, and A. Ferchichi, "A sequence-to-sequence based approach for the double transliteration of Tunisian dialect," Procedia Computer Science, vol. 142, pp. 238-245, 2018.
[30] J. Younes, H. Achour, E. Souissi, and A. Ferchichi, "Romanized Tunisian dialect transliteration using sequence labelling techniques," Journal of King Saud University-Computer and Information Sciences, 2020.
[31] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[32] M.-T. Luong, H. Pham, and C. D. Manning, "Effective approaches to attention-based neural machine translation," arXiv preprint arXiv:1508.04025, 2015.
[33] M. Al-Ayyoub, A. Nuseir, K. Alsmearat, Y. Jararweh, and B. Gupta, "Deep learning for Arabic NLP: A survey," Journal of Computational Science, vol. 26, pp. 522-531, 2018.
[34] G. Lancioni, E. Gugliotta, and V. Pettinari, "Lahajat: A rule-based converter of standard Arabic lexical databases into spoken Arabic forms," 2016 4th IEEE International Colloquium on Information Science and Technology (CiSt), 2016, pp. 395-399.
[35] I. Guellil, F. Azouaou, F. Benali, A.-E. Hachani, and H. Saadane, "Hybrid approach for transliteration of Algerian Arabizi: a primary study," arXiv preprint arXiv:1808.03437, 2018.
[36] A. Masmoudi, M. E. Khmekhem, M. Khrouf, and L. H. Belguith, "Transliteration of Arabizi into Arabic script for Tunisian dialect," ACM Trans. Asian Low-Resour. Lang. Inf. Process., vol. 19, no. 2, Nov.
2019, doi: 10.1145/3364319.
[37] T. Buckwalter, "Buckwalter Arabic morphological analyzer version 2.0," Web Download, 2004.
[38] A. Musleh, N. Durrani, I. Temnikova, P. Nakov, S. Vogel, and O. Alsaad, "Enabling medical translation for low-resource languages," International Conference on Intelligent Text Processing and Computational Linguistics, Springer, 2016, pp. 3-16.
[39] P. Nakov and H. T. Ng, "Improving statistical machine translation for a resource-poor language using related resource-rich languages," Journal of Artificial Intelligence Research, vol. 44, pp. 179-222, 2012.
[40] P. Nakov and J. Tiedemann, "Combining word-level and character-level models for machine translation between closely-related languages," Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2, Association for Computational Linguistics, 2012, pp. 301-305.
[41] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436-444, 2015.
[42] I. Goodfellow, Y. Bengio, and A. Courville, "Deep Learning," MIT Press, 2016.
[43] J. Patterson and A. Gibson, "Deep Learning: A Practitioner's Approach," O'Reilly Media, Inc., 2017.
[44] G. Al-Bdour, R. Al-Qurran, M. Al-Ayyoub, and A. Shatnawi, "A detailed comparative study of open source deep learning frameworks," arXiv preprint arXiv:1903.00102, 2019.
[45] D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," arXiv preprint arXiv:1409.0473, 2014.
[46] G. Al-Bdour, R. Al-Qurran, M. Al-Ayyoub, and A. Shatnawi, "Benchmarking open source deep learning frameworks," Submitted to the International Journal of Electrical and Computer Engineering (IJECE), 2020.
[47] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, "BLEU: a method for automatic evaluation of machine translation," Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, 2002, pp. 311-318.
[48] D. Britz, A. Goldie, M.-T. Luong, and Q. Le, "Massive exploration of neural machine translation architectures," Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017, pp. 1442-1451.
[49] H. Bouamor, S. Hassan, and N. Habash, "The MADAR shared task on Arabic fine-grained dialect identification," Proceedings of the Fourth Arabic Natural Language Processing Workshop, 2019, pp. 199-207.
[50] B. Talafha, A. Fadel, M. Al-Ayyoub, Y. Jararweh, A.-S. Mohammad, and P.
Juola, "Team JUST at the MADAR shared task on Arabic fine-grained dialect identification," Proceedings of the Fourth Arabic Natural Language Processing Workshop, 2019, pp. 285-289.
[51] B. Talafha, W. Farhan, A. Altakrouri, and H. Al-Natsheh, "Mawdoo3 AI at MADAR shared task: Arabic tweet dialect identification," Proceedings of the Fourth Arabic Natural Language Processing Workshop, 2019, pp. 239-243.
[52] A. Ragab, H. Seelawi, M. Samir, A. Mattar, H. Al-Bataineh, M. Zaghloul, A. Mustafa, B. Talafha, A. A. Freihat, and H. Al-Natsheh, "Mawdoo3 AI at MADAR shared task: Arabic fine-grained dialect identification with ensemble learning," Proceedings of the Fourth Arabic Natural Language Processing Workshop, 2019, pp. 244-248.
[53] C. Zhang, H. Bouamor, M. Abdul-Mageed, and N. Habash, "The shared task on nuanced Arabic dialect identification (NADI)," Proceedings of the Fifth Arabic Natural Language Processing Workshop, 2020.
[54] B. Talafha, M. Ali, M. E. Za'ter, H. Seelawi, I. Tuffaha, M. Samir, W. Farhan, and H. T. Al-Natsheh, "Multi-dialect Arabic BERT for country-level dialect identification," arXiv preprint arXiv:2007.05612, 2020.
[55] J. Gehring, M. Auli, D. Grangier, D. Yarats, and Y. N. Dauphin, "Convolutional sequence to sequence learning," Proceedings of the 34th International Conference on Machine Learning - Volume 70, JMLR.org, 2017, pp. 1243-1252.
[56] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," Advances in Neural Information Processing Systems, 2017, pp. 5998-6008.
[57] A. Fadel, I. Tuffaha, M. Al-Ayyoub et al., "Arabic text diacritization using deep neural networks," 2019 2nd International Conference on Computer Applications and Information Security (ICCAIS), 2019, pp. 1-7.
[58] A. Fadel, I. Tuffaha, B. Al-Jawarneh, and M.
Al-Ayyoub, "Neural Arabic text diacritization: State of the art results and a novel approach for machine translation," arXiv preprint arXiv:1911.03531, 2019.