International Journal of Electrical and Computer Engineering (IJECE)
Vol. 7, No. 5, October 2017, pp. 2496-2501
ISSN: 2088-8708

Signboard Text Translator: A Guide to Tourists

Ilaiah Kavati, G Kiran Kumar, Sarika Kesagani and K Srinivasa Rao
Dept. of CSE, MLR Institute of Technology, Hyderabad-43, India

Article Info
Article history:
Received: Mar 11, 2017
Revised: Jun 5, 2017
Accepted: Jun 20, 2017

Keywords: Signboard, OCR, Tesseract, Translator, Web Application

ABSTRACT
Travelers often have trouble understanding signboards written in a local language, and they increasingly rely on smartphones while traveling; smartphones have become hugely popular in recent years, both in market value and in the number of useful applications available to users. This work builds a web application that recognizes the English text on signboard pictures captured with a smartphone, translates that text from English to Telugu, and displays the translated Telugu text back on the phone's screen. Experiments conducted on various signboard pictures demonstrate the viability of the proposed approach.

Copyright © 2017 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author:
Ilaiah Kavati
MLR Institute of Technology
Dundigal, Hyderabad - 43
+91 - 9848916272
kavati089@gmail.com

1. INTRODUCTION
We live in a society where we communicate with individuals and information systems through diverse media, and large volumes of information are presented in natural scenes. Signs are all around us. A sign is an object that indicates the presence of some fact: it can be a displayed structure bearing letters or symbols, used to identify something or to advertise a business.
It can also be a posted notice bearing a warning, safety advisory, or command. Signs are good examples of high information content in everyday environments. They make our lives simpler when we can read and follow them, but they pose problems, or even danger, when we cannot. For instance, a traveler will not be able to understand a sign in the local language that indicates notices or risks [1].

In this work, we concentrate on recognizing text on signs. The signboard text translator framework uses a smartphone to capture a signboard image, recognizes the text it contains, and translates it into a user-specified language. Automatic detection and recognition of text in natural scenes are prerequisites for a signboard text translator. The main challenge lies in the variety of the text: it can vary in style, size, font, and orientation. The text may also be blurred, or occluded by other objects on the signboard. Because signs exist in three-dimensional space, the text on them can be distorted by inclination, tilt, and the shape of the objects on which they appear. Many OCR frameworks work very well on high-quality images; however, they perform poorly on signboard images because of their low quality. The proposed approach uses the Tesseract OCR engine to extract the text from the acquired image [2]. We have successfully applied the proposed approach to an English-Telugu sign text translation framework, which can recognize English signs captured by a camera and translate the recognized text into Telugu. To our knowledge, an English to Telugu signboard translator has not been investigated previously.

The rest of the paper is organized as follows: Section II explores current advancements in text translators, Section III presents the proposed approach, Section IV discusses the results, and Section V concludes our work.

2.
RELATED WORK
This section explores current signboard text translator applications developed for different languages using Optical Character Recognition (OCR).

Journal Homepage: http://iaesjournal.com/online/index.php/IJECE
DOI: 10.11591/ijece.v7i5.pp2496-2501

Such work has been carried out for English to Chinese, Japanese to English,
Chinese to English, Hindi to Tamil, English to Spanish, Malay to English/Arabic, and Hindi to English on various mobile platforms. The following are some of the articles in the literature on text translation between different languages.

Figure 1. Block Diagram of the proposed approach

Ma et al. built an Android-based text translation application that can recognize the text captured by a cell phone camera, translate it, and show the translation result back on the screen of the phone. The application lets users obtain a text translation as easily as a button click: the camera captures the text and returns the translated result continuously. The framework comprises automatic text detection, optical character recognition, text correction, and text translation [3]. Watanabe et al. report an application that translates Japanese text in a scene into English. The application is designed to run on a camera-equipped cell phone: it recognizes Japanese characters detected by the phone's built-in camera and translates them into English [4]. In another approach, Chen et al. present automatic recognition of signs from natural scenes and its application to a signboard translator. Their approach uses multiresolution and multiscale edge detection together with color analysis in a hierarchy for sign identification, which significantly improves the text detection rate and OCR accuracy. Rather than feeding binarized data to the OCR, they retrieve features from the picture directly: a local intensity normalization technique handles lighting variations, followed by a Gabor transform to obtain local features, and finally a Linear Discriminant Analysis (LDA) strategy for feature selection.
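Chen et al.'s local-feature pipeline, intensity normalization followed by a Gabor transform, can be sketched as follows. This is an illustrative numpy-only reconstruction, not the authors' implementation; the kernel size, orientations, and filter parameters are assumptions chosen for the example.

```python
import numpy as np

def gabor_kernel(ksize=15, sigma=3.0, theta=0.0, lambd=8.0, gamma=0.5, psi=0.0):
    """Real Gabor kernel: a Gaussian envelope modulating a cosine carrier.
    `ksize` is assumed odd so the kernel is centred on a pixel."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotate coordinates by theta
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + gamma ** 2 * yr ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi * xr / lambd + psi)
    return envelope * carrier

def local_intensity_normalize(patch, eps=1e-8):
    """Zero-mean, unit-variance normalization to reduce lighting variation."""
    patch = patch.astype(float)
    return (patch - patch.mean()) / (patch.std() + eps)

def gabor_features(patch, thetas=(0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Correlate a normalized, odd-sized square character patch with one Gabor
    kernel per orientation; the absolute responses form the feature vector."""
    patch = local_intensity_normalize(patch)
    ksize = patch.shape[0]
    return np.array([abs(np.sum(patch * gabor_kernel(ksize=ksize, theta=t)))
                     for t in thetas])
```

The resulting orientation-response vector would then be passed to a feature-selection stage such as LDA; that stage is omitted here.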
They applied this methodology to a Chinese sign translation framework, which can automatically recognize Chinese signs captured by a camera and translate the text into English [5]. In another approach, Canedo-Rodríguez et al. propose a design for English to Spanish translation of the signs present in JPEG pictures taken with a cell phone camera. They detect the text using the frequency data of the DCT coefficients, binarize it using a clustering algorithm, recognize it with an OCR algorithm, and translate it from English to Spanish. The outcome is a useful, simple, affordable, and robust framework that handles problems such as lighting variations, focusing, and low resolution in a short time [1]. Muthalib et al. explore a framework for text translation using mobile technology, evaluated in a study involving 30 international undergraduates in Malaysia; the study found that users accepted the Mobile-Translator well. The mobile translator is intended particularly for use in Malaysia, where the local language is Malay, and translates Malay into either English or Arabic [6]. Mishra and Patvardhan present an on-demand, fast, and easy-to-use Android application, ATMA (Android Travel Mate Application). The application is helpful for local tourists and travelers who have Android smartphones: it enables them to easily capture book pages, signboards, flags, hotel menus, and so on written in the local language. The built-in OCR converts the text in the captured image into Unicode format, and a translation facility then lets tourists translate native-language Unicode text into their own country's language.
The application also has advanced features such as copy-paste, share, and search for travel-related queries like museums, places, hotels, books, restaurants, and so forth [7].

3. PROPOSED METHOD
The proposed system captures signboard images through a camera phone. The captured image is submitted to a central server, where it is preprocessed and the text is extracted and recognized. The recognized text is then converted to the user-specified language and delivered back appropriately. The proposed approach is shown in Figure 1. It follows these steps: (i) signboard image acquisition, (ii) extraction and recognition of text from the captured image, (iii) translation of the extracted text into the user-specified language.

3.1. Signboard image acquisition
We use a mobile camera to capture the signboard image. We restrict the distance between the camera and the signboard to a maximum of 10 meters to obtain good-quality images. As the images are captured with a mobile camera, they are generally noisy and contain complex backgrounds. Hence, the textual image must be pre-processed so that the system can easily extract and recognize the text.
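The pre-processing and line segmentation performed next (Section 3.2) can be sketched as a minimal numpy-only illustration. A simple mean-based locally adaptive threshold stands in for the binarization step, and a horizontal projection profile finds the text lines; the block size and offset below are illustrative assumptions, not the parameters actually used by the system.

```python
import numpy as np

def binarize(gray, block=15, offset=10):
    """Locally adaptive threshold: mark a pixel as text (1) when it is darker
    than the mean of its block x block neighbourhood by more than `offset`."""
    h, w = gray.shape
    pad = block // 2
    padded = np.pad(gray.astype(float), pad, mode="edge")
    binary = np.zeros((h, w), dtype=np.uint8)
    for i in range(h):
        for j in range(w):
            local_mean = padded[i:i + block, j:j + block].mean()
            if gray[i, j] < local_mean - offset:
                binary[i, j] = 1
    return binary

def segment_lines(binary):
    """Horizontal projection profile: count black pixels per row; maximal runs
    of rows with a non-zero count are the text lines (first to last black row)."""
    profile = binary.sum(axis=1)
    lines, start = [], None
    for row, count in enumerate(profile):
        if count > 0 and start is None:
            start = row                      # first black pixel of a new line
        elif count == 0 and start is not None:
            lines.append((start, row - 1))   # last black pixel of that line
            start = None
    if start is not None:                    # line touching the bottom edge
        lines.append((start, len(profile) - 1))
    return lines
```

On a synthetic page with two dark text bands, `segment_lines(binarize(img))` returns one `(top, bottom)` row pair per band.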
Figure 2. Optical Character Recognition

3.2. Text Extraction and Recognition
Extraction and recognition of text from the captured image is called Optical Character Recognition (OCR). In this paper, we use the Tesseract OCR engine to extract the text from the acquired image [2]. Tesseract is an open-source OCR engine that was developed at HP between 1984 and 1994. The OCR process is divided into the following phases: preprocessing [8], segmentation, feature extraction, and classification. Figure 2 shows the phases of OCR.

3.2.1. Preprocessing
Pre-processing involves image contrast enhancement [9], noise removal, binarization, and smoothing. To enhance the contrast, the images are subjected to adaptive histogram equalization, which increases the contrast of the image and reduces possible imperfections. The enhanced images are further processed with a median filter to suppress noise. The resulting image is then subjected to a locally adaptive threshold method to obtain a binary image. Finally, smoothing implies both filling and thinning: filling eliminates small breaks, gaps, and holes in the digitized characters, while thinning reduces the width of the lines. Normalization is then applied to obtain characters of uniform size, slant, and rotation.

3.2.2. Segmentation
Segmentation divides a word in a given image/page into its individual characters; its objective is to extract each character from the text present in the thinned image [10]. After segmentation, the characters of the string are separated and used for further processing. A horizontal projection profile technique [11] is used to segment the text area from the thinned image. This technique scans the input image horizontally to find the first and last black pixels in a line.
Once these pixels are found, the area between them represents a line that may contain one or more characters.

3.2.3. Feature Extraction and Classification
In this stage, the features of the characters that are crucial for classifying them at the recognition stage are extracted. Classification is the process of identifying each character and assigning it to the correct character class. Classification approaches are of two types. 1) Decision-theoretic methods: the principal approaches to decision-theoretic recognition are minimum distance classifiers, statistical classifiers, and neural networks. 2) Structural methods: measures of similarity based on relationships between structural components may be formulated using grammatical concepts. Suppose we have two different character classes that can be generated by two grammars G1 and G2, respectively. Given an unknown character, we say it is more similar to the first class if it can be generated by grammar G1 but not by G2.

3.2.4. Postprocessing
This step performs grouping of symbols and error handling. The process of associating symbols into strings is commonly referred to as grouping; it is based on the symbols' locations in the document. Symbols that are found to be sufficiently close are grouped together.

3.3. Text Translator
This section explains the process of translating the text extracted from the signboard image into the user-specified language. To do this, we propose a translator module that maintains a dictionary of warnings or hazards in English (such as Poison, Do Not Drink, No Smoking, Danger, No Entry, Flammable Area, etc., which commonly appear on signboards) along with their meanings in the user-specified language, i.e., Telugu. Whenever the system recognizes a text
in English, it checks the dictionary, retrieves the corresponding Telugu text, and displays it on the cell phone's screen. This approach can easily be adapted to other languages by feeding in the required words and their meanings in the target language.

Figure 3. (a) Sample signboard image in English, (b) Translated text from English to Telugu

4. RESULTS
This section explains the results of our work. To measure the performance of the proposed system, we use two measures: precision and accuracy.

Precision: Precision, also called the character recognition rate, is the ratio of the number of correctly recognized characters to the total number of characters tested. This is given in Equation 1, where n_c is the number of correctly recognized characters and N_c is the total number of characters.

Precision = n_c / N_c    (1)

Accuracy: Accuracy is the ratio of the number of correctly recognized signboard images to the total number of signboard images tested. This is given in Equation 2, where n_w is the number of correctly recognized signboard images and N_w is the total number of signboard images.

Accuracy = n_w / N_w    (2)

We experimented with our approach on 20 signboard images; samples are shown in Figures 3, 4, and 5. The proposed system achieves a precision of 93.6%; in other words, it extracts the characters from the signboards and recognizes them effectively. Further, the accuracy of the system depends on the character recognition rate, i.e., the precision. As the system shows an acceptable level of character recognition, it achieves good accuracy: of the 20 signboard images, 18 were correctly translated into Telugu, i.e., an accuracy of 90%.
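The dictionary lookup of Section 3.3 and the two measures defined above can be sketched in a few lines. The Telugu entries below are illustrative placeholders, not the dictionary actually shipped with the system.

```python
# Illustrative English -> Telugu sign dictionary (placeholder entries).
SIGN_DICTIONARY = {
    "DANGER": "ప్రమాదం",
    "NO SMOKING": "ధూమపానం నిషేధం",
    "NO ENTRY": "ప్రవేశం లేదు",
}

def translate(recognized_text):
    """Look up the recognized English sign text (case-insensitively); return
    the input unchanged when the phrase is not in the dictionary."""
    return SIGN_DICTIONARY.get(recognized_text.strip().upper(), recognized_text)

def precision(n_c, N_c):
    """Equation 1: correctly recognized characters over total characters."""
    return n_c / N_c

def accuracy(n_w, N_w):
    """Equation 2: correctly translated signboard images over total images."""
    return n_w / N_w
```

For example, `accuracy(18, 20)` reproduces the reported 90% translation accuracy, and an unrecognized phrase falls through the lookup unchanged.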
The sample signboard images and the obtained results are shown in Figures 3, 4, and 5. We also tested the approach on text with different fonts, and the system recognized and translated it correctly.

5. CONCLUSION
This system was developed to make it easier for tourists from the states of Telangana and Andhra Pradesh to translate signboards written in English, which can be difficult for them to understand during their trips. We proposed a system to translate signboard images captured with a mobile phone from English to Telugu. The system is able to translate text in different colors and lighting, text on dark backgrounds, images with blurring, etc. The system
achieved a precision, i.e., a character recognition rate, of 93.6% and a translation accuracy of 90%. This shows the efficiency of the Tesseract OCR engine for text extraction and recognition. Our system shows characteristics that make it interesting and deserving of further research. Future work involves: 1. automatic recognition and translation of handwritten English signboards into Telugu, which involves language processors; 2. use of a more accurate OCR engine for increased performance; 3. development of a mobile application.

Figure 4. (a) Sample signboard image in English, (b) Translated text from English to Telugu

Figure 5. (a) Sample signboard image in English, (b) Translated text from English to Telugu

REFERENCES
[1] A. Canedo-Rodríguez, S. Kim, J. H. Kim, and Y. Blanco-Fernández, "English to Spanish translation of signboard images from mobile phone camera," in IEEE Southeastcon 2009. IEEE, 2009, pp. 356-361.
[2] R. Smith, "An overview of the Tesseract OCR engine," in Proc. Ninth Int. Conference on Document Analysis and Recognition (ICDAR), 2007, pp. 629-633.
[3] D. Ma, Q. Lin, and T. Zhang, "Mobile camera based text detection and translation," Stanford University, Nov. 2000.
[4] Y. Watanabe, K. Sono, K. Yokomizo, and Y. Okada, "Translation camera on mobile phone," in ICME, vol. 3, 2003, pp. 177-180.
[5] X. Chen, J. Yang, J. Zhang, and A. Waibel, "Automatic detection and recognition of signs from natural scenes," IEEE Transactions on Image Processing, vol. 13, no. 1, pp. 87-99, 2004.
[6] A. A. Muthalib, A. Abdelsatar, M. Salameh, and J. Dalle, "Making learning ubiquitous with mobile translator using optical character recognition (OCR)," in International Conference on Advanced Computer Science and Information System (ICACSIS). IEEE, 2011, pp. 233-236.
[7] N. Mishra and C. Patvardhan, "ATMA: Android travel mate application," International Journal of Computer Applications, vol. 50, no. 16, 2012.
[8] X. Tian, B. Yu, and J. Sun, "A preprocessing and analyzing method of images in PDF documents for mathematical expression retrieval," Indonesian Journal of Electrical Engineering and Computer Science, vol. 12, no. 6, pp. 4579-4588, 2014.
[9] S. Sahu, "Comparative analysis of image enhancement techniques for ultrasound liver image," International Journal of Electrical and Computer Engineering, vol. 2, no. 6, p. 792, 2012.
[10] D. Aghlmandi and K. Faez, "Automatic segmentation of glottal space from video images based on mathematical morphology and the Hough transform," International Journal of Electrical and Computer Engineering, vol. 2, no. 4, p. 463, 2012.
[11] H. Almohri, J. S. Gray, and H. Alnajjar, "A real-time DSP-based optical character recognition system for isolated Arabic characters using the TI TMS320C6416T," University of Hartford, 2007.

BIOGRAPHIES OF AUTHORS
Dr Ilaiah Kavati is a Professor in the Computer Science and Engineering Department at MLR Institute of Technology, Hyderabad, India. He obtained his Ph.D. in Computer Science from the University of Hyderabad (Central University), India in 2016, and received his B.Tech and M.Tech degrees from Jawaharlal Nehru Technological University, Hyderabad in 2004 and 2010 respectively. He has 13 years of teaching and research experience. His current research is in the fields of image processing, biometrics, security, internet of things, and big data analytics.
He is a life member of the ISTE professional society and has published more than twenty papers in international conferences and journals.

Dr G Kiran Kumar is a Professor in the Computer Science and Engineering Department at MLR Institute of Technology, Hyderabad, India. He obtained his Ph.D. in Computer Science & Engineering from Nagarjuna University, Andhra Pradesh, India in 2015, and received his B.Tech and M.Tech degrees from Osmania University, Hyderabad. He has more than 15 years of teaching and research experience and is a life member of ISTE. His current research is in the fields of spatial data mining, image processing, big data, and cloud computing.

Sarika Kesagani is an Assistant Professor in the Computer Science and Engineering Department at MLR Institute of Technology, Hyderabad, India. She received her B.Tech and M.Tech degrees from Jawaharlal Nehru Technological University, Hyderabad in 2012 and 2016 respectively. Her current research is in the fields of image processing and Internet of Things.

Dr K Srinivasa Rao is a Professor in the Computer Science and Engineering Department at MLR Institute of Technology, Hyderabad, India. He obtained his Ph.D. in Computer Science & Engineering from Anna University, Tamil Nadu, India, and received his B.Tech and M.Tech degrees from Osmania University, Hyderabad. He has more than 20 years of teaching and research experience. His current research is in the fields of data mining, image processing, and big data analytics.