International Journal of Electrical and Computer Engineering (IJECE)
Vol. 11, No. 2, April 2021, pp. 1613-1626
ISSN: 2088-8708, DOI: 10.11591/ijece.v11i2.pp1613-1626

MTVRep: A movie and TV show reputation system based on fine-grained sentiment and semantic analysis

Abdessamad Benlahbib, El Habib Nfaoui
Computer Science Department, LISAC Laboratory, Faculty of Sciences Dhar EL Mehraz (F.S.D.M), Sidi Mohamed Ben Abdellah University, Fez, Morocco

Article history: Received Jul 21, 2020; Revised Sep 8, 2020; Accepted Sep 28, 2020

Keywords: Decision making; Fine-grained sentiment analysis; Natural language processing; Reputation generation; Text mining

ABSTRACT
Customer reviews are a valuable source of information from which we can extract very useful data about different online shopping experiences. For trendy items (products, movies, TV shows, hotels, services, etc.), the number of available user and customer opinions can easily surpass thousands. Online reputation systems can therefore aid potential customers in making the right decision (buying, renting, booking, etc.) by automatically mining textual reviews and their ratings. This paper presents MTVRep, a movie and TV show reputation system that incorporates fine-grained opinion mining and semantic analysis to generate and visualize reputation toward movies and TV shows. Unlike previous studies on reputation generation, which treat sentiment analysis as a binary classification problem (positive, negative), the proposed system identifies the sentiment strength during the sentiment classification phase by using fine-grained sentiment analysis to separate movie and TV show reviews into five discrete classes: strongly negative, weakly negative, neutral, weakly positive and strongly positive.
Besides, it employs embeddings from language models (ELMo) representations to extract semantic relations between reviews. The contribution of this paper is threefold. First, movie and TV show reviews are separated into five groups based on their sentiment orientation. Second, a custom score is computed for each opinion group. Finally, a numerical reputation value is produced toward the target movie or TV show. The efficacy of the proposed system is illustrated by conducting several experiments on a real-world movie and TV show dataset.

This is an open access article under the CC BY-SA license.

Corresponding Author:
Abdessamad Benlahbib
Computer Science Department, LISAC Laboratory, Faculty of Sciences Dhar EL Mehraz (F.S.D.M), Sidi Mohamed Ben Abdellah University
Fez B.P. 1796 Fes-Atlas, 30003 Morocco
Email: abdessamad.benlahbib@usmba.ac.ma

1. INTRODUCTION
The exponential growth of Web 2.0 has dramatically impacted the evolution of e-commerce platforms [1-4]. On the one hand, recent statistics show that 72% of customers will not take action until they read reviews, and only 6% of consumers do not trust customer reviews at all; on the other hand, the number of user-generated reviews attached to an online entity can easily exceed thousands [5, 6]. Thus, a potential customer does not have the time or effort to examine all the reviews manually in order to make a decision toward it [7, 8]. Little research has been conducted on mining customer and user reviews with regard to feature-based summarization and reputation generation for the purpose of supporting the customer decision-making process in e-commerce (buying, renting, booking, etc.). Over the last two decades, a few opinion summarizer systems

Journal homepage: http://ijece.iaescore.com
have been proposed to produce summaries for product reviews [9], movie reviews [3], hotel reviews [1] and local service reviews [10]. Returning to the reputation generation task, to the best of our knowledge, very few reputation systems have been proposed to compute a single reputation value toward different entities by fusing and mining user and customer reviews expressed in natural language [11-15]. Yan et al. [11] applied opinion mining and fusion techniques to product reviews. Benlahbib and Nfaoui [12] used the K-Means clustering algorithm on movie reviews. The same authors [13] incorporated semantic and sentiment analysis to generate a single reputation value from user and customer reviews expressed in natural language (English). An important issue that was neglected in past research on reputation generation is identifying the sentiment strength during the phases of sentiment classification and opinion fusion. In fact, existing works have only focused on classifying reviews into positive or negative before generating a single reputation value, disregarding the sentiment strength. In this paper, we propose MTVRep, a movie and TV show reputation system that applies fine-grained opinion mining to separate reviews into five opinion groups: strongly negative, weakly negative, neutral, weakly positive and strongly positive. Then, it computes a custom score for each group based on the acquired statistics of each group, i.e., the number of reviews in each group, the sum of their ratings and the sum of their semantic similarity (ELMo and the cosine metric). Finally, a numerical reputation value is produced toward the target movie or TV show using the weighted arithmetic mean.
In this manner, this study addresses the following research question: with the combination of fine-grained opinion mining and semantic analysis, can the proposed reputation system offer better results in terms of reputation generation than previous reputation systems (which consider only semantic relations)? The remainder of this paper is organized as follows: related works are reviewed in Section 2. Section 3 illustrates the workflow of the reputation system. Section 4 presents the experimental results and discusses comparative performance. Finally, conclusions are drawn in Section 5.

2. LITERATURE REVIEW
This section describes and examines previous research in the area of natural language processing (NLP) techniques for decision making in e-commerce and fine-grained sentiment analysis.

2.1. Fine-grained sentiment analysis on the 5-class Stanford sentiment treebank (SST-5) dataset
Xu et al. [16] proposed Emo2Vec, word-level representations that encode emotional semantics into fixed-sized, real-valued vectors. Mu et al. [17] presented a simple post-processing operation that renders word representations even stronger by eliminating the top principal components of all words. Socher et al. [18] introduced recursive neural tensor networks and the Stanford sentiment treebank. Wang et al. [19] proposed RNN-Capsule, a capsule model based on a recurrent neural network (RNN) for sentiment analysis. Yang [20] presented RNFs, a new class of convolution filters based on recurrent neural networks. McCann et al. [21] introduced an approach for transferring knowledge from an encoder pretrained on machine translation to a variety of downstream natural language processing (NLP) tasks. Munikar et al. [22] used the pretrained BERT [23] model and fine-tuned it for the fine-grained sentiment classification task on the SST-5 dataset.
Table 1 summarizes the latest works on fine-grained opinion mining applied to the Stanford sentiment treebank dataset (SST-5).

Table 1. State-of-the-art results for sentiment analysis on SST-5 fine-grained classification

Method                        Authors and Year              Accuracy %
BCN+Suffix BiLSTM-Tied+CoVe   Brahma (2018) [24]            56.2
BERT-large                    Munikar et al. (2019) [22]    55.5
BCN+ELMo                      Peters et al. (2018) [25]     54.7
BCN+Char+CoVe                 McCann et al. (2017) [21]     53.7
CNN-RNF-LSTM                  Yang (2018) [20]              53.4
RNN-Capsule                   Wang et al. (2018) [19]       49.3
SWEM-concat                   Shen et al. (2018) [26]       46.1
RNTN                          Socher et al. (2013) [18]     45.7
GRU-RNN-WORD2VEC              Mu et al. (2017) [17]         45.02
GloVe+Emo2Vec                 Xu et al. (2018) [16]         43.6
Emo2Vec                       Xu et al. (2018) [16]         41.6
2.2. NLP techniques for decision making in e-commerce
It is well recognized that user reviews attached to an entity (movie, product, etc.) contain valuable information about it. Recently, a few approaches have been proposed to help potential customers during the decision-making process on e-commerce websites by automatically mining user and customer reviews. The most popular approaches are feature-based summarization and reputation generation.

Feature-based summarization approaches aim to produce a feature-based summary for a target entity, as shown in Figure 1. The first feature-based summarizer system was proposed by Hu and Liu [9], who applied association rule mining to extract product features and used a set of seed adjectives to identify the semantic orientation of opinion words. Zhuang et al. [3] built a multi-knowledge-based system that generates a feature-based summary for online movie reviews. Blair-Goldensohn et al. [10] presented a feature-based summarizer for local service reviews. Kangale et al. [27] proposed a feature-based summarizer system for product reviews that produces a rating as well as a review summary for each product feature, as shown in Figure 1.

Figure 1. Feature-based summary [27]

Reputation generation systems aim to provide potential customers with sufficient information toward the target entity (product, movie, hotel, etc.) to help them make the right decision toward it (buying, renting, booking, etc.). Currently, a few reputation systems have been proposed to tackle the task of reputation generation using opinion mining techniques on user and customer reviews expressed in natural language. Yan et al.
[11] were the first to propose a reputation system that combines opinion mining and opinion fusion techniques for the purpose of producing a single reputation value toward various products. The system first eliminates irrelevant reviews [28]; then, the remaining reviews are grouped into different sets based on their semantic relations (latent semantic analysis and the cosine metric); finally, a single numerical reputation value is produced. Benlahbib and Nfaoui [12] used the K-Means clustering algorithm to group similar movie reviews into the same cluster based on their semantic relations before generating a reputation value. The same authors [13] designed and built a hybrid reputation system that first combines Naïve Bayes and a linear support vector machine (SVM) to separate user and customer reviews into positive and negative (document-level sentiment analysis), then groups them into different sets based on semantic relations, and finally computes a single reputation value using the weighted arithmetic mean.

3. PROPOSED SYSTEM
3.1. System overview
The proposed approach consists mainly of four steps:
- We collect movie and TV show reviews from IMDb (https://www.imdb.com/) using the web scraping tool ScrapeStorm (https://www.scrapestorm.com/), then we preprocess them.
- We train a Multinomial Naïve Bayes model on the 5-class Stanford sentiment treebank (SST-5) dataset in order to perform fine-grained sentiment analysis. The model classifies the collected reviews into five opinion groups: strongly negative, weakly negative, neutral, weakly positive and strongly positive.
- For each opinion group, we acquire the sum of user ratings and the sum of the reviews' semantic similarity. The semantic similarity between two reviews is computed as the cosine between their deep contextualized word embeddings (ELMo). These acquired statistics are used to compute a custom score for each opinion group.
- We compute the movie or TV show numerical reputation value from the opinion groups' scores by applying the weighted arithmetic mean.

Figure 2 illustrates the workflow of the reputation system (MTVRep).

Figure 2. Reputation system pipeline
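The grouping, scoring and aggregation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `classify` stands in for the trained five-class sentiment model and `embed` for an ELMo encoder, both hypothetical callables supplied by the caller.

```python
import math

POLARITIES = ("strongly negative", "weakly negative", "neutral",
              "weakly positive", "strongly positive")

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mtvrep_reputation(reviews, ratings, classify, embed, max_r=10):
    """Group reviews by predicted polarity, accumulate per-group rating
    and similarity sums, score each group, and aggregate the scores with
    a weighted arithmetic mean (weights = group sizes)."""
    groups = {p: [] for p in POLARITIES}          # reviews per polarity
    group_ratings = {p: [] for p in POLARITIES}   # ratings per polarity
    for review, rating in zip(reviews, ratings):
        label = classify(review)
        groups[label].append(review)
        group_ratings[label].append(rating)

    numerator, denominator = 0.0, 0
    for p in POLARITIES:
        nr = len(groups[p])
        if nr == 0:
            continue
        # Similarities are accumulated against the group's first review,
        # following the statistics-acquisition algorithm of Section 3.3.
        anchor = embed(groups[p][0])
        ss = sum(cosine(anchor, embed(r)) for r in groups[p])
        sr = sum(group_ratings[p])
        cs = (max_r * ss + sr) / (2 * nr)         # per-group custom score
        numerator += cs * nr                      # weighted arithmetic mean
        denominator += nr
    return numerator / denominator if denominator else 0.0
```

With a five-point rating scale, `max_r` would be 5 instead of 10, and the resulting reputation value lands in the same range as the user ratings.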
3.2. Fine-grained sentiment analysis
We classify the collected reviews into five opinion groups based on their sentiment intensities by applying the Multinomial Naïve Bayes model trained on the 5-class Stanford sentiment treebank (SST-5) dataset. The reasons behind using the Multinomial Naïve Bayes model are discussed in Section 4.2.

3.3. Opinion group custom scores
After separating movie and TV show reviews into five opinion groups (strongly negative, weakly negative, neutral, weakly positive and strongly positive), we compute a custom score for each opinion group based on the sum of their ratings and the sum of their reviews' semantic similarity. The statistics of the opinion groups are acquired by applying Algorithm 1.

Algorithm 1: Opinion group statistics acquisition

Define:
- G_polarity = {r_polarity_1, r_polarity_2, ..., r_polarity_n}: the opinion group that contains reviews which hold the sentiment orientation polarity.
- R_polarity = {rr_polarity_1, rr_polarity_2, ..., rr_polarity_n}: the set of ratings attached to G_polarity reviews.
- SS_polarity: the sum of semantic similarity for G_polarity reviews.
- SR_polarity: the sum of ratings for G_polarity reviews.
- NR_polarity: the number of reviews in G_polarity.
- ELMo(r_polarity_i): the ELMo embeddings for review i from G_polarity.
- cos(ELMo(r_polarity_i), ELMo(r_polarity_j)): the cosine similarity between the ELMo embeddings of reviews i and j from G_polarity.

Input: opinion groups, their lengths and their user ratings: G_polarity, NR_polarity and R_polarity.
Output: opinion group statistics: SS_polarity and SR_polarity

1   polarity <- [strongly negative, weakly negative, neutral, weakly positive, strongly positive]
2   /* After applying the trained model on the collected movie and TV show reviews, we separate them into five opinion groups: strongly negative, weakly negative, neutral, weakly positive and strongly positive. For each opinion group, we acquire the sum of their reviews' semantic similarity (cosine metric and ELMo embeddings) and the sum of their ratings */
3   for i in polarity do
4       SS_i <- 0
5       SR_i <- 0
6       for j <- 1 to NR_i do
7           SS_i <- SS_i + cos(ELMo(r_i1), ELMo(r_ij))
8           SR_i <- SR_i + rr_ij
9       end for
10  end for

By applying Algorithm 1, we retrieve, for each group, the sum of their ratings and the sum of their semantic similarity. We propose formula (1) to compute a custom score for each opinion group:

CS(G_polarity) = (maxR · (SS_polarity / NR_polarity) + (SR_polarity / NR_polarity)) / 2    (1)

Formula (1) can also be written as follows:
CS(G_polarity) = (maxR · SS_polarity + SR_polarity) / (2 · NR_polarity)    (2)

We denote:
- maxR: the highest value of user ratings (5 or 10), depending on the range of ratings (1 to 5 or 1 to 10).
- SS_polarity: the sum of similarity for reviews contained in opinion group G_polarity.
- SR_polarity: the sum of user ratings in opinion group G_polarity.
- NR_polarity: the number of reviews contained in opinion group G_polarity.

The custom score of each opinion group ranges between 1 and 5 or 1 and 10, depending on the range of user rating values. Since the cosine metric returns values in the range [0, 1], the average of the sum of semantic similarity for an opinion group is also between 0 and 1; therefore, we multiply this average by 5 or 10 (maxR) to obtain a numerical value between 0 and 5 or 0 and 10, then we add it to the average of the sum of ratings and divide by 2.

3.4. Reputation generation
We propose formula (3) (a weighted arithmetic mean) to compute the movie or TV show reputation value:

Rep(E) = ( Σ_polarity CS(G_polarity) · NR_polarity ) / ( Σ_polarity NR_polarity )    (3)

CS(G_polarity) is the custom score for opinion group G_polarity, computed by applying formula (1) or (2). The movie or TV show reputation value lies in the range [1, 5] or [1, 10], depending on the range of user ratings.

4. EXPERIMENTAL EVALUATION
4.1. Dataset gathering
We collect movie and TV show reviews and their numerical ratings from the IMDb website using the web scraping tool ScrapeStorm. Figure 3 depicts the structure of IMDb user reviews.

Figure 3. IMDb user reviews structure

The first ten datasets contain movie reviews and the remaining ten datasets contain TV show reviews. Table 2 shows the statistical information of the collected datasets. Table 2.
Statistical information of the collected datasets

                     Movies   TV shows   Total
Number of reviews    1000     1000       2000
Number of entities   10       10         20
After collecting the reviews, we replace the missing rating values with the average of the ratings; then we lowercase the reviews and remove punctuation marks and numbers.

4.2. Training phase and fine-grained opinion mining
We train the Multinomial Naïve Bayes model on the SST-5 dataset. The training set contains 1092 strongly negative reviews, 2218 weakly negative reviews, 1624 neutral reviews, 2322 weakly positive reviews and 1288 strongly positive reviews. The test set contains 279 strongly negative reviews, 633 weakly negative reviews, 389 neutral reviews, 510 weakly positive reviews and 399 strongly positive reviews. Figure 4 depicts the distribution of training and test samples over the five classes.

Figure 4. Number of samples in the SST-5 training and test sets

Before feeding the data to the classifier for training, we preprocess it by removing punctuation marks, numbers and extra whitespace; then we lowercase and lemmatize it. After preprocessing the data, we must choose which classifier to apply and which features to use. Since deep learning models require substantial computing power (high-performance CPUs, GPUs and RAM), we decided to work with one of
the four models: Random Forest, Logistic Regression, Multinomial Naïve Bayes and linear support vector machine (SVM). The last two classifiers (Naïve Bayes and SVM) have been recognized as the most popular supervised machine learning algorithms for polarity classification [29]. For feature selection, we tried many combinations: unigrams, bigrams, trigrams, tf-idf unigrams, tf-idf bigrams and tf-idf trigrams. We discarded some popular models such as word2vec and doc2vec because Wang et al. [30] conducted experiments on Naïve Bayes, Logistic Regression and a linear support vector classifier (SVC) for short text classification using tf-idf weighting, word2vec and paragraph2vec (doc2vec), and reported that the tf-idf/count features achieve the highest accuracy, with word2vec next and doc2vec the lowest. Table 3 summarizes the classification results of the four classifiers on the SST-5 dataset. Table 3.
Sentiment analysis classification results
(P = precision, R = recall, F1 = f1-score; "macro" columns are macro averages, "wtd" columns weighted averages)

Classifier (features)                      macro P  macro R  macro F1  wtd P  wtd R  wtd F1  Accuracy
Random Forest (unigrams)                   0.40     0.31     0.30      0.40   0.36   0.33    0.36
Random Forest (bigrams)                    0.34     0.29     0.28      0.34   0.32   0.31    0.32
Random Forest (trigrams)                   0.29     0.23     0.20      0.31   0.23   0.22    0.23
Random Forest (tf-idf unigrams)            0.40     0.30     0.28      0.39   0.35   0.31    0.35
Random Forest (tf-idf bigrams)             0.34     0.29     0.28      0.34   0.32   0.31    0.32
Random Forest (tf-idf trigrams)            0.28     0.22     0.20      0.29   0.23   0.21    0.23
Multinomial Naive Bayes (unigrams)         0.43     0.38     0.38      0.43   0.43   0.41    0.43
Multinomial Naive Bayes (bigrams)          0.36     0.30     0.29      0.36   0.35   0.32    0.35
Multinomial Naive Bayes (trigrams)         0.31     0.26     0.24      0.31   0.29   0.26    0.29
Multinomial Naive Bayes (tf-idf unigrams)  0.48     0.34     0.29      0.46   0.41   0.34    0.41
Multinomial Naive Bayes (tf-idf bigrams)   0.38     0.29     0.24      0.38   0.35   0.29    0.35
Multinomial Naive Bayes (tf-idf trigrams)  0.29     0.24     0.19      0.30   0.29   0.23    0.29
Logistic Regression (unigrams)             0.42     0.37     0.37      0.42   0.41   0.39    0.41
Logistic Regression (bigrams)              0.38     0.28     0.23      0.37   0.34   0.27    0.34
Logistic Regression (trigrams)             0.36     0.23     0.18      0.35   0.28   0.22    0.28
Logistic Regression (tf-idf unigrams)      0.42     0.35     0.34      0.41   0.40   0.37    0.40
Logistic Regression (tf-idf bigrams)       0.43     0.28     0.23      0.41   0.35   0.27    0.35
Logistic Regression (tf-idf trigrams)      0.30     0.23     0.17      0.32   0.29   0.21    0.29
Linear SVM (unigrams)                      0.38     0.37     0.37      0.39   0.40   0.39    0.40
Linear SVM (bigrams)                       0.33     0.31     0.31      0.34   0.34   0.33    0.34
Linear SVM (trigrams)                      0.31     0.25     0.22      0.32   0.29   0.25    0.29
Linear SVM (tf-idf unigrams)               0.38     0.38     0.38      0.39   0.41   0.39    0.41
Linear SVM (tf-idf bigrams)                0.33     0.31     0.31      0.34   0.34   0.33    0.34
Linear SVM (tf-idf trigrams)               0.31     0.27     0.25      0.31   0.30   0.27    0.30

From Table 3, we can see that the Multinomial Naïve Bayes classifier achieves the best classification result when it is trained with unigrams.
The Logistic Regression and linear SVM classifiers also gave good results when trained with unigrams or tf-idf unigrams. The worst results are provided by Random Forest, which achieves an accuracy of 0.36 at best. Figure 5 depicts the confusion matrix of Multinomial Naïve Bayes (unigrams) for the SST-5 test set.

We note that BERT-base achieves an accuracy of 0.45 with a 0.40 macro average f1-score, GRU-RNN-WORD2VEC achieves an accuracy of 0.45, and the recursive neural tensor network achieves an accuracy of 0.46. Besides, deep learning algorithms take a long time to train, as shown in Table 4, due to their large number of parameters. Based on that, we chose the Multinomial Naïve Bayes classifier, since it achieves an accuracy of 0.43 and does not require substantial computing power to train. Table 4 depicts the training times of a bidirectional gated recurrent unit (Bi-GRU), a bidirectional long short-term memory (Bi-LSTM), a recurrent neural network (RNN) and multinomial Naïve Bayes (MNB) on the SST-5 dataset.

One of the benefits of fine-grained opinion mining is that it provides a better understanding of the distribution of reviews over the five emotion classes; therefore, visualizing these five classes helps users and customers make up their minds about the target item (buying, renting).
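The selected configuration, multinomial Naïve Bayes over raw unigram counts, can be sketched from scratch as follows. This is a toy illustration rather than the authors' implementation: the paper trains on SST-5, whereas the corpus here is invented, and the add-one (Laplace) smoothing constant is an assumption.

```python
import math
from collections import Counter

class MultinomialNB:
    """Minimal multinomial Naive Bayes over unigram counts,
    with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        # Log class priors from label frequencies.
        self.prior = {c: math.log(labels.count(c) / len(labels))
                      for c in self.classes}
        # Unigram counts per class, plus the shared vocabulary.
        self.counts = {c: Counter() for c in self.classes}
        vocab = set()
        for doc, label in zip(docs, labels):
            tokens = doc.split()
            self.counts[label].update(tokens)
            vocab.update(tokens)
        self.vocab = vocab
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def predict(self, doc):
        # Pick the class with the highest log posterior.
        def log_posterior(c):
            lp = self.prior[c]
            for tok in doc.split():
                lp += math.log((self.counts[c][tok] + 1) /
                               (self.totals[c] + len(self.vocab)))
            return lp
        return max(self.classes, key=log_posterior)
```

In practice a library implementation (e.g., scikit-learn's `MultinomialNB` fed by a unigram `CountVectorizer`) would replace this sketch; the from-scratch version only makes the unigram-count model explicit.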
Figure 5. Confusion matrix of Multinomial Naïve Bayes (unigrams) for the SST-5 test set

Table 4. Training time of bidirectional gated recurrent unit (Bi-GRU), bidirectional long short-term memory (Bi-LSTM), recurrent neural network (RNN) and multinomial Naïve Bayes (MNB) for the SST-5 dataset

Model     Epochs   Batch size   Training time (seconds)
Bi-GRU    50       64           210.10
Bi-LSTM   50       64           180.25
RNN       50       64           85.26
MNB       n/a      n/a          3.77

4.3. Reputation evaluation
MTVRep offers a holistic reputation visualization, as shown in Figure 6, by depicting the numerical reputation value and the distribution of reviews over the five emotion classes. Table 5 shows a comparison between MTVRep and previous studies in terms of visualizing reputation. An important issue neglected in past research on reputation generation is identifying the sentiment strength during the opinion mining phase. Existing studies have only focused on classifying reviews as positive or negative, disregarding sentiment intensity. Therefore, we propose MTVRep, a movie and TV show reputation system that combines fine-grained sentiment analysis and semantic analysis for the purpose of generating and visualizing reputation toward movies and TV shows. Table 6 depicts the features exploited by previous studies and MTVRep during reputation generation and visualization.

In order to evaluate the performance of MTVRep in generating accurate reputation values toward various movies and TV shows, we compared it with the reputation system of Yan et al. [11]. We set the opinion fusion threshold t0 to 0.15, since the authors mention that their reputation system performs at its best when t0 = 0.15. We applied the two reputation systems to the twenty collected datasets.
The chosen evaluation measure is the squared error between the movie or TV show IMDb weighted average rating and the numerical reputation value computed by each of the two reputation systems. The formula of the squared error is:

SE = (x_i − y_i)²

where x_i is the reputation value returned by one of the two systems and y_i is the IMDb weighted average rating toward the target movie or TV show. Figure 7 depicts the IMDb weighted average rating for the movie Forrest Gump. According to IMDb (https://help.imdb.com/article/imdb/track-movies-tv/weighted-average-ratings/GWT2DSBYVT2F25SK): "IMDb publishes weighted vote averages rather than raw data averages. Various filters are applied to the raw data in order to eliminate and reduce attempts at vote
stuffing by people more interested in changing the current rating of a movie than giving their true opinion of it. The exact methods we use will not be disclosed. This should ensure that the policy remains effective. The result is a more accurate vote average."

The motivation behind choosing the squared error instead of the absolute error resides in the fact that reputation systems do not tolerate high error values; consequently, the squared error penalizes large errors more. Figures 8 and 9 show the comparison results between the two reputation systems over the twenty datasets. As illustrated in Figure 8, MTVRep produces the nearest reputation value to the IMDb weighted average ratings for the first ten datasets, which contain movie reviews, compared to reputation system [11]. We observe that the squared error of reputation system [11] exceeds 2.5 on datasets 1, 4, 7 and 9. We also observe that the squared error of MTVRep does not surpass 0.1 on datasets 3, 5 and 10, which implies that the system generates accurate reputation values toward movies, since the highest squared error achieved by MTVRep is 1.87 (dataset 6).

Figure 6. Reputation visualization
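The evaluation measure above takes only a few lines to compute; in this sketch the rating values are illustrative placeholders, not the actual numbers from the twenty datasets.

```python
def squared_error(x, y):
    # SE = (x - y)**2: x is the system's reputation value, y the IMDb
    # weighted average rating. Squaring penalizes large deviations
    # more heavily than the absolute error would.
    return (x - y) ** 2

def per_dataset_errors(system_outputs, imdb_ratings):
    # One squared error per dataset, for either reputation system.
    return [squared_error(x, y) for x, y in zip(system_outputs, imdb_ratings)]
```

Running `per_dataset_errors` once per system over the same IMDb reference ratings yields the two error curves compared in Figures 8 and 9.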