International Journal of Electrical and Computer Engineering (IJECE)
Vol. 9, No. 3, June 2019, pp. 2152-2163
ISSN: 2088-8708, DOI: 10.11591/ijece.v9i3.pp2152-2163

Opinion mining on newspaper headlines using SVM and NLP

Chaudhary Jashubhai Rameshbhai, Joy Paulose
Dept. of Computer Science, Christ University, India

Article history: Received Jan 4, 2018; Revised Jul 23, 2018; Accepted Dec 15, 2018

Keywords: Newspaper, Sentiment analysis, Opinion mining, NLTK, Stanford CoreNLP, SVM, SGDClassifier, Tf-idf, CountVectorizer

ABSTRACT
Opinion Mining, also known as Sentiment Analysis, is a technique that uses Natural Language Processing (NLP) to classify the sentiment expressed in text. Various NLP tools are available for processing text data, and much research on opinion mining has been done for online blogs, Twitter, Facebook, etc. This paper proposes a new opinion mining technique using a Support Vector Machine (SVM) and NLP tools on newspaper headlines. Relative words are generated using Stanford CoreNLP and passed to the SVM through a count vectorizer. On comparing three models using confusion matrices, the results indicate that Tf-idf with Linear SVM provides better accuracy for a smaller dataset, while for a larger dataset the SGD with Linear SVM model outperforms the other models.

Copyright © 2019 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author: Chaudhary Jashubhai Rameshbhai, Department of Computer Science, Christ University, Hosur Road, Bangalore, Karnataka, India 560029. Phone: +91 7405497405. Email: chaudhary.rameshbhai@cs.christuniversity.in

1. INTRODUCTION
Opinion Mining or Sentiment Analysis is the task of analyzing opinions or sentiments in textual data, and it underpins many NLP applications. With the development of applications such as network public opinion analysis, the demand for sentiment analysis and opinion mining is growing.
In today's world, most people use the internet and social media platforms to share their views. These opinions are available in various forms, such as product reviews, Facebook posts as feedback, Twitter feeds, and blogs. At present, news plays a dynamic role in shaping a person's views and opinions about any product, political party, or company. A news article published in a newspaper or shared on the web can create a negative or positive impact on society at large scale. As per Dor (1), most people judge news content by scanning only the headlines rather than going through the complete story; hence even minor headlines can have a large-scale impact. In this paper, opinion mining is performed on the headlines alone, without going through the whole articles.

The proposed method begins with data collection and preprocessing. Data are collected from different news sources using the Python newspaper package. Each news headline is manually assigned a review score of either +1 or -1 based on its sentiment. To build a classification model with SVM, the headline data are fetched and processed with CoreNLP (2). CoreNLP returns a set of relative words, which are imported into a count vectorizer (3) to generate a feature matrix.

This paper is organized as follows: Section 2 provides an overview of related work on opinion mining. Section 3 contains an elaborate explanation of the proposed method. Section 4 discusses the experimental results of the three models. Section 5 concludes and provides the future scope of the proposed method.

Journal homepage: http://iaescore.com/journals/index.php/IJECE
2. RELATED WORK
Agarwal et al. (4) proposed a method containing two algorithms: the first is used for data pre-processing, while the other detects the polarity value of a word. The Natural Language Toolkit (NLTK) and SentiWordNet are employed to build the proposed method. NLTK is a Python library for word tokenization, POS (part-of-speech) tagging, lemmatization, and stemming. The SentiWordNet (5) lexicon, a method extended from WordNet, is specifically designed for sentiment analysis: each word is assigned a positive or negative numerical score, with negative words denoted by negative scores and positive words by positive scores. The output of NLTK is fed to SentiWordNet to assign a numerical score to each word, and the final sentiment score is the sum of all the numerical scores. If the final value is greater than or equal to 0, the headline is classified as positive; otherwise it is classified as negative.

Yang et al. (6) proposed a hybrid model for analyzing sentiments of textual data within a single domain; it is implemented domain-wise due to the increase in complexity upon segregation. A single classification model is used to segregate responses as positive, negative, or neutral, and the hybrid model combines multiple single classification methods to provide more efficient classification.

Rana and Singh (7) compared two machine learning algorithms, Linear SVM and Naive Bayes, for sentiment analysis. Movie reviews were used as a dataset containing 1000 samples. The Porter Stemmer was employed to preprocess the dataset; the Rapid Miner tool was used to generate the models, with Linear SVM and Naive Bayes as classifiers. Precision, recall, and accuracy of both models were calculated, and the results show that Linear SVM gives better results than Naive Bayes.

Bakshi et al.
(8) proposed an approach to classify positive, negative, and neutral tweets on Twitter, focused on a single company, Samsung Electronics Ltd. The data are fetched and processed to clean them, and the algorithm is then applied to analyze the sentiment of the tweets and segregate them into the different categories.

Aroju et al. (9) proposed a method to perform opinion mining on three different newspapers covering similar news, using SVM and Naive Bayes. Around 105 news headlines were collected from three different sources (35 headlines each from The Hindu, The Times of India, and the Deccan Chronicle). The data were processed using a POS tagger and stemming, and the Weka tool was used to implement the method. For the experimental results, F-score, precision, and recall were calculated; the results show that The Hindu contains more positive news than The Times of India and the Deccan Chronicle.

Hasan et al. (10) proposed an algorithm that uses Naive Bayes to perform opinion mining. The data are English-language reviews from the e-commerce website Amazon, which are also translated to Bangla using Google Translate. Opinion mining is performed for the reviews in both Bangla and English. The Bangla dataset translated from English contains noise, so the Bangla reviews used as training data are fed into Naive Bayes to build the classifier with noisy words excluded.

Akkineni et al. (11) proposed a method of classifying opinions based on the subject of the opinion and the objective of holding it; this helps classify whether a sentence is a fact or an opinion. The approaches adopted for classification range over: a heuristic approach, which delivers results within a realistic time-frame.
Such heuristics are likely to produce results on their own but are mostly used with optimized algorithms; discourse structure, which looks beyond the message a given text communicates and links it to how that message constructs a social reality or view of the world; keyword analysis, which classifies text into affect categories based on the presence of unambiguous affect words such as happy, sad, afraid, and bored; and concept analysis, which concentrates on semantic analysis of text through web ontologies or semantic networks, aggregating the conceptual and affective information associated with natural language opinions.

Arora et al. (12) proposed Cross BOMEST, a cross-domain sentiment classification method. The existing BOMEST method retrieves positive words from a content, followed by determination of positive words with the assistance of MS Word Introp; to escalate the polarity, it replaces every word with its synonym. Moreover, it helps in blending two different domains and detecting self-sufficient words. The proposed method was implemented and tested on an Amazon dataset: a total of 1500 product reviews were randomly selected across positive and negative polarity, of which 1000 were used for training and the remainder to test the classification model. When applied cross-domain, an accuracy of 92% is achieved; for a single domain, the precision and recall of BOMEST are improved by 16% and 7%. Thus, Cross BOMEST improves precision and accuracy by 5% compared with other existing techniques.

Susanti et al. (13) employ a Multinomial Naïve Bayes Tree (MNBTree), a combination of Multinomial Naïve Bayes and a Decision Tree; the technique is used in data mining for the classification of raw data. The Multinomial Naïve Bayes method is used specifically to address frequency calculation in the text of a sentence or document. The documents used in this study are comments of Twitter users on GSM telecommunications providers in Indonesia, and the method is used to categorize customer sentiment towards those providers. The sentiment analysis consists only of positive, negative, and neutral classes. The decision tree generated with this method is rooted at the feature "aktif", whose probability belongs to the positive class. The results and analysis indicate that the highest classification accuracy of the MNBTree method is 16.26% when using 145 features, while Multinomial Naïve Bayes (MNB) yields its highest accuracy of 73.15% when using the full dataset of 1665 features.

In this type of research, selecting appropriate features is one of the challenges. Many researchers use decision trees and n-gram approaches for feature selection, and supervised machine learning techniques for model building. The tedious job in this type of research is data preprocessing; most researchers use NLP tools for it (14), (15), (16), (17), (18). In this paper, n-grams and CoreNLP are used for feature selection, and Linear SVM is used for model building.

3. PROPOSED WORK
Many algorithms are available for finding sentiment in text, but they are applied to large text datasets such as movie reviews and product reviews. Finding opinions in news headlines is also possible, but the accuracy of the existing algorithm is not satisfactory. This paper tries to improve the accuracy of the existing algorithm (4) using a different approach. The proposed method is divided into three processes. As shown in Figure 1, Process I covers data collection and pre-processing, Process II is the core fragment that builds the classifier, and Process III tests the classification model on test data.
These processes are discussed in detail below.

Figure 1. Process diagram (Data Pre-processing → Model Building → Model Evaluating)

3.1. Data Pre-processing and Model Building
In this process, data pre-processing and model building are implemented; Figure 2 depicts the basic flowchart. Stop words are removed from the news headlines, and all uppercase text is converted to lowercase. The semi-processed headlines are fed to CoreNLP, and its output, together with the sentiment scores, is set as the input for Process II. That input is converted into unigram and bi-gram representations, from which Model A is generated; Model A employs Linear SVM. The data representation is further converted into Tf-idf, resulting in Models B and C. Both Models B and C employ Linear SVM; however, unlike Model B, Model C uses the SGD classifier to train the data.

Figure 2. Flow diagram of data pre-processing and model building (news headlines dataset with sentiment scores → remove all stop words from headlines → convert all headlines to lowercase → provide lowercase headlines as input to CoreNLP → pre-processed news headlines with sentiment scores → unigram and bi-gram representation → Model A (SVM); with Tf-idf term weighting → Model B (SVM) and Model C (SVM + SGD))
Data were collected from the website http://www.indianexpress.com during August 2017. Figure 3 depicts sample unprocessed data. 1472 news headlines were collected and manually classified as either +1 or -1: for positive news the sentiment score is +1, while for negative news it is -1. Subsequently, after allocating the sentiment score, all stop words such as "is", "the", and "are" are eliminated from the headlines; the NLTK STOPWORDS list is used to perform this task. Because the algorithm is case sensitive, all headlines are converted to lowercase. The data are then fed to Stanford CoreNLP to generate the dependency parse. The Stanford CoreNLP dependency parser (19) checks the grammatical construction of a sentence and establishes the relation between the "ROOT" word and its modifying words. To understand how CoreNLP works, consider the news headline "Two killed in car bomb in Iraq Kirkuk". Figure 4 depicts the parsing of this sample with the dependency parser: the "ROOT" word of the headline is returned along with the relation between each word. From the sample data, "killed" is returned as the "ROOT" word, together with the relations between all the words. Figure 5 shows the parser output converted into a string array consisting of the "ROOT" word and the relative words. This process is applied to all the data to generate arrays of strings with the "ROOT" word and relative words. Figure 6 is a statistical representation of word frequency in the data, and Figure 7 depicts sample data with sentiment scores. The processed data are the input for Process II.

Figure 3. Dataset screenshot
Figure 4. Dependency parser working diagram
Figure 5. Generating a pandas dataframe using CoreNLP
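The lowercasing and stop-word removal steps of Process I can be sketched as follows. The paper uses the NLTK STOPWORDS list; to keep this example self-contained, a small hand-picked subset of stop words is hard-coded here instead.

```python
# Illustrative sketch of the Process I preprocessing: lowercase the headline,
# then drop stop words. STOP_WORDS is a tiny illustrative subset, not the
# full NLTK list used in the paper.
STOP_WORDS = {"is", "the", "are", "in", "to", "of", "a", "an"}

def preprocess(headline):
    """Lowercase a headline and remove stop words, keeping word order."""
    words = headline.lower().split()
    return " ".join(w for w in words if w not in STOP_WORDS)

print(preprocess("Two killed in car bomb in Iraq Kirkuk"))
# two killed car bomb iraq kirkuk
```

The cleaned string is what would then be handed to the CoreNLP dependency parser.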
Figure 6. Most frequently occurring words in the dataset
Figure 7. Sample dataset after applying CoreNLP

3.2. Unigram and Bi-gram Representation of Data
Before building the model, the raw data are converted from strings to numerical values. Text analysis is a key application area of machine learning, but most algorithms accept fixed-size numerical data rather than text data of varying size. A common approach uses a document-term vector, where each individual document is encoded as a discrete vector that counts the occurrences of each word of the vocabulary it contains (3). For example, consider two one-sentence documents:

D1: "I like Google Machine Learning course"
D2: "Machine Learning is awesome"

The vocabulary is V = {I, like, Google, Machine, Learning, course, is, awesome}, and the two documents can be encoded as vectors v1 and v2. Figures 8 and 9 show the representation of the given sentences in the unigram and bi-gram models. The bi-gram model refines the data representation by counting occurrences of sequences of two words rather than individual words.
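The encoding of D1 and D2 can be made explicit with a few lines of plain Python; a CountVectorizer performs the same counting internally (though it orders its vocabulary alphabetically rather than by first appearance, as done here for readability).

```python
# Sketch of the document-term encoding for D1 and D2: build the vocabulary V,
# then turn each document into a vector of per-word counts over V.
from collections import Counter

docs = ["I like Google Machine Learning course",
        "Machine Learning is awesome"]

# Vocabulary in first-seen order across both documents.
vocab = []
for doc in docs:
    for word in doc.split():
        if word not in vocab:
            vocab.append(word)

# Each document becomes a count vector over the vocabulary.
vectors = [[Counter(doc.split())[w] for w in vocab] for doc in docs]

print(vocab)       # ['I', 'like', 'Google', 'Machine', 'Learning', 'course', 'is', 'awesome']
print(vectors[0])  # v1 = [1, 1, 1, 1, 1, 1, 0, 0]
print(vectors[1])  # v2 = [0, 0, 0, 1, 1, 0, 1, 1]
```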
Figure 8. Unigram representation of the given vocabulary
Figure 9. Bi-gram representation of the given vocabulary

Figure 10 shows a snippet of code that generates the unigram and bi-gram representations of given data, where the data are an array of 4 news headlines. Here, CountVectorizer() is used to convert string data into numeric values: to generate the unigram model, pass the argument ngram_range=(1, 1), and pass ngram_range=(1, 2) for bi-grams. The figure contains two matrices generated using the pandas library; the columns represent unique words (31 for unigram, 61 for bi-gram) and the rows represent the 4 news headlines. If a word occurs once in a particular headline, the value of that feature is 1; if it occurs twice, the value is 2. The value depends on the frequency of the word in the headline. This method is used when the models are built in Process II.

data = ["Coal Burying Goa: What the toxic train leaves in its wake",
        "Coal Burying Goa: Lives touched by coal",
        "Coal burying Goa: Danger ahead, new coal corridor is coming up",
        "Goa mining: Supreme Court issues notices to Centre, state government"]

clf = CountVectorizer(ngram_range=(1, 1))
df = clf.fit_transform(data).toarray()
pd.DataFrame(df)

   0  1  2  3 ...
0  0  1  0  0 ...
1  0  1  1  0 ...
2  1  1  0  0 ...
3  0  0  0  1 ...
4 rows * 31 columns

clf = CountVectorizer(ngram_range=(1, 2))
df = clf.fit_transform(data).toarray()
pd.DataFrame(df)

   0  1  2  3 ...
0  0  1  0  0 ...
1  0  1  1  0 ...
2  1  1  0  0 ...
3  0  0  0  1 ...
4 rows * 61 columns

Figure 10.
Snippet of Python code to generate unigram and bi-gram representations from given string data.
3.3. Model A. Linear SVM
In this process, Linear SVM is used to build the model, and the data used to build it are numeric. The description of the total dataset is shown in Table 1. The matrices in Figures 11 and 12 were generated using unigrams and bi-grams; the columns show the words in the headlines and the rows represent the headlines. A value of 0 in the matrix means that the particular word does not exist in that headline. In the unigram model, the number of features depends on the total number of unique words in the dataset.

Table 1. Dataset Description
Model     Total Sample   Total Feature
Unigram   1472           4497
Bi-gram   1472           13832

Figure 11. Dataset representation in the unigram model
Figure 12. Dataset representation in the bi-gram model

Table 1 shows the total number of features and samples for the unigram and bi-gram models. The total sample size is the number of news headlines, and the total feature size is the number of unique words in the dataset. The total sample size is the same for both models, but the unigram model has 4497 features versus 13832 for the bi-gram model. For building this model, 80% of the data are used as the training set and 20% for evaluating the model. The kernel is linear because there are two class labels, so the SVM generates a linear hyperplane that separates all the negative news headline words from the positive ones.

3.4. Model B. Tf-idf and Linear SVM
Linear SVM is also used to build this model, but the dataset is first converted into document frequencies using Tf-idf. Tf is term frequency, and Tf-idf (3) is term frequency times inverse document frequency; it is used to classify documents. The main aim of Tf-idf is to calculate the importance of a word in a given headline with respect to the overall occurrence of that word in the dataset: the importance of a word is high if it is frequent in the headline but infrequent across the headlines overall.
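The training setup of Models A and B can be sketched as below. This is a minimal illustration, not the paper's implementation: the eight toy headlines and labels are invented stand-ins for the 1472-headline dataset, and the 80/20 split follows the description above.

```python
# Minimal sketch of Model A (unigram/bi-gram counts + Linear SVM) and
# Model B (Tf-idf features + Linear SVM) with an 80/20 train/test split.
# The headlines and labels below are illustrative, not the paper's data.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

headlines = ["markets rally on strong earnings", "team wins championship title",
             "festival brings joy to city", "economy grows faster than expected",
             "two killed in car bomb attack", "floods destroy hundreds of homes",
             "company fined for toxic spill", "violent protest turns deadly"]
labels = [1, 1, 1, 1, -1, -1, -1, -1]

# Model A: unigram + bi-gram counts fed to a linear SVM.
vec = CountVectorizer(ngram_range=(1, 2))
X = vec.fit_transform(headlines)
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.2, random_state=1)
model_a = LinearSVC().fit(X_tr, y_tr)

# Model B: the same classifier trained on Tf-idf features instead of raw counts.
X_tfidf = TfidfVectorizer(ngram_range=(1, 2)).fit_transform(headlines)
model_b = LinearSVC().fit(X_tfidf, labels)

print(model_a.predict(X_te))
```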
Tf-idf can be calculated as follows (3):

tf-idf(t, d) = tf(t, d) × idf(t)    (1)
where tf(t, d) is the term frequency in a particular headline, i.e. the number of times the term occurs in that headline, which is multiplied by idf(t):

idf(t) = 1 + log((1 + n_d) / (1 + df(d, t)))    (2)

where n_d is the total number of headlines and df(d, t) is the number of headlines that contain term t. The resulting Tf-idf vectors are then normalized by the Euclidean norm:

v_norm = v / ||v||_2 = v / sqrt(v_1^2 + v_2^2 + ... + v_n^2)    (3)

For example, consider the following term-count array:

data_counts = [[3, 0, 1],
               [2, 0, 0],
               [3, 0, 0],
               [4, 0, 0],
               [3, 2, 0],
               [3, 0, 2]]

Tf-idf is first computed for the first term of the first document using the unsmoothed idf, idf(t) = 1 + log(n_d / df(d, t)); the smoothed form of Eq. (2) is applied afterwards:

Total number of documents: n_d = 6
Number of documents containing term 1: df(d, t)_term1 = 6
idf for term 1: idf_term1 = log(6/6) + 1 = 1
tf-idf_term1 = tf_term1 × idf_term1 = 3 × 1 = 3

Similarly for the other two terms (term 2 occurs in 1 document, term 3 in 2):

tf-idf_term2 = 0 × (log(6/1) + 1) = 0
tf-idf_term3 = 1 × (log(6/2) + 1) = 2.0986

Representing the Tf-idf values as a vector:

tf-idf_raw = [3, 0, 2.0986]

After applying the Euclidean norm:

[3, 0, 2.0986] / sqrt(3^2 + 0^2 + 2.0986^2) = [0.819, 0, 0.573]
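The worked example above can be re-derived in a few lines of plain Python (using the unsmoothed idf, log(n_d / df) + 1, and Euclidean normalization):

```python
# Re-derive the unsmoothed Tf-idf values for the first document of
# data_counts: idf(t) = ln(n_d / df(t)) + 1, then divide by the L2 norm.
import math

data_counts = [[3, 0, 1], [2, 0, 0], [3, 0, 0], [4, 0, 0], [3, 2, 0], [3, 0, 2]]
n_d = len(data_counts)                                                # 6 documents
df = [sum(1 for doc in data_counts if doc[t] > 0) for t in range(3)]  # [6, 1, 2]

doc1 = data_counts[0]
tfidf = [tf * (math.log(n_d / df[t]) + 1) for t, tf in enumerate(doc1)]
norm = math.sqrt(sum(v * v for v in tfidf))
normalized = [round(v / norm, 3) for v in tfidf]

print([round(v, 4) for v in tfidf])  # [3.0, 0.0, 2.0986]
print(normalized)                    # [0.819, 0.0, 0.573]
```

The printed values match the hand computation above.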
In idf(t), to avoid division by zero, smooth_idf=True adds 1 to the numerator and the denominator. After this modification the first two term values remain the same, but the term-3 value changes to 1.8473:

tf-idf_term3 = 1 × (log(7/3) + 1) = 1.8473

[3, 0, 1.8473] / sqrt(3^2 + 0^2 + 1.8473^2) = [0.8515, 0, 0.5243]

Calculating every value in the data_counts array in the same way gives the final output:

Tfidf = TfidfTransformer()
X = Tfidf.fit_transform(data_counts)
X.toarray()

array([[0.8515, 0.0000, 0.5243],
       [1.0000, 0.0000, 0.0000],
       [1.0000, 0.0000, 0.0000],
       [1.0000, 0.0000, 0.0000],
       [0.5542, 0.8323, 0.0000],
       [0.6303, 0.0000, 0.7763]])

The total number of samples and features remains the same for building this model, but the weight of each word is changed according to Tf-idf. Unlike the previous model, words that occur frequently across headlines receive lower values, meaning such words have less impact on the model. For training, this model uses 80% of the data, and 20% for testing.

3.5. Model C. Stochastic Gradient Descent (SGD) Classifier
SGD is used to train the data for the Linear SVM. SGD (3) is a simple and very efficient discriminative learning method for linear classifiers such as SVM and Logistic Regression. It has been effectively applied to machine learning problems on numeric data, mainly in text categorization and NLP. Since the data provided are sparse, the classifiers in SGD scale efficiently to problems with more than 10^5 training samples and more than 10^5 attributes. The major advantage of SGD is that it can handle large datasets.
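A minimal sketch of Model C follows. With loss="hinge", scikit-learn's SGDClassifier optimizes the same objective as a linear SVM, trained by stochastic gradient descent; the toy headlines and labels are illustrative stand-ins, not the paper's dataset.

```python
# Minimal sketch of Model C: Tf-idf features trained with SGDClassifier.
# loss="hinge" makes SGD fit a linear SVM; SGD scales to very large
# feature spaces. Data below are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier

headlines = ["markets rally on strong earnings", "team wins championship",
             "two killed in car bomb attack", "floods destroy homes"]
labels = [1, 1, -1, -1]

X = TfidfVectorizer(ngram_range=(1, 2)).fit_transform(headlines)
model_c = SGDClassifier(loss="hinge", random_state=42).fit(X, labels)

print(model_c.predict(X))
```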
Here, a small dataset has been used for this research problem, but the approach can be extended up to 10^5 features.

4. MODEL EVALUATING
The results of the three models are compared, with the confusion matrix used as the performance metric. Table 2 describes the structure of the confusion matrix, and the formula for model accuracy is shown in Eq. (4). Tables 3, 4, and 5 are the confusion matrices of Models A, B, and C, and Table 6 shows the accuracy scores of the three models. The results imply that bi-grams give more accurate results than unigrams; however, the unigram model has fewer features than the bi-gram model, so building the unigram model takes less time. The accuracy of Model B is higher than that of Model A because it is trained with Tf-idf. With an increase in feature size (> 20000), Models A and B will not provide feasible solutions. To overcome this issue, Model C is introduced in this paper; it is trained using SGD, which supports up to 10^5 features (3) for building a model. Thus Model C can be used when the feature size is high, while Model B works well when the feature size is small.
Figure 13. Models evaluation (news headlines dataset without sentiment scores is passed to Models A, B, and C; a confusion matrix is generated for each to obtain the result of each model)

Table 2. Confusion Matrix
                     PREDICTED
                     TRUE    FALSE
ACTUAL   TRUE        TP      FN
         FALSE       FP      TN

Accuracy Score = (TP + TN) / (TP + TN + FP + FN) × 100%    (4)

Table 3. Model A (Linear SVM) Confusion Matrix
Unigram Model
                     PREDICTED
                     TRUE    FALSE
ACTUAL   TRUE        223     29
         FALSE       9       34

Bi-gram Model
                     PREDICTED
                     TRUE    FALSE
ACTUAL   TRUE        228     27
         FALSE       4       36

Table 4. Model B (Linear SVM + Tf-idf) Confusion Matrix
Unigram Model
                     PREDICTED
                     TRUE    FALSE
ACTUAL   TRUE        227     22
         FALSE       5       41

Bi-gram Model
                     PREDICTED
                     TRUE    FALSE
ACTUAL   TRUE        228     21
         FALSE       4       42
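Applying Eq. (4) to the confusion matrices in Tables 3 and 4 recovers each model's accuracy score:

```python
# Accuracy from a confusion matrix, per Eq. (4):
# (TP + TN) / (TP + TN + FP + FN) * 100, rounded to two decimals.
def accuracy(tp, fn, fp, tn):
    return round(100 * (tp + tn) / (tp + tn + fp + fn), 2)

print(accuracy(223, 29, 9, 34))   # Model A, unigram: 87.12
print(accuracy(228, 27, 4, 36))   # Model A, bi-gram: 89.49
print(accuracy(227, 22, 5, 41))   # Model B, unigram: 90.85
print(accuracy(228, 21, 4, 42))   # Model B, bi-gram: 91.53
```

These figures confirm that the bi-gram variants outperform their unigram counterparts, and that Model B outperforms Model A.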