International Journal of Electrical and Computer Engineering (IJECE)
Vol. 7, No. 2, April 2017, pp. 1071-1087
ISSN: 2088-8708, DOI: 10.11591/ijece.v7i2.pp1071-1087

Credal Fusion of Classifications for Noisy and Uncertain Data

Fatma Karem (1), Mounir Dhibi (2), Arnaud Martin (3), and Med Salim Bouhlel (4)
(1,4) Research Unit SETIT, Higher Institute of Biotechnology, Sfax 3038, Tunisia
(2) Research Unit PMI 09/UR/13-0, Zarroug ISSAT, Gafsa 2112, Tunisia
(3) University of Rennes 1, UMR 6074 IRISA, rue Edouard Branly, BP 30219, 22302 Lannion Cedex, France

Article history: Received Oct 24, 2016; Revised Feb 8, 2017; Accepted Feb 22, 2017

Keywords: Clustering, Classification, Combination, Belief function theory, Noise

ABSTRACT
This paper reports on an investigation of classification techniques used to classify noisy and uncertain data. Classification is not an easy task, and discovering knowledge from uncertain data is a significant challenge. Several problems arise. Often, a good or large learning database is not available for supervised classification. When the training data contain noise or missing values, classification accuracy is affected dramatically. Extracting groups from such data is also difficult: the groups overlap and are not well separated from each other. A further problem is the uncertainty introduced by measuring devices. Consequently, the classification model is not robust enough to classify new objects. In this work, we present a novel classification algorithm that addresses these problems. Our main idea is to use belief function theory to combine classification and clustering, since this theory handles very well the imprecision and uncertainty linked to classification. Experimental results show that our approach can significantly improve the quality of classification on generic databases.

Copyright (c) 2017 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author:
Fatma Karem
Research Unit SETIT, Higher Institute of Biotechnology, Sfax 3038, Tunisia
60 Pyramids Street, Assala, Gafsa 2100, Tunisia
Phone: +216.27.951.381
Email: fatoumakarem@gmail.com

1. INTRODUCTION
There are two broad families of classification techniques: supervised and unsupervised. Supervised classification is the essential tool for extracting quantitative information based on a learning database; the extracted features are associated with labeled examples, and new objects are classified by measuring their similarity to the learning database. The second technique builds clusters by optimizing essentially two criteria, compactness and separation [1],[2]: it tries to form clusters that are as compact and as well separated as possible. Grouping data is not straightforward. Firstly, clusters overlap most of the time. Secondly, the data to classify are generally very complex. Moreover, there is no unique quality criterion to measure the goodness of a classification. A validity index is generally used to measure the quality of a clustering, but there is still no universal standard index; it varies from one application to another. In addition, the data to classify are not always correct, especially in real applications: they can be uncertain or ambiguous, since they depend on acquisition devices or expert opinions.
Consequently, the result of the classification will be uncertain. Besides, labeled examples for training are sometimes not available. Because of these limits, and with the objective of improving the classification process, we propose to combine classification and clustering. This combination, also called a fusion procedure, aims to take advantage of the complementarity between the two. Clustering is used to overcome the problems of learning and over-fitting. The combination is performed with belief function theory, which is well known for treating problems of uncertainty and imprecision. In this paper, we report our recent research efforts toward this goal. First, we present the basic concepts of belief function theory. Then, we propose a novel classification mechanism based on combination. The new process aims to improve classification results in the presence of noise and missing data. We conduct experiments on generic data to show the quality of the data mining results. The rest of this paper is organized as follows.
Related work on noise handling is discussed in subsection 2.2. Section 3 describes the details of the proposed fusion mechanism. Experimental results and discussion are presented in Section 4. The final section concludes the paper and outlines future work.

2. THEORETICAL BASIS
We present here the belief function theory, which is our framework for information fusion. Then, we review some existing work on the fusion of classifications.

2.1. Belief function theory
Fusion is a process that combines multiple data or pieces of information coming from different sources in order to make a decision; the final decision is better than the individual ones, and the variety of the information involved in the combination is what creates the added value. Combination is needed in problems where ambiguity and uncertainty are large: we may be unable to make an individual decision, and to resolve the ambiguity we must fuse. Applications requiring fusion are numerous. We find medicine [3],[4], for example: it is sometimes difficult to make a good disease diagnosis individually, and it is better to fuse the opinions of several doctors; tumor detection is the best-known application. We also find image processing [5],[6], classification [7],[8], remote sensing, artificial intelligence and pattern recognition [9]. The means of combination are multiple as well; they are called uncertainty theories. We find voting theory, possibility theory, probability theory and belief function theory. The latter is robust against uncertainty and imprecision problems. The theory was introduced by Dempster in 1967 and developed by Shafer; it is also called Dempster-Shafer theory [10],[11].

Belief function theory models the belief in an event by a function called the mass function. We denote by m_j the mass function of the source S_j. It is defined on the set 2^\Theta, takes its values in [0,1] and satisfies the constraint:

    \sum_{A \in 2^\Theta} m_j(A) = 1    (1)

2^\Theta is the set of decisions or class disjunctions C_i if we talk about classification: 2^\Theta = \{\emptyset, \{C_1\}, \{C_2\}, \{C_1 \cup C_2\}, \ldots, \Theta\}. The subsets A of \Theta satisfying m(A) > 0 are called focal elements, and the set of focal elements is called the kernel. m(A) is a measure of the evidence allocated exactly to the hypothesis X \in A. The classes C_i must be exclusive but not necessarily exhaustive. Belief function theory measures imprecision and uncertainty through several functions, such as credibility and plausibility. Credibility is the minimum belief; it takes into account the conflict between the sources. It is defined by:

    Cr_j(X) = \sum_{Y \subseteq X,\ Y \neq \emptyset} m_j(Y)    (2)

The plausibility function measures the maximal belief in X \in 2^\Theta. We suppose that the set of decisions is complete, so we are in a closed frame of discernment:

    Pl_j(X) = \sum_{Y \in 2^\Theta,\ Y \cap X \neq \emptyset} m_j(Y) = Cr_j(\Theta) - Cr_j(X^c) = 1 - m_j(\emptyset) - Cr_j(X^c)    (3)

where X^c is the complement of X. To represent a problem with the concepts of belief function theory, we should follow three steps: modelling, combination and decision. There is an intermediate step, discounting, which can be done before or after the combination; it measures the reliability of the sources through a reliability coefficient denoted \alpha. The first step is the most crucial: we must choose a suitable model to represent the mass functions. It depends on the context and the application and can be computed in many ways; the main models are probabilistic and distance-based models.
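To make these definitions concrete, here is a minimal Python sketch (our illustration, not code from the paper) that represents a mass function as a dictionary keyed by frozensets of class labels and computes the credibility (2) and plausibility (3) of a subset X. The same dictionary representation is reused in the later snippets.

```python
def credibility(m, X):
    """Cr(X): total mass of the non-empty subsets of X (equation 2)."""
    return sum(v for A, v in m.items() if A and A <= X)

def plausibility(m, X):
    """Pl(X): total mass of the focal elements intersecting X (equation 3)."""
    return sum(v for A, v in m.items() if A & X)

# Example on the frame Theta = {C1, C2, C3}; the masses sum to 1 (equation 1).
theta = frozenset({"C1", "C2", "C3"})
m = {frozenset({"C1"}): 0.5,           # evidence committed exactly to C1
     frozenset({"C1", "C2"}): 0.25,    # imprecise evidence: C1 or C2
     theta: 0.25}                      # total ignorance

X = frozenset({"C1"})
print(credibility(m, X), plausibility(m, X))   # 0.5 1.0
```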
For the second step, the combination can be done with different operators, and the choice of a suitable operator depends on the context. Several hypotheses govern this choice, such as the independence and the reliability of the sources. Common operators include the conjunctive, disjunctive and cautious rules [12]. The first assumes that the sources are independent and reliable, whereas the second assumes that at least one of them is reliable. The cautious rule does not impose the independence hypothesis on the sources, so it allows dependence and redundancy. This situation may be encountered in practice: experts may share some information, for example.
Classifiers may also be trained on the same learning sets, or on sets that are not separate. The conjunctive combination fuses by considering the intersections between the elements of 2^\Theta. It reduces the imprecision of the focal elements and increases the belief in the elements on which the sources agree. If we have M mass functions to combine, we have the following formula:

    m(A) = (m_1 \cap m_2 \cap \cdots \cap m_M)(A) = \sum_{B_1 \cap B_2 \cap \cdots \cap B_M = A} \ \prod_{j=1}^{M} m_j(B_j)    (4)

The cautious rule is defined as follows:

    m_{1 \wedge 2} = m_1 \wedge m_2 = \bigcap_{A \subset \Theta} A^{w_1(A) \wedge w_2(A)} = \bigcap_{A \subset \Theta} A^{w_{1 \wedge 2}(A)}    (5)

where A^w denotes the simple mass function assigning 1 - w to A and w to \Theta, \wedge is the minimum, and \cap denotes the conjunctive combination. m_{1 \wedge 2} is the information gained from the two sources S_1 and S_2. It must be more informative than m_1 and m_2; formally, m_{1 \wedge 2} \in S(m_1) \cap S(m_2), where S(m) is the set of mass functions more informative than m. To select a unique mass function, we apply the least commitment principle (LCP): if several mass functions are compatible with a set of constraints, the least informative one in S(m_1) \cap S(m_2) should be selected. This element is unique; it is the non-dogmatic mass function (m(\Theta) > 0) m_{1 \wedge 2} with the following weight function:

    w_{1 \wedge 2}(A) := w_1(A) \wedge w_2(A), \quad \forall A \subset \Theta    (6)

w(A) is a representation of a non-dogmatic mass function (a simple mass function); it may be computed from m as follows:

    w(A) := \prod_{B \supseteq A} q(B)^{(-1)^{|B| - |A| + 1}}, \quad \forall A \subset \Theta    (7)

where q is the commonality function defined as:

    q(A) := \sum_{B \supseteq A} m(B), \quad \forall A \subseteq \Theta    (8)

To apply this principle, an informational ordering between mass functions has to be chosen. Several orderings can be used, such as the q-ordering and the w-ordering. The first one states that m_1 is q-more committed (more informative) than m_2, denoted

    m_1 \sqsubseteq_q m_2    (9)

if it satisfies the following constraint:

    q_1(A) \leq q_2(A), \quad \forall A \subseteq \Theta    (10)

The second one is based on the conjunctive weight function: m_1 is w-more committed than m_2 (denoted m_1 \sqsubseteq_w m_2) if it satisfies the following constraint:

    w_1(A) \leq w_2(A), \quad \forall A \subset \Theta    (11)
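The following sketch (our illustration, using the same dictionary representation as above) implements the unnormalised conjunctive rule (4) and the cautious rule (5)-(8) for non-dogmatic mass functions. It is a didactic version under these assumptions, not the authors' implementation.

```python
from itertools import combinations
from functools import reduce

def subsets(theta):
    """All subsets of the frame Theta, as frozensets."""
    s = list(theta)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def conjunctive(m1, m2):
    """Unnormalised conjunctive rule (Eq. 4): products of masses go to intersections."""
    out = {}
    for A, v1 in m1.items():
        for B, v2 in m2.items():
            out[A & B] = out.get(A & B, 0.0) + v1 * v2
    return out

def commonality(m, theta):
    """q(A) = sum of m(B) over the B containing A (Eq. 8)."""
    return {A: sum(v for B, v in m.items() if A <= B) for A in subsets(theta)}

def weights(m, theta):
    """Conjunctive weights w(A), A != Theta, for a non-dogmatic m (Eq. 7)."""
    q = commonality(m, theta)
    w = {}
    for A in subsets(theta):
        if A == theta:
            continue
        w[A] = 1.0
        for B in subsets(theta):
            if A <= B:
                w[A] *= q[B] ** ((-1) ** (len(B) - len(A) + 1))
    return w

def cautious(m1, m2, theta):
    """Cautious rule (Eqs. 5-6): take the minimum weight, then recombine conjunctively."""
    w1, w2 = weights(m1, theta), weights(m2, theta)
    simple = [{A: 1.0 - min(w1[A], w2[A]), theta: min(w1[A], w2[A])} for A in w1]
    return reduce(conjunctive, simple, {theta: 1.0})

theta = frozenset({"C1", "C2"})
m = {frozenset({"C1"}): 0.6, theta: 0.4}
print(cautious(m, m, theta))   # idempotent: combining m with itself gives back m (plus zero-mass entries)
```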
After calculating the mass functions and combining them, we obtain the masses relative to the different elements of the frame of discernment. If we have to classify, we must finally make a decision and assign a class. The decision is made according to a criterion, and several criteria exist; we mention the maximum of plausibility, the maximum of credibility and the pignistic probability. With the first criterion, we choose the singleton (class) C_i giving the maximum of plausibility: for an object (vector) x, we decide C_i if

    Pl(C_i)(x) = \max_{1 \leq k \leq n} Pl(C_k)(x)    (12)

This criterion is optimistic, because the plausibility of a singleton measures the belief obtained if all the masses of the disjunctions containing it were focused on it. The second criterion chooses C_i for x if it gives the maximum credibility:

    Cr(C_i)(x) = \max_{1 \leq k \leq n} Cr(C_k)(x)    (13)

This criterion is more selective, because the credibility function gives the minimum belief committed to a decision. The third criterion lies between the two; it brings credibility and plausibility closer together. For a class C_i, the pignistic probability is defined as:

    bet(C_i) = \sum_{A \in 2^\Theta,\ C_i \in A} \frac{m(A)}{|A| \, (1 - m(\emptyset))}    (14)

where |A| is the cardinality of A. The maximum of pignistic probability decides C_i for an observation x if:

    bet(C_i)(x) = \max_{1 \leq k \leq n} bet(C_k)(x)    (15)

This criterion is better adapted to a probabilistic context.
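As an illustration of these decision criteria (a sketch reusing the dictionary representation and the credibility/plausibility helpers above; not code from the paper), the pignistic transform (14) and the three decision rules (12), (13), (15) can be written as:

```python
def pignistic(m, theta):
    """Pignistic probability of each singleton class (Eq. 14)."""
    conflict = m.get(frozenset(), 0.0)               # mass of the empty set
    bet = {c: 0.0 for c in theta}
    for A, v in m.items():
        for c in A:
            bet[c] += v / (len(A) * (1.0 - conflict))
    return bet

def decide(m, theta, rule="pignistic"):
    """Assign the singleton class maximising the chosen criterion (Eqs. 12, 13, 15)."""
    scores = {
        "plausibility": {c: plausibility(m, frozenset({c})) for c in theta},
        "credibility":  {c: credibility(m, frozenset({c})) for c in theta},
        "pignistic":    pignistic(m, theta),
    }[rule]
    return max(scores, key=scores.get)

# Masses favouring C1, with some ambiguity between C1 and C2 and some ignorance.
theta = frozenset({"C1", "C2", "C3"})
m = {frozenset({"C1"}): 0.4, frozenset({"C1", "C2"}): 0.4, theta: 0.2}
print(decide(m, theta))   # 'C1'
```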
In the next section, we present some works related to the combination of classifications.

2.2. Related works
Much research has been done on fusion in classification. Most of it concerns either clustering [13, 14, 15, 16, 17] or classification [18, 11]. Some works deploy combination to improve classification performance; others use fusion to construct a new classifier, such as a neural network based on belief functions [19], a credal K-nearest neighbors classifier [20] or a credal decision tree [21]. In [19], the study presents a solution to the problems of the Bayesian model. The conditional densities and the a priori probabilities of the classes are unknown; they can be estimated from learning samples, but the estimation is not reliable, especially if the learning database is small. Moreover, the Bayesian model cannot represent very well the uncertainty connected to the class membership of new objects. If we have few labeled examples and have to classify a new object which is very dissimilar from the others, the uncertainty will be large. This state of ignorance is not reflected by the outputs of a statistical classifier. This situation is met in many real applications, such as disease diagnosis in medicine. The study therefore tries to measure the uncertainty attached to the class of a new object given the information carried by the learning data. For a new object to classify, we focus on its neighbors, which are considered as evidence elements, or hypotheses, about the class membership of the new object. Masses are assigned to each class for each neighbor of the object to classify. The beliefs are represented by basic belief assignments and combined with Dempster-Shafer theory to decide to which class the object belongs. The method does not depend strongly on the number of neighbors. In [21], binary decision trees (classifying two classes) are combined to solve a multi-class problem using belief function theory. Classic decision trees are based on probabilities; they are not always suitable for problems involving uncertainty, since the uncertainty of the inputs and outputs cannot be modelled very well by probability, and a good learning database is not always available. The research proposes an extension of a previous study in which a belief-function-based decision tree solved a two-class problem: the new study treats the multi-class problem by combining two-class decision trees with evidence theory. In [22], two supervised classifiers, Support Vector Machines and K-nearest neighbors, are combined with the aim of improving classification performance. Each of them has disadvantages: SVM, for example, depends strongly on the learning samples and is sensitive to noise and outliers; KNN is a statistical classifier which is also sensitive to noise. A new hybrid algorithm is proposed to overcome the limits of both classifiers. Concerning the combination of clusterings, much research has also been done. In [13], a novel classifier is proposed based on a collaboration between several clustering techniques. The collaboration takes place in three stages: parallel and independent clusterings, refinement and evaluation of the results, and unification. The second stage is the most difficult: a correspondence between the clusters obtained by the different classifiers is sought, and conflicts between the results may be found. An iterative resolution of the conflicts is performed in order to obtain a similar number of clusters; the possible actions to solve the conflicts are the fusion, deletion and splitting of clusters. After that, the results are unified through a voting technique. This combination was used to analyze multi-source images, where fusion was needed because the sources are heterogeneous. In [23], several techniques of clustering collaboration are presented. They differ by the type of result, which can be a unique partition of the data or an ensemble of clustering results.
For the first type of result, fusion techniques of classification are used. For the second, multi-objective clustering methods are used: they try to optimize several criteria simultaneously, and at the end of the process a set of results is obtained, the best one being a compromise between the criteria to be optimized. Concerning the fusion between clustering and classification, many works deploy clustering in the learning phase of supervised classification [24],[25],[26].

3. RESEARCH METHOD
This work is an improvement of a previous one. The former [27] was established to combine clustering and classification in order to improve their performance, since both have difficulties: for clustering, essentially the problems of complex data and validity indices; for classification, the lack of a learning database. We used belief function theory to fuse, and we follow the three steps of the combination process: modelling, combination and decision. Our frame of discernment is 2^\Theta, with \Theta := \{q_j, j = 1, \ldots, n\}, where n is the number of classes q_j found by the supervised classifier. In the modelling step, both sources must give their beliefs in the classes. The unsupervised source outputs clusters, for which the classes are unknown; how can the clustering source give its beliefs about the classes? To do so, we look at the similarity between classes and clusters: the larger the similarity, the more the two classifications agree with each other. Similarity is generally measured with a distance, but if we try to measure a distance between a cluster and a class we face the difficult problem of choosing the best distance. We chose instead to look at the overlap between clusters and classes: the more objects they have in common, the more similar they are. For the supervised source, we used the probabilistic model of Appriou; only singletons are of interest to us. In the combination phase, we adopted the conjunctive rule, which works on the intersections of the elements of the frame of discernment. At the end, we must decide to which class each object belongs. The decision is made with a criterion; we decide with the pignistic criterion, which is a compromise between credibility and plausibility. To summarize, the process is as follows:

Step 1: Modelling. Masses are computed for both the supervised and the unsupervised source.

Clustering (unsupervised source): we look at the proportions, in each cluster, of the classes q_1, \ldots, q_n found by the supervised classifier [14],[13]. Let x \in C_i, with c the number of clusters found. The mass function for an object x to be in the class q_j is:

    m_{ns}(q_j) = \frac{|C_i \cap q_j|}{|C_i|}    (16)

where |C_i| is the number of elements in the cluster C_i and |C_i \cap q_j| the number of elements in the intersection of C_i and q_j. Then we discount the mass functions, for all A \in 2^\Theta, by:

    m_{ns}^{\alpha_i}(A) = \alpha_i \, m_{ns}(A)    (17)

    m_{ns}^{\alpha_i}(\Theta) = 1 - \alpha_i \, (1 - m_{ns}(\Theta))    (18)

The discounting coefficient \alpha_i depends on the objects: we cannot discount all the objects in the same way. An object situated at the center of a cluster is considered more representative of that cluster than one situated at the border, for example. The coefficient \alpha_i is defined as (v_i being the center of the cluster C_i):

    \alpha_i = e^{-\|x - v_i\|^2}    (19)
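Below is a minimal sketch of this unsupervised-source modelling (Eqs. 16-19). It assumes the supervised labels of the objects falling in the same cluster as x are available as an array; the function and variable names are ours, for illustration only.

```python
import numpy as np

def unsupervised_mass(x, center, member_labels, classes):
    """Discounted mass function built from the cluster containing x (Eqs. 16-19).

    x             : feature vector of the object to classify
    center        : centroid v_i of its cluster C_i
    member_labels : supervised labels q_j of the objects assigned to C_i
    classes       : the n class labels q_1..q_n (the frame Theta)
    """
    member_labels = np.asarray(member_labels)
    theta = frozenset(classes)
    # Eq. 16: proportion of each supervised label inside the cluster
    m = {frozenset({q}): np.mean(member_labels == q) for q in classes}
    # Eq. 19: discounting coefficient, close to 1 near the centre of the cluster
    alpha = np.exp(-np.linalg.norm(np.asarray(x) - np.asarray(center)) ** 2)
    # Eqs. 17-18: classical discounting, the removed mass goes to Theta
    m = {A: alpha * v for A, v in m.items()}
    m[theta] = m.get(theta, 0.0) + (1.0 - alpha)
    return m

# Example: a cluster dominated by class q1, object x close to the centroid.
print(unsupervised_mass(x=[1.0, 2.0], center=[1.1, 1.9],
                        member_labels=["q1", "q1", "q2", "q1"],
                        classes=["q1", "q2", "q3"]))
```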
Classification (supervised source): we used the probabilistic model of Appriou. For each class q_j, the mass function of an object x_k is:

    m_s^j(q_j)(x_k) = \frac{\alpha_{ij} \, R_s \, p(q_i \mid q_j)}{1 + R_s \, p(q_i \mid q_j)}    (20)

    m_s^j(q_j^c)(x_k) = \frac{\alpha_{ij}}{1 + R_s \, p(q_i \mid q_j)}    (21)

    m_s^j(\Theta)(x_k) = 1 - \alpha_{ij}    (22)

where q_i is the real class and \alpha_{ij} is the reliability coefficient of the supervised classification for the class q_j. The conditional probabilities are computed from confusion matrices on the learning database:

    \alpha_{ij} = \max p(q_i \mid q_j), \quad i \in \{1, \ldots, n\}    (23)

    R_s = \left( \max_{q_l} p(q_i \mid q_l) \right)^{-1}, \quad i, l \in \{1, \ldots, n\}    (24)
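A sketch of this supervised-source modelling (Eqs. 20-24) is given below, assuming the conditional probabilities p(q_i | q_j) are read from a confusion matrix estimated on the learning set; the indexing convention and the names are our assumptions for illustration, not the paper's code. Each object thus receives n mass functions, one per class, which the mechanism then combines (Step 1 below combines them cluster by cluster).

```python
import numpy as np

def appriou_masses(conf, i, classes):
    """One Appriou mass function per class q_j for an object of real class q_i (Eqs. 20-24).

    conf    : confusion matrix on the learning set, conf[i, j] ~ p(q_i | q_j)
    i       : index of the real (observed) class q_i of the object
    classes : list of the n class labels
    """
    theta = frozenset(classes)
    masses = []
    for j, q_j in enumerate(classes):
        p = conf[i, j]                         # p(q_i | q_j)
        alpha = conf[:, j].max()               # Eq. 23: reliability for class q_j
        R = 1.0 / conf[i, :].max()             # Eq. 24: normalisation factor
        masses.append({
            frozenset({q_j}): alpha * R * p / (1.0 + R * p),          # Eq. 20
            frozenset(set(classes) - {q_j}): alpha / (1.0 + R * p),   # Eq. 21
            theta: 1.0 - alpha,                                       # Eq. 22
        })
    return masses
```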
Step 2: Combination, using the conjunctive rule (4).

Step 3: Decision, using the pignistic criterion (15).

Three improvements are targeted in the present paper: noise, missing (uncertain) data and the lack of a learning database; in the previous work we had supposed that the data were correct. To this end, we introduce some modifications to the previous mechanism. To compute the masses for the supervised source, we keep Appriou's model (20), (21), (22). For the unsupervised source, we follow the next steps:

Step 1: For each cluster C_i, we combine the supervised masses of the objects belonging to it by the conjunctive rule:

    m_i(A) := \Big( \bigcap_{x_k \in C_i} m_s(\cdot)(x_k) \Big)(A), \quad \forall A \in 2^\Theta    (25)

This gives an idea of the proportions of the labels present in a cluster: which class is the majority one and which ones are minorities.

Step 2: We obtain c masses for each element A \in 2^\Theta, with c the number of clusters obtained, and we combine them by the conjunctive rule. We can thus see how much the two classifications agree with each other: the closer the masses are to 1, the less they are in conflict. Before combining, we discount the masses using a reliability coefficient denoted deg\_net_{ik}:

    m_{ns}^k(A) := \Big( \bigcap_{s=1,\ldots,c} m_s^{deg\_net_{ik}}(\cdot)(x_k) \Big)(A), \quad \forall x_k \in C_i,\ A \in 2^\Theta    (26)

We thus obtain the belief in the elements of the frame of discernment. deg\_net_{ik} is a measure of the neatness of the object x_k relative to the cluster C_i. An object x_k may be clear or ambiguous for a given cluster: if it lies at (or near) the center of a cluster, it is considered a very good representative of that cluster and we can affirm that it belongs to that cluster only; if it is situated at the border between two or more clusters, it cannot be considered a clear member of a single cluster, it is ambiguous and may belong to more than one group. The computation of deg\_net_{ik} takes two factors into account: the degree of membership to the cluster C_i, and the maximal degree of overlapping in the current partition, denoted S_{max}, which is the maximal similarity in the partition found by the clustering.

    deg\_net_{ik} = 1 - deg\_overl_i    (27)

deg\_overl_i is the overlapping degree for the cluster C_i. It is computed as follows:

    deg\_overl_i = (1 - \mu_{ik}) \, S_{max}    (28)

The degree of neatness is the complement to 1 of the degree of overlapping. It is composed of two terms: the first one, (1 - \mu_{ik}), measures the degree of non-membership of the point x_k to the cluster C_i; the second one accounts for the overlapping aspect.
S_{max} measures the maximal overlapping in the partition. It is computed as follows:

    S_{max} := \max \big( S(C_i, C_j) \big)    (29)

The clusters C_i and C_j are considered as fuzzy sets, not hard ones:

    S(C_i, C_j) = \max_{x_k \in X} \big( \min(\mu_{C_i}(x_k), \mu_{C_j}(x_k)) \big)    (30)

This similarity measure is not based on a distance measure, because of its limits: we can find two clusters separated by the same distance but which are not separable in the same way. It is based instead on membership degrees. We look for the degree of co-relation between two groups, i.e. the minimum level of co-relation that is guaranteed. The measure satisfies the following properties:

Property 1: S(C_i, C_j) is the maximal degree of co-relation between the two clusters.
Property 2: The similarity degree is bounded: 0 \leq S(C_i, C_j) \leq 1.
Property 3: If C_i = C_j then S(C_i, C_j) = 1, and if C_i \cap C_j = \emptyset then S(C_i, C_j) = 0.
Property 4: The measure is commutative: S(C_i, C_j) = S(C_j, C_i).

For example, if S(C_i, C_j) = 0.4, the two clusters are similar (in relation) with a minimum degree of 0.4, and they are disconnected with a degree of 0.6. Hence:

    S_{max} := \max \Big( \max_{x_k \in X} \big( \min(\mu_{C_i}(x_k), \mu_{C_j}(x_k)) \big) \Big)    (31)

    deg\_net_{ik} := 1 - (1 - \mu_{ik}) \, S_{max}    (32)

The degree of membership of an object x_k to a cluster C_i is calculated as follows:

    \mu_{ik} := \left[ \sum_{l=1}^{c} \left( \frac{\|x_k - v_i\|}{\|x_k - v_l\|} \right)^{2/(m-1)} \right]^{-1}, \quad i = 1, \ldots, c;\ k = 1, \ldots, n_1    (33)

where v_i is the center of the cluster C_i and n_1 is the number of objects. For the combination phase, we use the cautious rule (5): the sources are not totally independent, because the computation of the masses for the unsupervised source is based on the classes given by the supervised source, so we cannot claim that they are independent. At the end, we decide using the pignistic probability; we are only interested in the singletons, i.e. the labels given by the classification. Figure 1 summarizes the fusion process.

[Figure 1. Fusion mechanism]
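The degree-of-neatness computation (Eqs. 28-33) can be sketched as follows, assuming an FCM-style membership matrix; this is our illustration with made-up variable names, not the authors' code.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0):
    """FCM membership degrees mu[i, k] of object k in cluster i (Eq. 33)."""
    d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2)   # (c, n)
    d = np.fmax(d, 1e-12)                                             # avoid division by zero
    ratio = (d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0))      # (c, c, n)
    return 1.0 / ratio.sum(axis=1)                                    # (c, n)

def neatness(u):
    """Degree of neatness deg_net[i, k] (Eqs. 28-32) from memberships u of shape (c, n)."""
    c = u.shape[0]
    s_max = 0.0
    for i in range(c):                       # Eqs. 29-31: maximal fuzzy overlap S_max
        for j in range(i + 1, c):
            s_max = max(s_max, float(np.minimum(u[i], u[j]).max()))
    return 1.0 - (1.0 - u) * s_max           # Eq. 32

# Two well-separated clusters and one ambiguous point in between:
# neatness is close to 1 for clear points and lower near the border.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [2.5, 2.4]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
print(neatness(fcm_memberships(X, centers)))
```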
4. RESULT AND ANALYSIS
In this section, we present the results obtained with our fusion approach between supervised and unsupervised classification. We conduct our experimental study on different generic databases from the UCI Machine Learning repository; in the future, we intend to use real databases such as medical or sonar images. First, we ran experiments on the data without any change; second, we edited the data and removed some information to create missing values; third, we injected noise at different rates and used a small learning database (10%). The aim is to demonstrate the performance of the proposed method and the influence of the fusion on the classification results in a noisy environment and with missing data. The experiments use three unsupervised methods, Fuzzy C-Means (FCM), K-Means and the mixture model, and seven supervised methods: K-Nearest Neighbors, credal K-Nearest Neighbors, Bayes, decision tree, neural network, SVM and credal neural network. Table 2 shows the classification rates obtained before and after fusion with the new mechanism. The data shown are: Iris, Abalone, Breast-cancer, Car, Wine, Sensor-readings24 and Cmc. The rates before fusion are those obtained with the supervised methods alone (K-Nearest Neighbors, credal K-Nearest Neighbors, Bayes, decision tree, neural network, SVM and credal neural network).
The learning rate is equal to 10%. Table 3 shows the classification rates obtained before and after fusion with the new mechanism for missing data. The data shown are: Iris, Abalone, Wine, Sensor-readings24 and Cmc. The rates before fusion are those obtained with the supervised methods alone (K-Nearest Neighbors, credal K-Nearest Neighbors, Bayes, decision tree, SVM and credal neural network); the learning rate is equal to 10%. Tables 4, 5, 6, 7, 8 and 9 show the classification rates obtained before and after fusion with the new mechanism in a very noisy environment. We vary the noise levels and show the results obtained with the levels 55%, 65% and 70%, respectively, for IRIS, Abalone, Yeast, Wine, Sensor-readings4 and Sensor-readings2.
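For reproducibility, the experimental protocol described above (10% learning sample, noise injection, missing values) might be set up as in the following sketch; the noise mechanism, dataset handling and names are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np
from sklearn.model_selection import train_test_split

def corrupt(X, noise_rate=0.55, missing_rate=0.0, seed=0):
    """Perturb a fraction of the objects with Gaussian noise and blank out some values."""
    rng = np.random.default_rng(seed)
    X = X.astype(float).copy()
    noisy = rng.random(len(X)) < noise_rate                    # objects to perturb
    X[noisy] += rng.normal(scale=X.std(axis=0), size=X[noisy].shape)
    if missing_rate > 0:                                       # missing-data setting
        X[rng.random(X.shape) < missing_rate] = np.nan
    return X

# Example protocol on a UCI dataset (X, y assumed already loaded, e.g. Iris):
# X_train, X_test, y_train, y_test = train_test_split(
#     X, y, train_size=0.10, stratify=y, random_state=0)       # 10% learning sample
# X_test_noisy = corrupt(X_test, noise_rate=0.55)              # 55% noise level
```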
4.1. Experimentation
The number of clusters may be set equal to the number of classes given by the supervised classification or fixed by the user. The tests conducted for the three noise levels are independent, meaning that they were not run in the same iteration of the program. In the following, we present the data (Table 1) and the results obtained (Tables 2, 3, 4, 5, 6, 7, 8 and 9).

Table 1. Data characteristics (NbA: number of attributes, NbC: number of classes, NbCl: number of clusters tested)

Data                NbA   NbC   NbCl
Iris                  5     3      3
Abalone               8     2      2
Breast-cancer        11     3      3
Car                   6     4      4
Wine                 13     3      3
Sensor-readings24     5     4      4
Sensor-readings2      2     4      4
Sensor-readings4      4     4      4
Yeast                 8    10     10
Cmc                   9     3      3

4.2. Discussion
Looking at the results shown in Table 2, we note the following for each dataset:

1. Iris: the performance obtained after fusion is 100%, except for the decision tree and the neural network, for which there is no improvement; their classification rate stays at approximately 66%.

2. Abalone: the performance obtained after fusion is better than before fusion, except for the decision tree, for which there is no improvement (the classification rate is 31.28%). The best result is obtained for KNN with the mixture model: 97.58%.

3. Breast-cancer: the performance obtained after fusion is 100% (KNN, Bayes, decision tree, neural network, credal KNN), except for SVM and the credal neural network, whose classification rate is approximately 65%.

4. Car: the classification rate after fusion is better in most cases: 100% (KNN and credal KNN), 96% (Bayes) and 92% (decision tree). For SVM, the neural network and the credal neural network, the performance after fusion is lower than before fusion, at about 70%.

5. Wine: the classification rates obtained after fusion are 100% (KNN, Bayes, decision tree, neural network, credal KNN), 73% for the credal neural network and approximately 40% for SVM.

6. Sensor-readings24: the classification rates obtained after fusion are 100% (KNN, Bayes, decision tree, credal KNN, SVM) and 99% for the neural network.
Table 2. Classification rates (%) obtained before and after fusion

Method                                  Iris    Abalone  Breast-cancer   Car     Wine   Sensor-readings24   Cmc
KNN                                     90.37    50.73       57.87       83.67   67.50        75.36        48.38
KNN + FCM                              100       97.21      100         100     100          100          100
KNN + K-Means                          100       78.13      100         100     100          100          100
KNN + Mixture model                    100       97.58      100         100     100          100          100
Bayes                                   94.81    50.65       94.91       76.53   89.38        61.61        47.40
Bayes + FCM                            100       62.89      100          96.27  100          100          100
Bayes + K-Means                        100       62.92      100          96.27  100          100          100
Bayes + Mixture model                  100       63.39      100          96.27  100          100          100
Decision tree                           66.67    31.28       93.32       74.92   64.38        94.13        37.06
Decision tree + FCM                     66.67    31.28      100          92.22  100          100          100
Decision tree + K-Means                 66.67    31.28      100          92.22  100          100          100
Decision tree + Mixture model           66.67    31.28      100          92.22  100          100          100
Neural network                          64.44    53.02       95.23       70.10   63.13        72.10        39.17
Neural network + FCM                    66.67    79.04      100          70.03  100           99.76        65.28
Neural network + K-Means                66.67    83.51      100          70.03  100           99.31        65.28
Neural network + Mixture model          66.67    72.44      100          70.03  100           99.63        65.28
Credal KNN                              94.81    49.88       60.25       82.57   74.38        75.82        44.15
Credal KNN + FCM                       100       56.90      100         100     100          100          100
Credal KNN + K-Means                   100       57.62      100         100     100          100          100
Credal KNN + Mixture model             100       55.60      100         100     100          100          100
SVM                                     93.33    52.86       65.50       70.35   39.38        52.77        43.09
SVM + FCM                              100       65.02       65.50       70.10   38.75       100           54.87
SVM + K-Means                          100       66.37       65.50       70.10   40.00       100           55.55
SVM + Mixture model                    100       66.45       65.50       70.10   33.13       100           61.43
Credal neural network                   96.30    53.31       65.66       73.70   66.88        64.01        45.96
Credal neural network + FCM            100       62.52       65.66       70.03   73.13       100           99.25
Credal neural network + K-Means        100       60.52       65.66       70.03   73.13       100           99.17
Credal neural network + Mixture model  100       57.81       65.66       70.03   73.13       100           99.77

7. Cmc: we obtain 100% in most cases (KNN, Bayes, decision tree, credal KNN), 99% for the credal neural network, 65%