TELK OMNIKA T elecommunication, Computing, Electr onics and Contr ol V ol. 18, No. 4, August 2020, pp. 1884 1891 ISSN: 1693-6930, accredited First Grade by K emenristekdikti, No: 21/E/KPT/2018 DOI: 10.12928/TELK OMNIKA.v18i4.13858 r 1884 Spatial association disco v ery pr ocess using fr equent subgraph mining Gio v anni Dai ´ an Rottoli 1 , Her n ´ an Merlino 2 1 Uni v ersidad Nacional de La Plata, Ar gentina 1 Uni v ersidad T ecnologica Nacional, Ar gentina 1,2 Information Systems Research Group, National Uni v ersity of Lan ´ us, Buenos Aires Article Inf o Article history: Recei v ed Aug 10, 2019 Re vised Mar 10, 2020 Accepted Apr 3, 2020 K eyw ords: Frequent subgraph mining SARM Spatial association mining Spatial data mining Spatial kno wledge disco v ery ABSTRA CT Spatial associations are one of the most rele v ant kinds of patterns used by b usiness intelligence re g arding spatial data. Due to the characteristics of thi s particular type of information, dif ferent approaches ha v e been proposed for spatial association mining. This wide v ariety of methods has entailed the need for a process to inte grate the ac- ti vities for association disco v ery , one that is easy to implement and fle xible enough to be adapted to an y particular situation, particularly for small and medium-size projects to guide the useful pattern disco v ery process. Thus, this w ork proposes an adaptable kno wledge disco v ery process that uses graph theory to model dif ferent spatial rela- tionships from multiple scenarios, and frequent subgraph mining to disco v er spatial associations. A proof of concept is presented using real data. This is an open access article under the CC BY -SA license . Corresponding A uthor: Gio v anni Dai ´ an Rottoli, Departamento de Ingenier ´ ıa en Sistemas de Informaci ´ on, Uni v ersidad T ecnol ´ ogica Nacional, F .R. Concepci ´ on del Uruguay , 676 Ing. Pereira Street, Concepci ´ ıon del Uruguay (3260), Entre R ´ ıos, Ar gentina. Email: rottolig@frcu.utn.edu.ar 1. INTR ODUCTION Spatial kno wledge di sco v er y aims to find usef u l and no v el patterns in spatial datasets to support decision-making in a particular problem domain [1]. Among all the possible patterns to disco v er , spatial asso- ciations are one of the most commonly us ed today in multiple fields such as climatology , geograph y , geology , criminology and ecology , among man y others. The y are comprised of predicates that in v olv e spatial objects along with spatial and non-spat ial relationships between those objects [2]. There are man y challenges associ- ated with the characteris tics of spatial data that mak e this data mining task more complicated, such as the spatial dependenc y data attrib utes, the multiplicity of spatial data representation models, the spatial relations between data objects and some particular spatial properties s uch as spatial autocorrelation and spatial heterogeneity [3]. Multiple algorithms ha v e been de v eloped for association pattern mining that can be used. Each of these algorithms, in general, aims to solv e particular concerns about the aforemetioned challenges. The se- lection of a proper algorithm has become an arduous acti vity due to the gro wing number of ne w alternati v es and their v ariants, specially to ine xperienced users. Thus, it is necessary to pro vide a ne w process for small or medium-size application domains, one that is easy to implement and fle xible enough to be adapted to multiple conte xts. Consequently , this paper proposes a ne w process for association mining disco v ery from spatial data J ournal homepage: http://journal.uad.ac.id/inde x.php/TELK OMNIKA Evaluation Warning : The document was created with Spire.PDF for Python.
TELK OMNIKA T elecommun Comput El Control r 1885 that utilizes graph theory to model spatial objects and the relations between them and frequent subgraph mining to find the substructures with a high repetition rate inside the general graph. These substructures correspond to association patterns. The proposal is a ne w alternati v e to model comple x sit u a tions from a particular problem domain, b ut not replace or impro v e results from the algorithms in the state-of-the-art, ho we v er it pro vides a road map to initially address a problem. The rest of this paper is arranged as follo ws: section 2. on the charac- teristics of spatial data; section 3. contains association patterns and their characteristics re g arding spatial data; section 4. includes the proposed process for disco v ery of spatial associations; a proof of concept using real w orld data is sho wn in section 5. Lastly , section 6. contains conclusions and future w orks . 2. SP A TIAL D A T A Spatial data is a particular type of dependent data. F ormally , a spatial database D is a set of spati al records D = f ~ T 1 ; ~ T 2 ; ; ~ T d g with T i = f S 1 i ; S 2 i ; ; S m i ; X 1 i ; X 2 i ; ; X n i g , where each S k i is a spatial attrib ute that stores v alues about the spatial conte xts, and each X l i is a non-spatial attrib utes with v alues mea- sured at particular locations [3, 4]. The non-spatial attrib utes may be nu m erical or cate gorical according to the problem domain and the spatial attrib utes may be specified as coordinates or places (e.g. city name or state code). Additionally , there are three basic types of spatial objects: points, used to model specific punctual locations in the space; lines, used to model linear e xtensions such as ri v ers or roads; and polygons, used to represent objects that ha v e a tw o-dimensional e xtension in the space, such as re gions or states. The dependence of non-spatial attrib utes on spatial ones means that dif ferent implicit spatial relati ons can be e xtracted from data. Let D be a spatial database, a relation R D 2 is called spatial if and only if it is defined through a binary predicate P ( x; y ) j x; y 2 D that in v olv es the spatial attrib utes from the spatial objects x and y . F or e xample, the spatial relation N D 2 , with x; y 2 D , defined by the predicate sho wn in (2), is the neighborhood relations between tw o spatial points using euclidean distance: xN y ( ) D ist ( x; y ) < ; 2 R + These relations can be classified as g eometric , if the y are related to the principles of euclidean geom- etry (e.g. neighbouring relationships); dir ectional , when the y refer t o relati v e spatial orientations (e.g. abo v e, belo w , north, east); topolo gical , if the y are independent from the concepts of distance and direction and are not af fected by spatial transformations such as rotation or translation (e.g. intersect, inside), or hybrid , if the y are related to tw o or more of the aforementioned types of properties. These relationships can be calculated using dif ferent methods depending on the problem domain and the class of spatial data used: points, lines or polygons [5, 6]. On the other hand, tw o properties are deri v ed from spatial dependence: spatial autocorrelation, i.e., observ ations of spatially distrib uted random v ariables are not location-independent, and spatial heterogeneity , i.e., patterns found in some re gion of the space may not ha v e the same support in other re gion. Spatial auto- correlation refers to the particularity of spatial data to not be distrib uted independently throughout the space. The distrib ution depends on the characteristics of the data points, the characteristics of the underlying space or the spatial neighboring relationships. F or e xample, churches tends to be located near public squares or animal tends to tra v el to locations that contain their food sources [7]. Spatial heterogeneity is related to spatial auto- correlation. This phenomenon describes the local nature of spatial patterns, which are subordinated to some specific locations. Thus, a spatial pattern, such as association rules, may ha v e a high support v alue in a re gion and a lo w support v alue in a dif fer ent one. This phenomenon is also kno wn as Simpson’ s paradox [8]. All these particular characteristics mak e kno wledge e xtraction from spati al data become a comple x acti vity which not only has to consider patterns between data records, b ut also the implicit relationships between spatial objects. 3. SP A TIAL ASSOCIA TIONS One of the most common patterns to find in data is the association pattern. An association pattern P is defined as an n-ary predicate P = ( p 1 ; p 2 ; ; p n ) with a high probability of occurrence in the dataset. Its classic application is the supermark et bask et analysis to disco v er whether or not there is some correlation between items that are bought together . An association pattern is referred to as spatial if at least one of its atomic predicates p k in v olv es a spatial relationship between its v ariables [2]. F or e xample, in a city C, churches and public squares tend to be neighbors: C ity ( C ) ^ C hur ch ( X ) ^ P ubl ic S q uar e ( Y ) ^ I nside ( X ; C ) ^ I n s ide ( Y ; C ) ^ N eig hbor s ( X ; Y ) As sho wn in the pre vious e xample, I nside ( X ; C ) , I nside ( Y ; C ) and N eig hbor s ( X ; Y ) are spatial Spatial association disco very pr ocess using fr equent subgr aph mining (Gio vanni Dai ´ an Rottoli) Evaluation Warning : The document was created with Spire.PDF for Python.
1886 r ISSN: 1693-6930 predicates related to topological and geometric relationships. Man y dif ferent relations must be tak en into consideration at the same time to find useful spatial associations. Also, these relations must be calculated in local conte xts, due to the aforementioned Simpson’ s P aradox. Multiple ef forts ha v e been made in order to find spatial association patterns in spatial databases: [7] proposes a method for spatial association mining that consider spatial autocorrelation by using a cell structure; [9] focuses on the problem of rule e xtraction from spatial data with crisp condition attrib utes and fuzzy deci- sions. A rough-fuzzy set based rule e xtraction model is used to deal with both fuzziness and roughness ; [10] combines and e xtend techni ques de v eloped in both spati al and fuzz y data mi ning to dea l with the uncertainty found in typical spatial data. This proposal uses fuzzy logic to get rele v ant information from transition areas between spatial neighborhoods to spatial association mining and for spatial relationships modelling; [11, 12] propose an algorithm for local patterns disco v ery considering spatial heterogeneity that incorporates a no v el spatial metric for support e v aluation based on e v ent density in a particular area; [13] presents a specially de- signed algorithm to disco v er spatial associations related to El Ni ˜ no Southern Oscillation (ENSO); [14] applies an algorithm that e xplores multiple spatial objects hierarchies; [15] uses A-Priori-based approaches to find spatial association rules; [6, 16] propose using Induct i v e Logic Programming (ILP) for reach this data mining purpose by modelling and stracting high support spatial relations from spatial data. [17] w ork ed with meta- heuristics s uch as genetic algorithms and e v olutionary programming; [18] suggested a data-transformation approach before using traditional association rule mining algorithms; [19] introduced non-tri vial structures such as graphs for spatial relationship representation; among others. Because of this v ariety of spatial data mining approaches for association disco v ery , it is dif ficult to select a proper algorithm or method to be used in small kno wledge disco v ery application conte xts. Because of this, a unified and general process is required to deal with the aforementioned problems and it has to be fle xible enough to be adapted to multiple particular situations and easy to implement. 4. SP A TIAL ASSOCIA TION DISCO VER Y PR OCESS This w ork describes a ne w process for spatial association e xtraction considering the possibility of ha ving multiple relationships between spatial objects of an y kind (i.e. points, lines, polygons), and considering the spatial autocorrelation and spatial heterogeneity . This process is designed as a first approach to get spatial association kno wledge from data in particular conte xts easy to implement in small or medium-size projects. The process Figure 1 is di vided into 5 main steps: data preparation (section 4.1.), neighborhood definition (section 4.2.), modelling of spatial relationships using graphs (section 4.3 .) , frequent subgraph mining (section 4.4.) and e v aluation of results (section 4.5.). Figure 1. Spatial association disco v ery process 4.1. Data pr eparation The proposed process starts with a s p a tial data preparation step. It is necessary to codify the v arious spatial datasets obtained from dif ferent sources in dif ferent formats, in order to enable t he e xtraction of relations between all the data instances in later steps. In general terms, it is not uncommon to ha v e multiple spatial objects layers, each of them with a particular representation type and related to a particular scenario from the problem domain. On the other hand, tw o types of datasets must be considered: tar get datasets, with objects directly TELK OMNIKA T elecommun Comput El Control, V ol. 18, No. 4, August 2020 : 1884 1891 Evaluation Warning : The document was created with Spire.PDF for Python.
TELK OMNIKA T elecommun Comput El Control r 1887 related to the problem domain that are going to be present in e v ery association pattern, and rele v ant datasets, that may or may not be related to the tar get datasets, b ut add important information that may be useful for decision making [20]. These data must be prepared by cleaning errors, solving inconsistent and null v alues, and dealing with outliers. Ne w attrib utes or e v en ne w data objects could be generated using the input data. This step requires considerable ef fort and may require man y iterations. Thus, it is advisable to implement the process using a proper methodology such as CRISP-DM [21]. 4.2. Neighborhood definition As mentioned before, a parti cular spatial association pattern may ha v e a higher occurrence probabili ty in some re gions and lo wer probability in others [8]. F or this reason it is preferred to search for this kind of pattern locally . F or this, we propose defining partitions of the dataset, called neighborhoods in this conte xt, and the subsequent e x ecution of the association pattern search algorithm on each of them. These neighborhoods can be defined beforehand using kno wledge from to the problem domain, or using spatial clustering techniques. Using density-based or distance-based spatial clustering algorithms [22– 24] is suggested due to the First La w of Geograph y , which stat es that spatial objects located together are more closely related than those that are f ar a w ay from each other [25, 26]. Nonetheless, there is an issue to consider in this step: the limits between neighborhoods may add important information for spatial association mining. Thus, the use of fuzzy clustering techniques or fle xible boundaries models may be desirable. 4.3. Modelling of spatial r elationships using graphs No w , we ha v e to calculate the spatial relations between the tar get data instances and the insta n c es of the rele v ant dataset from each neighborhood. Depending on the problem domai n, dif ferent types of spatial relations can be calculated: euclidian, topological, directional or h ybrid relationships, as mentioned abo v e [6]. This might be a step with a high computational cost. Graph theory is proposed to model t he spatial relationships due to its close relation with first order logic and the pattern to find [16]. Graphs are discrete structures consisting of v ertices and edges that connect these v ertices. There are dif ferent kinds of graphs, depending on whether edges ha v e directions (digraphs), whether multiple edges can connect the same pair of v ertices (multigraphs), and whether loops are allo wed. F ormally , a simple gr aph G = (V , E) consists of V , a nonempty set of v ertices (or nodes) and E, a set of edges. Each edge has tw o v ertices associated with it, called its endpoints. An edge is said to connect its endpoints. T o relate each edge to its endpoints, a function : E ! f v 1 2 V ; v 2 2 V g , called incidence function, is used. A multigraph, on the other hand, is a graph where multiple edges can e xist associated with the same endpoints. Additionally , each v erte x and each edge can be labeled with data related to the represented object. This structure can be adapted to multiple scenarios and multiple ef ficient algorithms can be used to e xtract v aluable information such as maximum cliques [27]. In the conte xt of this w ork, multigraphs are used to model spatial objects as v ertices and the rel ations between them as edges. A small e xample can be seen in Figure 2 (a). T w o sets of labels and tw o e xtra functions to asign those labels to the v ertices and edges are needed. So, let G be a multigraph without loops G = ( V ; E ; L; K ; ; l ; k ) where: V is the v erte x set of G , which corresponds to the spatial objects from the datasets; E is the edge set that corresponds to each calculated relationship between the spatial objects; L is the v erte x label set with the characteristics of the spatial data objects; K is the edge label set, with the charac teristics of each spatial relation; : E ! f x 2 P ( V ) = j x j 2 g is the incidence function; l V L and k E K are labeling relations. The aforementioned structure mak es it possible to model multiple dif ferent relationships with the same endpoints labeled with dif ferent attrib utes. Also, man y attrib utes of spatial objects could be tak en into consideration. Additionally , it must be noted that loops (i.e. edges with only one endpoint) are not considered because t heir lack of semantics in this conte xt (there are not spatial relationships that in v olv es only one spatial object). Fuzzy logic could also be a v aluable tool to model the spatial relationships, if the situation requires it [10]. More information about fuzzy logic this can be found in [28] 4.4. Fr equent subgraph mining T o e xtract spatial associations wit h a high probability of occurrence, frequent subgraph mining is pro- posed to be used for each modeled graph. Gi v en a multigraph G = ( V ; E ; L; K ; ; l ; k ) lik e the one described in the pre vious section, the frequent subgraph mining problem in a single multigraph is finding recurring subgraph Spatial association disco very pr ocess using fr equent subgr aph mining (Gio vanni Dai ´ an Rottoli) Evaluation Warning : The document was created with Spire.PDF for Python.
1888 r ISSN: 1693-6930 G i G , or in other w ords, a subgraph that has multiple instances in the original graph Figure 2 (b). It must be noted that tw o graphs are isomorphic if all of their v ertices and edges are shared including its labels.These frequent subgraphs represent the rela tionships between spatial object types that tak e place in the space with a high occurrence probability . Multiple algorithms ha v e been designed for frequent subgraph mining in a single big graph, calcu- lating the rele v ance of a pattern in dif ferent w ays. Some well-kno wn e xamples of this are IncGM+, FSSG, SUBDUE, among others [29, 30]. A set of frequent subgraphs for each neighborhood is obtained as a result of this step and must be analyzed to obtain useful kno wledge for decision-making. 4.5. Ev aluation of r esults In the final step, frequent subgraphs translated into n-ary predicates that represent tri vial information (non-no v el patterns) must be filtered. The support and confidence measures can be e xtracted, selecting the metrics that the desicion-mak er consider to be more appropiate. This acti vity could be performed automatically or manually by an analyst with kno wledge about the problem domain with help from an e xpert. P ubl icS q uar eX 1 C ity A C h ur chY 1 I ncl ude I n c l ude N eig hbor s (a) (b) Figure 2. (a) Simplified e xample of spatial relationship modelling using a simple graph. (b) Example of frequent subgraph (bottom) found in a simple graph without labels in the edges. 5. PR OOF OF CONCEPT The proof of concept presented in this section is intended to sho w ho w the proposed process w orks, implemented by dif ferent programming and data mining tools. The data used in this e xample consists of 10 data files containing the location of f acilities in Buenos Aires (Ar gentina) and its surroundings. These f acilities include libraries(74), clinics(63), post of fices(55), sports halls(50), nightclubs(41), schools(107), g as stations (97), churches (125), museums(37) or police stations (93). F or each of them, in the preparation step of the proposed process, the data files were inte grated into a single data file of spatial points using QGis (http://qgis.or g/). Each spatial point is comprised of tw o spatial attrib utes, Latitude and Longitude, and one non-spatial attrib ute, the type of b uilding from the pre vious list. After that, only the points that are located outside Buenos Aires limits were filtered to reduce the search space, lea ving 742 spatial points Figure 3 (a), (orange). Then, in the neighborhood definition step, the HDDBSCAN clustering algorithm [31] from the dbscan’ library from R programming language w as used on the spatial data attrib utes to generate tw o neighborhoods with a minimum number of points equal to 50 in each of them Figure 3 (a), (blue). Only tw o neighborhoods were used because of e xplanatory purposes. In the ne xt step, for each of the generated neighborhoods, a geome tric relationship between their data points w as e xtracted forming a graph with v ertices labeled wit h the type of f acility related to each data point and edges labeled with the sentence ”close to” if the adjacent points were less than 150 meters a w ay from each other (this v alue w as selected for illustrati v e purposes only). Thus, tw o graphs were created: one with 71 v ertices and 45 edges in neighborhood 1, and another with 15 v ertices and 11 edges in neighborhood 2. T o ob t ain the frequent subgraphs of each of the generated graphs, SUBDUE algorithm w as used via its implementation in Subdue Graph Miner Softw are, using the compression rate as support measure. TELK OMNIKA T elecommun Comput El Control, V ol. 18, No. 4, August 2020 : 1884 1891 Evaluation Warning : The document was created with Spire.PDF for Python.
TELK OMNIKA T elecommun Comput El Control r 1889 The result w as a subgraph as sho wn in Figure 3 (b), with a compression rate of 15.5% in neighborhood 1, which w as translated into the predicate Post of fice ( x 1 ) ^ Nightclub ( x 2 ) ^ Close to ( x 1 ; x 2 ) and tw o sub- graphs in neighborhood 2 , both with a compression rate of 27.2% that w as translated into the predicates Clinic ( x 1 ) ^ Post of fice ( x 2 ) ^ Close to ( x 1 ; x 2 ) Post of fice ( x 1 ) ^ Sport hall ( x 2 ) ^ Close to ( x 1 ; x 2 ) (a) (b) Figure 3. (a) Spatial neighborhoods generated for the proof of concept using HDBSCAN algorithm; (b) Results of the proof of concept. 5.1. Discussion The contrib utions of the proposed process are, firstly , the possibility of adapting it to multiple s cenar - ios, due to its fle xible underlying struc ture being based on graphs. Some of the aforementioned methods use fle xible structures too [6, 16] b ut the comple xity of these methods increases because of the use of techniques based on Logic Programming. On the other hand, some other methods do not tak e int o account comple x pat- terns [19]. Furthermore, the possibility of including v aluable information related to the data objects and the spatial relations by using labels in the graph representation is also considered. Generally , the data structures in v olv ed do not tak e into account comple x data associated to the spatial relations between spatial data. In relation to the abo v e, t he proposed process consi d e rs spatial phenomena such as autocorrelation and heterogeneity , by using spatial neighborhoods. Some alternati v es such as [7] considering spatial autocorrelation b ut not considering spatial heterogeneity or comple x data relationships. In most of the cases studied, these characteristics are present due to their rele v ance in data mining. Also, related to t his, the proposed process allo ws its implementation by using e xisting tools such as frequent subgraph mining algorithms and clustering algorithms. Some of the stat e-of-the-art alternati v es include v ery fle xible and po werful strate gies, b ut implementation is hard, making them not suitable for appli- cation in small or medium size projects [6, 9, 16, 19]. Lastly , the high adaptability of the procedure is a desired characteristic due to the possibility of selecting among man y algorithms for the implementation of each step. Usually , the state-of-the-art methods propose a single alternati v e for its e x ecution. 6. CONCLUSION This w ork describes a kno wledge disco v ery process called for e xtraction of spatial associat ions. The process is fle xible enough to tak e into account multiple and v aried spatial relationships between spatial objects of an y kind, using a graph structure to model them. Heterogeneity and autocorrelation phenomena are also considered, defining neighborhoods where the search process is performed t o find this class of re gularity . The solution w as designed to initially approach to this data mining task without w orrying too much about par - ticular characteristics of data mining algorithms. In a lar ge-scale project, this process could guide the selection of specific methods based on the results obtained in first iterations of an incremental methodology . A proof of concept is presented as well, using real data to illustrate ho w the process is implemented using dif ferent programming and data mining tools in each of the proposed steps. Spatial association disco very pr ocess using fr equent subgr aph mining (Gio vanni Dai ´ an Rottoli) Evaluation Warning : The document was created with Spire.PDF for Python.
1890 r ISSN: 1693-6930 In future w orks, the research will be focused on implementation strate gies according to the problem domain for each of the steps of the process, in order to decrease computational e x ecution time when dealing with lar ge amounts of spatial objects and spatial relationships. Also, fuzzy methods will be considered for relation modelling and neighborhood definition. A CKNO WLEDGEMENT The research presented in this paper w as partially funded by the PhD Scholarship Program to reinforce R&D&I areas (2016-2020) of the Uni v ersidad T ecnol ´ ogica Nacional and the Research Project 80020160400001 LA of Nat ional Uni v ersity of Lan ´ us. The authors also w ant to e xtend their gratitude to K e vin-Mark Bozell Poudereux, for proofreading the translation. REFERENCES [1] R. Garcia-Martinez, P . Britos, and D. Rodriguez, “Information mining processes based on intelligent sys- tems, International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, pp. 402-410, 2013. [2] K. K operski and J. Han, “Disco v ery of spatial associati on rules in geographic information databases, International Symposium on Spatial Databases, pp. 47–66, 1995. [3] Y . Leung et al., ”Kno wledge disco v ery in spatial data, Springer , 2010. [4] C. C. Agg arw al, ”Data mining: The te xtbook, Springer , 2015. [5] R. Agra w al, et al., “F ast algorithms for mining association rules, Proc. 20th int. conf. v ery lar ge data bases, v ol. 1215, pp. 487-499, 1994. [6] A. Appic e, M. Ceci, A. Lanza, F . A. Lisi, and D. Malerba, “Disco v ery of spatial association rules in geo-referenced census data: A relational mining approach, Intelligent Data Analysis, v ol. 7, no. 6, pp. 541-566, 2003. [7] J. Chen, An algorithm about association rule mining based on spatial autocorrelation, The International Archi v es of the Photogrammetry , Remote Sensing and Spatial Information Sciences, v ol. 37, no. B6b, pp. 99-106, 2008. [8] E. H. Simpson, “The interpret ation of interaction in contingenc y tables, Journal of the Ro yal Statistical Society . Series B (Methodological), v ol. 13, no. 2, pp. 238–241, 1951. [9] H. Bai, Y . Ge, J. W ang, D. Li, Y . Liao, and X. Zheng, A method for e xtracting rules from spatial data based on rough fuzzy sets, Kno wledge-Based Systems, v ol. 57, pp. 28-40, 2014. [10] R. Ladner , F . E. Petry , and M. A. Cobb, “Fuzzy set approaches to spatial data mining of association rules, T ransactions in GIS, v ol. 7, no. 1, pp. 123-138, 2003. [11] Z. Sha and X. Li, “Mining local association patterns from spatial dataset, Se v enth International Confer - ence on Fuzzy Systems and Kno wledge Disco v ery (FSKD), v ol. 3, pp. 1455-1460, 2010. [12] Z. Sha, X. T an, and Y . Bai, “Localized spatial association: A case study for understanding v e getation successions in a typical grassland ecosystem, Geo-Informatics in Resource Management and Susta inable Ecosystem, pp. 33-45, 2015. [13] X. Cunjin and L. Xiaohan, “No v el algorithm for mining ENSO-oriented marine spatial association pat- terns from raster -formatted datasets, ISPRS International Journal of Ge o - Information, v ol. 6, no. 5, pp. 1-15, 2017. [14] A.Salleband and C.Vrain, Anapplication of associat ion rules disco v ery to geographic information sys- tems, European Conference on Principles of Data, pp. 613-618, 2000. [15] S. S. U. Sutjipto, I. S. Sitangg ang, and B. Barus, “Potential usage estimation of ground w ater using spatial association rule mining, TELK OMNIKA T elecommunication, Computing, Electronics and Control, v ol. 15, no. 1, pp. 504-511, 2017. [16] D. Malerba, F . Es posito, F . A. Lisi, and A. App i ce, “Mining spatial association rules in census data, Research in Of ficial Statistics, v ol. 5 no. 1, pp. 19-44, 2003. [17] A. H. Goudarzi and N. Ghadiri, A h ybrid spatial data mining approach based on fuzzy topological rela- tions and moses e v olutionary algorithm, Artificial Intelligence, Cornell Uni v ersity 2017. [18] I. Lee, “Mining multi v ariate associations within gis en vironments, International Conference on Indus- trial, Engineering and Other Applications of Applied Intelligent Systems, pp. 1062-1071, 2004. TELK OMNIKA T elecommun Comput El Control, V ol. 18, No. 4, August 2020 : 1884 1891 Evaluation Warning : The document was created with Spire.PDF for Python.
TELK OMNIKA T elecommun Comput El Control r 1891 [19] H. Y ang, S. P arthasarath y , and S. Mehta, “Mining spatial object associations for sc ientific data, II Inter - national Joint Conference on Artificial Intelligence (IJCAI), pp. 902-907, 2005. [20] V . Bogorn y , P . M. Engel, and L. O. Alv ares, “Geoarm: an interoperable frame w ork to impro v e geographic data preprocessing and spatial association rule mining. SEKE, pp. 79-84, 2006. [21] R. W irth and J . Hipp, “Crisp-dm: T o w ards a standard process model for data mining, Proceedings of the 4th international conference on the practical applications of kno wledge disco v ery and data mining, pp. 29-39, 2000. [22] J. Sander , et al., “Density-based clustering in spatial databases: The algorithm gdbscan and its applica- tions, Data mining and kno wledge disco v ery , v ol. 2, no. 2, pp. 169-194, 1998. [23] Y . Zhu, K. M . T ing, and M. J. Carman, “Density-ratio based clustering for disco v ering clusters with v arying densities, P attern Recognition, v ol. 60, pp. 983-997, 2016. [24] A. Sharma, R. Gupta, and A. T iw ari, “Impro v ed density based spatial clustering of applications of noise clustering algorithm for kno wledge disco v ery in spatial data, Mathematical Problems in Engineering, v ol. 2016, 2016. [25] W . R. T obler , “Cellular geograph y , Philosoph y in geograph y , pp. 379-386, 1979. [26] J. Duan, L. W ang, and X. Hu, “The ef fect of spatial autocorrelation on spatial co-location pattern mining, International Conference on Computer , Information and T elecommunication Systems, pp. 210-214, 2017. [27] G. D. Rottoli, H. Merlino, and R. Garc ıa-Martinez, “Co-location rules disco v ery process focused on ref- erence spatial features using decision tree learning, International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, pp. 221-226, 2017. [28] D. J. Dubois, ”Fuzzy sets and systems: theory and applications, Academic press, v ol. 144, 1980. [29] E. Abdelhamid, M. Canim, M. Sa d oghi , B. Bhattac h a rjee, Y . Chang, and P . Kalnis, “Incremental frequent subgraph mining on lar ge e v olving graphs, IEEE T ransactions on Kno wledge and Data Engineering, v ol. 29, no. 12, pp. 2710-2723, 2017. [30] D. Ka vitha, V . Kamakshi, and J. Murth y , “Finding frequent subgraphs in a single graph based on symme- try , International Journal of Computer Applications, v ol, v ol. 146, no. 11, pp. 0975-8887 2016. [31] R. J. Campello, D. Moula vi, A. Zimek, and J. Sander , “Hierarchical density estimat es for data clustering, visualization, and outlier detection, A CM T ransactions on Kno wledge Disco v ery from Data (TKDD), v ol. 10, no. 1, pp. 5, 2015. BIOGRAPHIES OF A UTHORS Gio v anni Dai ´ an Rottoli is a researcher at the Computational Intelligence and Softw are Engineering Research Group (GIICIS) from the National Uni v ersity of T echnology (Ar gentina). He holds a Bach- elor ´ s De gree in Information Systems from the aforementioned uni v ersity (2015). He is currently a Ph.D. candidate in Computer Science at the National Uni v ersity of La Plata (Ar gentina). He w orks as an associate professor of Discrete Mathematics and Da ta Science at the National Uni v ersity of T echnology (Ar gentina).His research is focused in the fields of spatial data mining and kno wledge disco v ery , artificial intelligence and search-based softw are engineering. Further info can be found on his profile: http://www .frcu.utn.edu.ar/giicis/rottolig/ Her n ´ an Merlino is the head of the Adv anced Information Syste ms Laboratory at Buenos Aires Uni- v ersity (Ar gentina) and the head of the Artificial Intelligence Laboratory at the National Uni v ersity of Lan ´ us (Ar gentina). He is a fello w of the Gas and Petroleum Institute at Buenos Aires Uni v ersity . He holds a Bachelor’ s De gree in Informat ion Systems from the Uni v ersity of Belgrano (Ar gentina), a Master’ s De gree in Softw are Engineering from the Computer Science Department of the Polytechnic Uni v ersity of Madrid (Spain), and a Ph.D. in Information Sciences from the National Uni v ersity of La Plata (Ar gentina ). He w orks as a tenured full professor in graduate and postgraduate courses at Buenos Aires Uni v ersity , Austral Uni v ersity and Na tional Uni v ersity of Lan ´ us. His research inter - ests are: artificial intelligence, data mining, and blockchain technologies. In the professional field, he w orks as a Scientific Res earch Director in an Artificial Intelligence, Data Science, Blockchain & Smart Contracts compan y in Ar gentina. Spatial association disco very pr ocess using fr equent subgr aph mining (Gio vanni Dai ´ an Rottoli) Evaluation Warning : The document was created with Spire.PDF for Python.