Research on spatial outlier mining algorithm based on distributed computing
Wangjun He and Agen Qiu
Chinese Academy of Surveying and Mapping, China
J Comput Eng Inf Technol
Abstract
Existing spatial outlier mining algorithms cannot meet the needs of large-scale spatial data mining. This paper presents a spatial outlier mining algorithm designed for a distributed system. First, a space-filling curve is used to partition the data set and to speed up the nearest-neighbor search for each target point. Second, the spatial outlier factor is defined using information entropy theory: to account for the differing influence of each attribute of multidimensional data on outlierness, the algorithm automatically calculates a weight for each attribute from the original features of the data. The influence of spatial factors on the outlier factor is defined by inverse distance weighting. Experiments show that the algorithm is much more efficient than the traditional algorithm and that the accuracy of outlier mining exceeds 90 percent.
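The three ingredients named in the abstract can be illustrated with a minimal single-machine sketch. The abstract does not give the formulas, so everything here is an assumption: a Z-order (Morton) curve stands in for the unspecified space-filling curve, the weights follow the standard entropy weight method, and the outlier factor is taken as the weighted absolute deviation of a point from the inverse-distance-weighted mean of its neighbors. The function names `morton_key`, `entropy_weights`, and `outlier_factor` are hypothetical and do not come from the paper.

```python
import math

def morton_key(ix, iy, bits=16):
    """Interleave the bits of two non-negative grid indices (Z-order curve).
    Sorting points by this key keeps spatially close cells close together,
    so contiguous key ranges can serve as partitions (assumed scheme)."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)
        key |= ((iy >> b) & 1) << (2 * b + 1)
    return key

def entropy_weights(rows):
    """Standard entropy weight method (assumed): attributes whose values
    are spread less uniformly carry more information and get more weight.
    Expects at least two rows."""
    n, m = len(rows), len(rows[0])
    divergences = []
    for j in range(m):
        col = [r[j] for r in rows]
        lo, hi = min(col), max(col)
        if hi == lo:
            # A constant attribute cannot distinguish outliers.
            divergences.append(0.0)
            continue
        norm = [(v - lo) / (hi - lo) for v in col]
        total = sum(norm)
        probs = [v / total for v in norm]
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    s = sum(divergences)
    if s == 0:
        return [1.0 / m] * m
    return [d / s for d in divergences]

def outlier_factor(i, neighbours, coords, attrs, weights, eps=1e-12):
    """Weighted absolute deviation of point i from the inverse-distance-
    weighted mean of its neighbours (assumed definition)."""
    idw = [1.0 / (math.dist(coords[i], coords[j]) + eps) for j in neighbours]
    total = sum(idw)
    score = 0.0
    for k, w in enumerate(weights):
        expected = sum(g * attrs[j][k] for g, j in zip(idw, neighbours)) / total
        score += w * abs(attrs[i][k] - expected)
    return score
```

In a distributed setting, one plausible arrangement is to sort points by `morton_key`, assign contiguous key ranges to workers, and compute `outlier_factor` locally within each range. Neighbors that fall across a partition boundary would need separate handling, which this sketch does not attempt.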
Biography
Wangjun He received his Bachelor’s degree from Tongji University in 2009 and his MSc degree in Geographic Information System from the Chinese Academy of Surveying and Mapping in 2012. He is now an Assistant Professor at the Chinese Academy of Surveying and Mapping. He has published more than 8 papers in various journals. His research interests are in the areas of Spatial Data Visualization and Government Geographic Information Service.
Email: Hewj@casm.ac.cn