Outlier Mining Based on Neighbor-Density-Deviation with Minimum Hyper-Sphere
Outlier mining aims to find objects whose behavior deviates from the rest of the dataset or does not follow its common patterns. This paper introduces a density definition based on the minimum hyper-sphere and proposes an outlier mining algorithm based on neighbor-density-deviation. First, the local space-density of an object is defined using the minimum hyper-sphere. Second, the nearest neighbor sequence (NNS) of the object is established from the distances between the object and its neighbors. Given the space-density and the NNS of the object, the neighbor-density-deviation (NDD) over the NNS is calculated from the sum of the density differences between the object and its neighbors. Finally, the neighbor-density-deviation-based outlier factor (NDDOF) is obtained to indicate the degree to which the object is an outlier. To evaluate the proposed definition of space-density and the effectiveness and performance of the NDDOF algorithm, we run experiments on one synthetic dataset and three real UCI datasets. The results indicate that the space-density is meaningful and that the NDDOF algorithm achieves higher quality in outlier mining.
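The four steps above can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything concrete here is an assumption made for illustration: the function name `nddof_sketch`, the parameter `k`, taking the "minimum hyper-sphere" to be the smallest sphere centered at an object that contains its k nearest neighbors, a density of k divided by the radius raised to the dimension, and a final score that normalizes the summed absolute density differences by the object's own density. It should be read as a stand-in for the idea, not as the paper's NDDOF definition.

```python
import numpy as np

def nddof_sketch(X, k=3):
    """Illustrative neighbor-density-deviation score (assumed formulas)."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape

    # Pairwise Euclidean distances; an object is never its own neighbor.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)

    # Nearest neighbor sequence (NNS): the k nearest neighbors of each
    # object, ordered by increasing distance.
    nns = np.argsort(dist, axis=1)[:, :k]

    # Assumed space-density: k objects inside the smallest sphere centered
    # at the object that holds its NNS (constant volume factor dropped).
    radius = dist[np.arange(n), nns[:, -1]]
    density = k / np.maximum(radius, 1e-12) ** d

    # Assumed NDD: sum of absolute density differences over the NNS.
    ndd = np.abs(density[nns] - density[:, None]).sum(axis=1)

    # Assumed outlier factor: NDD relative to the object's own density.
    return ndd / (k * density)

# A 3x3 grid of inliers plus one distant point (index 9).
grid = [(i, j) for i in range(3) for j in range(3)]
X = np.array(grid + [(10, 10)], dtype=float)
scores = nddof_sketch(X, k=3)
print(int(np.argmax(scores)))
```

Under these assumptions the distant point receives by far the largest score, because its own density is very low while every object in its NNS lies in the dense grid.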