Article; Proceedings Paper

Half-space mass: a maximally robust and efficient data depth method

Journal

MACHINE LEARNING
Volume 100, Issue 2-3, Pages 677-699

Publisher

SPRINGER
DOI: 10.1007/s10994-015-5524-x

Keywords

Half-space mass; Mass estimation; Data depth; Robustness

Funding

  1. U.S. Air Force Research Laboratory [FA2386-13-1-4043]
  2. JSPS KAKENHI [25240036]
  3. National ICT Australia (NICTA) Machine Learning Collaborative Research Projects
  4. Faculty of Information Technology, Monash University
  5. Grants-in-Aid for Scientific Research [26540116, 25240036] Funding Source: KAKEN

Abstract

Data depth is a statistical method that models a data distribution in terms of center-outward ranking rather than density or linear ranking. Although it has attracted considerable academic interest, its applications are hampered by the lack of a method that is both robust and efficient. This paper introduces Half-Space Mass, a significantly improved version of half-space data depth. As far as we know, Half-Space Mass is the only data depth method that is both robust and efficient. We also reveal four theoretical properties of Half-Space Mass: (i) its resultant mass distribution is concave regardless of the underlying density distribution; (ii) its maximum point is unique and can be considered the median; (iii) the median is maximally robust; and (iv) its estimation extends to a higher-dimensional space in which the convex hull of the dataset occupies zero volume. We demonstrate the power of Half-Space Mass through its applications in two tasks. In anomaly detection, being a maximally robust location estimator leads directly to a robust anomaly detector that yields better detection accuracy than half-space depth, and it runs orders of magnitude faster than L2 depth, an existing maximally robust location estimator. In clustering, the Half-Space Mass version of K-means overcomes three weaknesses of K-means.
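
The abstract does not spell out the estimation procedure, so the following is only a rough sketch of how a half-space mass score might be estimated for anomaly detection. It assumes the score of a point is the average, over many random one-dimensional projections with a random split, of the fraction of subsampled data lying on the same side of the split as that point. The function names and the parameters `n_splits`, `sample_size`, and `scale` are illustrative assumptions, not names taken from the paper.

```python
import numpy as np

def fit_half_space_mass(data, n_splits=500, sample_size=64, scale=1.0, seed=0):
    """Build random half-space splits (assumed procedure, see lead-in)."""
    rng = np.random.default_rng(seed)
    n, d = data.shape
    size = min(sample_size, n)
    # Random unit directions, one per split.
    directions = rng.normal(size=(n_splits, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    splits = np.empty(n_splits)
    mass_left = np.empty(n_splits)
    for i in range(n_splits):
        # Project a small subsample onto the direction.
        subsample = data[rng.choice(n, size=size, replace=False)]
        proj = subsample @ directions[i]
        # Pick a split uniformly in a range centred on the projected midpoint;
        # `scale` > 1 lets splits fall outside the subsample's range.
        mid = 0.5 * (proj.max() + proj.min())
        half_range = 0.5 * scale * (proj.max() - proj.min())
        splits[i] = rng.uniform(mid - half_range, mid + half_range)
        # Fraction of the subsample strictly to the left of the split.
        mass_left[i] = np.mean(proj < splits[i])
    return directions, splits, mass_left

def half_space_mass(model, queries):
    """Average, over all splits, the mass on each query point's side."""
    directions, splits, mass_left = model
    proj = queries @ directions.T          # shape: (n_queries, n_splits)
    on_left = proj < splits
    return np.where(on_left, mass_left, 1.0 - mass_left).mean(axis=1)

# Usage: a central point should receive a higher mass than a distant one.
rng = np.random.default_rng(42)
data = rng.normal(size=(300, 2))
model = fit_half_space_mass(data)
scores = half_space_mass(model, np.array([[0.0, 0.0], [8.0, 8.0]]))
```

Under these assumptions, low scores flag candidate anomalies, and because each split only needs a one-dimensional projection of a small subsample, the cost per split does not grow with the size of the full dataset.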
