Article

EFFECT OF HEAVY TAILS ON ULTRA HIGH DIMENSIONAL VARIABLE RANKING METHODS

Journal

STATISTICA SINICA
Volume 22, Issue 3, Pages 909-932

Publisher

STATISTICA SINICA
DOI: 10.5705/ss.2011.036

Keywords

Correlation; feature selection; heavy tail; nonparametric statistics; Studentising; variable selection

Funding

  1. Australian Research Council

Contemporary problems involving sparse, high-dimensional feature selection are becoming rapidly more challenging through substantial increases in dimension. This places ever more stress on methods for analysis, since the effects of even moderately heavy-tailed feature distributions become more significant as the number of features diverges. Data transformations have a significant role to play, reducing noise and enabling an increase in dimension, and for this reason they are increasingly used. In this paper we examine the performance of a typical transformation of this type, and study the extent to which it preserves the main attributes that lead to reliable feature selection. We show both numerically and theoretically that, in the presence of heavy-tailed data, the size of the dimension for which effective variable selection is possible can be increased dramatically, from a low-degree polynomial function of sample size to one that is exponentially large.
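
As a rough illustration of the kind of transformation the abstract refers to, the sketch below ranks features by a Studentised score (sample mean divided by its estimated standard error) on simulated heavy-tailed data. The abstract does not say which transformation the paper analyses; Studentising is taken here only because it appears among the keywords, and the function names, the Student-t noise model with 3 degrees of freedom, and the sample sizes are all assumptions made for illustration, not the authors' procedure.

```python
import numpy as np


def studentised_scores(X):
    """Per-feature Studentised mean: sqrt(n) * mean / sample standard deviation.

    Hypothetical helper for illustration; X has shape (n_samples, n_features).
    """
    n = X.shape[0]
    mean = X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)  # unbiased sample standard deviation
    return np.sqrt(n) * mean / sd


def rank_features(scores):
    """Return feature indices ordered from largest to smallest absolute score."""
    return np.argsort(-np.abs(scores))


def simulate(n=50, p=2000, k=5, shift=1.5, df=3, seed=0):
    """Simulate heavy-tailed features; the first k columns carry a mean shift.

    All parameter values are assumed for the sketch, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_t(df, size=(n, p))  # Student-t noise has heavy tails
    X[:, :k] += shift
    return X


if __name__ == "__main__":
    X = simulate()
    raw_top = rank_features(X.mean(axis=0))[:5]       # rank by raw means
    stud_top = rank_features(studentised_scores(X))[:5]  # rank by Studentised means
    print("top 5 by raw mean:        ", sorted(raw_top.tolist()))
    print("top 5 by Studentised mean:", sorted(stud_top.tolist()))
```

Comparing how many of the first `k` indices each ranking recovers, over several seeds and increasing values of `p`, gives an informal feel for the effect described above; it is not a substitute for the paper's numerical or theoretical results.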
