Journal: STATISTICA SINICA
Volume 22, Issue 3, Pages 909-932
Publisher: STATISTICA SINICA
DOI: 10.5705/ss.2011.036
Keywords
Correlation; feature selection; heavy tail; nonparametric statistics; Studentising; variable selection
Funding
- Australian Research Council
Abstract
Contemporary problems involving sparse, high-dimensional feature selection are becoming rapidly more challenging through substantial increases in dimension. This places ever more stress on methods for analysis, since the effects of even moderately heavy-tailed feature distributions become more significant as the number of features diverges. Data transformations have a significant role to play, reducing noise and enabling an increase in dimension, and for this reason they are increasingly used. In this paper we examine the performance of a typical transformation of this type, and study the extent to which it preserves the main attributes that lead to reliable feature selection. We show both numerically and theoretically that, in the presence of heavy-tailed data, the size of the dimension for which effective variable selection is possible can be increased dramatically, from a low-degree polynomial function of sample size to one that is exponentially large.
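The abstract's setting can be illustrated with a small simulation: rank a large number of heavy-tailed features by their (Studentised) correlation with a response, once on the raw data and once after a bounded transformation. The rank transform used below is only one example of such a transformation, chosen for simplicity; the paper's exact transformation, sample sizes, and signal strengths are not given here, so all numerical choices in the sketch are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 100, 2000  # illustrative sample size and (large) dimension
# Heavy-tailed features: Student's t with 2 degrees of freedom.
X = rng.standard_t(df=2, size=(n, p))
# Sparse signal: the response depends on the first 5 features only.
beta = np.zeros(p)
beta[:5] = 2.0
y = X @ beta + rng.standard_normal(n)

def studentised_corr(X, y):
    """Correlation of y with each column of X after Studentising
    (centring and scaling) both sides; this equals the Pearson
    correlation, written out explicitly for clarity."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    ys = (y - y.mean()) / y.std(ddof=1)
    return Xs.T @ ys / (len(ys) - 1)

def rank_transform(A):
    """Map each column to its ranks scaled into [0, 1] -- a bounded
    transformation that tames heavy tails before correlating."""
    return np.argsort(np.argsort(A, axis=0), axis=0) / (A.shape[0] - 1)

# Feature scores on raw and on rank-transformed data.
scores_raw = np.abs(studentised_corr(X, y))
scores_rank = np.abs(studentised_corr(rank_transform(X),
                                      rank_transform(y[:, None]).ravel()))

top_raw = set(np.argsort(scores_raw)[-5:])
top_rank = set(np.argsort(scores_rank)[-5:])
print("true signals among top 5 (raw data):  ", len(top_raw & set(range(5))))
print("true signals among top 5 (transformed):", len(top_rank & set(range(5))))
```

Comparing how many of the five true signal features land in each top-5 list gives a feel for the abstract's claim that a suitable transformation preserves, and can substantially improve, the reliability of correlation-based feature selection under heavy tails.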