Journal
ANNALS OF THE INSTITUTE OF STATISTICAL MATHEMATICS
Volume 74, Issue 3, Pages 435-450
Publisher
SPRINGER HEIDELBERG
DOI: 10.1007/s10463-021-00803-5
Keywords
Massive data; Robustness; Communication efficient; Variable selection
This paper introduces a rank regression algorithm suitable for distributed massive data and proposes a distributed regularized rank regression method that can perform consistent variable selection.
Rank regression is a robust modeling tool, but it is challenging to implement for distributed massive data owing to memory constraints. In practice, massive data may be distributed heterogeneously from machine to machine, so how to incorporate this heterogeneity is also an interesting issue. This paper proposes a distributed rank regression (DR2), which can be implemented on the master machine by solving a weighted least-squares problem and which adapts to heterogeneous data. Theoretically, we prove that the resulting estimator is statistically as efficient as the global rank regression estimator. Furthermore, based on the adaptive LASSO and a newly defined distributed BIC-type tuning parameter selector, we propose a distributed regularized rank regression (DR3), which performs consistent variable selection and can be easily implemented with the LARS algorithm on the master machine. Simulation results and real data analysis are included to validate our method.
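The abstract gives no implementation details for DR2 or DR3, but the robustness of rank regression itself is easy to illustrate. The sketch below, which is illustrative and not from the paper, computes the (non-distributed) Wilcoxon rank regression slope for a single predictor: it minimizes Jaeckel's dispersion, the sum of |(y_i - y_j) - b(x_i - x_j)| over all pairs, whose exact minimizer is a weighted median of pairwise slopes. The function name and data setup are assumptions made for the example.

```python
import numpy as np

def rank_regression_slope(x, y):
    """Wilcoxon rank regression slope for a single predictor.

    Minimizes Jaeckel's dispersion  sum_{i<j} |(y_i - y_j) - b (x_i - x_j)|,
    whose exact solution is the weighted median of the pairwise slopes
    (y_i - y_j) / (x_i - x_j) with weights |x_i - x_j|.
    """
    n = len(x)
    i, j = np.triu_indices(n, k=1)           # all pairs i < j
    dx, dy = x[i] - x[j], y[i] - y[j]
    keep = dx != 0                           # drop pairs with tied x values
    slopes, w = dy[keep] / dx[keep], np.abs(dx[keep])
    # weighted median: sort slopes and find where cumulative weight crosses half
    order = np.argsort(slopes)
    slopes, w = slopes[order], w[order]
    cw = np.cumsum(w)
    return slopes[np.searchsorted(cw, 0.5 * cw[-1])]
```

Because the estimator depends on the data only through pairwise rank comparisons, a single gross outlier in y barely moves the fitted slope, which is the robustness property that motivates extending rank regression to the distributed setting.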