Article
Computer Science, Interdisciplinary Applications
Wei Zhong, Jiping Wang, Xiaolin Chen
Summary: Feature screening is essential for ultrahigh-dimensional data analysis. This paper proposes a new model-free marginal feature screening approach for survival data with right censoring, based on a censored mean-variance index. The method is robust to model misspecification and can identify important covariates, both categorical and continuous. Simulations and a real data example demonstrate that the proposed approach has competitive performance.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2021)
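Several of the screening papers collected here share a common recipe: compute a marginal utility for each covariate and keep the top-ranked ones. The sketch below illustrates that generic recipe only; it uses absolute Pearson correlation as a stand-in utility (the papers above replace it with model-free indices such as a censored mean-variance index), and the threshold choice `n / log(n)` is one common convention, not a prescription from any specific paper.

```python
import numpy as np

def marginal_screen(X, y, d=None):
    """Rank covariates by a marginal utility and keep the top d.

    Absolute Pearson correlation is a stand-in utility here; model-free
    screening methods substitute indices such as a censored mean-variance
    index or distance correlation.
    """
    n, p = X.shape
    if d is None:
        d = int(n / np.log(n))  # a common screening threshold
    # marginal utility of each covariate with the response
    utility = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(p)])
    # indices of the d covariates with the largest utility
    return np.argsort(utility)[::-1][:d]

# toy example: only the first two of 1000 covariates matter
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1000))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(size=200)
keep = marginal_screen(X, y)
```

The point of the "sure screening" theory in these papers is that, under regularity conditions, the retained set `keep` contains all truly active covariates with probability tending to one even when `p` grows exponentially in `n`.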
Article
Statistics & Probability
Shuiyun Lu, Xiaolin Chen, Hong Wang
Summary: This article introduces a new conditional feature screening procedure for ultra-high dimensional survival data using conditional distance correlation. It is model-free and robust to heavy tails or extreme values in both covariates and response. Simulation studies and analysis of real data illustrate the advantages of the proposed approach over existing methods.
COMMUNICATIONS IN STATISTICS-THEORY AND METHODS
(2021)
Article
Statistics & Probability
Daojiang He, Jinjiao Cheng, Kai Xu
Summary: This article proposes a kernel-based method for feature screening in ultrahigh-dimensional data. The method demonstrates sure screening and rank consistency properties under weak assumptions. Furthermore, it shows that statistics generated by kernels in the distance kernel family are more sensitive for feature screening in ultrahigh dimensions.
JOURNAL OF STATISTICAL PLANNING AND INFERENCE
(2023)
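Distance-correlation-type statistics, which underlie the kernel-based and conditional screening entries above, can be computed directly from pairwise distance matrices. The following is a minimal sketch of the standard sample distance correlation for univariate samples; it is an illustration of the general statistic, not the specific kernel family or conditional variant studied in these papers.

```python
import numpy as np

def distance_correlation(x, y):
    """Sample distance correlation of two 1-D samples.

    dCor is zero (in the population) iff the variables are independent,
    which is why covariates can be ranked by their distance correlation
    with the response in model-free screening.
    """
    x = np.asarray(x, float)[:, None]
    y = np.asarray(y, float)[:, None]
    a = np.abs(x - x.T)  # pairwise distance matrices
    b = np.abs(y - y.T)
    # double centering: subtract row/column means, add back the grand mean
    A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()
    B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
    dcov2 = (A * B).mean()
    dvar_x, dvar_y = (A * A).mean(), (B * B).mean()
    return np.sqrt(dcov2 / np.sqrt(dvar_x * dvar_y))
```

Unlike Pearson correlation, the statistic also detects monotone and non-monotone nonlinear dependence, which is what makes it attractive for the model-free screening procedures summarized here.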
Article
Automation & Control Systems
Sumit Mukherjee, Subhabrata Sen
Summary: We study high-dimensional Bayesian linear regression with product priors. Using the theory of non-linear large deviations, we derive sufficient conditions for the leading-order correctness of the naive mean-field approximation to the log-normalizing constant of the posterior distribution. Assuming a true linear model for the observed data, we derive a limiting infinite-dimensional variational formula for the log-normalizing constant. Further, under an additional separation condition, we establish that the variational problem has a unique optimizer, which governs the probabilistic properties of the posterior distribution. We provide intuitive sufficient conditions for the validity of this separation condition and illustrate the results with concrete examples.
JOURNAL OF MACHINE LEARNING RESEARCH
(2022)
Article
Mathematics, Interdisciplinary Applications
Kai Xu, Xudong Huang
Summary: This paper proposes a new sure independence screening procedure for high-dimensional survival data based on censored quantile correlation, which is robust against outliers and capable of discovering the nonlinear relationship between variables. Simulation results demonstrate its competitive performance on survival datasets with high-dimensional predictors.
JOURNAL OF SYSTEMS SCIENCE & COMPLEXITY
(2021)
Article
Statistics & Probability
Zhe Fei, Qi Zheng, Hyokyoung G. Hong, Yi Li
Summary: This study proposes a novel method within the framework of global censored quantile regression to draw inference on the effects of high-dimensional predictors. The method investigates covariate-response associations over an interval of quantile levels and properly quantifies the uncertainty of the estimates.
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
(2023)
Article
Automation & Control Systems
Yuqin Xu, Quan Qian
Summary: This study addresses the combinatorial explosion in the SISSO (sure independence screening and sparsifying operator) method by using the mRMR (minimum-redundancy maximum-relevance) algorithm, improving both efficiency and accuracy. Experimental results demonstrate that the mutual-information-based SISSO method significantly reduces time consumption while keeping the error close to that of the original SISSO model.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2022)
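The mRMR criterion referenced in the entry above greedily adds, at each step, the feature whose mutual information with the target is largest after subtracting its average mutual information with the already-selected features. Here is a minimal sketch for discrete features using plug-in mutual-information estimates; the data and variable names are purely illustrative, and this is the generic criterion, not the paper's SISSO integration.

```python
import numpy as np
from collections import Counter

def mutual_information(a, b):
    """Plug-in mutual information of two discrete sequences (in nats)."""
    n = len(a)
    pa, pb = Counter(a), Counter(b)
    pab = Counter(zip(a, b))
    return sum(c / n * np.log((c / n) / (pa[x] / n * pb[y] / n))
               for (x, y), c in pab.items())

def mrmr(X, y, k):
    """Greedy mRMR: maximize relevance MI(f, y) minus mean redundancy
    with the features already chosen."""
    p = X.shape[1]
    selected, remaining = [], list(range(p))
    while len(selected) < k:
        def score(j):
            rel = mutual_information(X[:, j], y)
            red = (np.mean([mutual_information(X[:, j], X[:, s])
                            for s in selected])
                   if selected else 0.0)
            return rel - red
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# toy example: feature 0 is a copy of the label, features 1-2 are noise
rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=100)
X = np.column_stack([y, rng.integers(0, 2, 100), rng.integers(0, 3, 100)])
picked = mrmr(X, y, 2)
```

The redundancy penalty is what distinguishes mRMR from pure relevance ranking: a duplicate of an already-selected feature scores near zero even though its marginal relevance is maximal.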
Article
Statistics & Probability
Abhik Ghosh, Erica Ponzi, Torkjel Sandanger, Magne Thoresen
Summary: In this paper, we discuss a new robust variable screening procedure for ultrahigh-dimensional GLMs based on the minimum density power divergence estimator (MDPDE). Our proposed method performs well under both pure and contaminated data scenarios. We also provide the theoretical motivation and proofs for the use of marginal MDPDEs, as well as the derivation of a reliable conditional screening method for GLMs.
SCANDINAVIAN JOURNAL OF STATISTICS
(2023)
Article
Statistics & Probability
Runze Li, Kai Xu, Yeqing Zhou, Liping Zhu
Summary: In this article, we propose a novel test based on an aggregation of the marginal cumulative covariances to accommodate heteroscedasticity and high dimensionality in high-dimensional data. Our proposed test statistic is scale-invariant, tuning-free, and easy to implement, and its asymptotic normality under the null hypothesis is established. We find that our proposed test is much more powerful than existing competitors for covariates with heterogeneous variances, even under high-dimensional linear models, while maintaining high efficiency for homoscedastic covariates.
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
(2023)
Article
Mathematics, Applied
Peng Lai, Mingyue Wang, Fengli Song, Yanqiu Zhou
Summary: Linear discriminant analysis (LDA) is widely used in discriminant classification and pattern recognition. However, it fails when dealing with high- or ultrahigh-dimensional data. To address this, a feature screening procedure based on Fisher's linear projection and a marginal score test is proposed. The procedure ensures that important features are retained and irrelevant predictors are eliminated. Monte Carlo simulation studies and a real-life data example are used to assess its finite-sample properties.
Article
Statistics & Probability
Pierre C. Bellec, Cun-Hui Zhang
Summary: This paper introduces a second-order Stein formula to characterize the variance of random variables for functions with square integrable gradient, demonstrating its usefulness in various applications. Additionally, it presents statistical applications such as SURE estimation, confidence intervals, and upper bounds on model selection variance.
ANNALS OF STATISTICS
(2021)
Article
Statistics & Probability
Di He, Yong Zhou, Hui Zou
Summary: This article systematically studies variable screening methods for multi-response data, proposing a new model-free screening method called mRCC. The sure screening property of mRCC is established under weak regularity conditions, and extensive numerical experiments demonstrate its superior performance over other available alternatives.
Article
Mathematics
Hamza Daoudi, Zouaoui Chikr Elmezouar, Fatimah Alshahrani
Summary: This paper investigates the asymptotic properties of conditional functional parameters for an explanatory variable taking values in an infinite-dimensional Hilbert space and a response variable, within a quasi-associated dependency framework. The non-parametric estimation of the conditional distribution function is studied using the kernel method under quasi-associated dependence, and the almost complete convergence of the estimator in the associated case is established under general hypotheses. The conditional hazard function is then estimated from two components: the conditional distribution function and the conditional density. The asymptotic normality of the kernel estimator is established, and the asymptotic variance is given explicitly. Simulation studies examine the finite-sample behavior of these asymptotic properties.
Article
Economics
Qinqin Hu, Lu Lin
Summary: A new feature screening tool and a two-stage regularization framework are proposed to tackle high dimensionality and endogeneity, with ranking consistency established even when the number of predictors grows exponentially. Simulation studies support the effectiveness of the proposed method.
COMPUTATIONAL ECONOMICS
(2022)
Article
Computer Science, Artificial Intelligence
Kexuan Li, Fangfang Wang, Lingli Yang, Ruiqi Liu
Summary: In this paper, a novel two-step nonparametric approach called Deep Feature Screening (DeepFS) is proposed to address the challenges in applying traditional statistical feature selection methods to high-dimension, low-sample-size data. DeepFS combines the strengths of deep neural networks and feature screening, and it is model-free, distribution-free, and capable of recovering the original input data. Extensive simulation studies and real data analyses demonstrate the superiority of DeepFS.