Article

Online Extreme Learning Machine with Hybrid Sampling Strategy for Sequential Imbalanced Data

Journal

COGNITIVE COMPUTATION
Volume 9, Issue 6, Pages 780-800

Publisher

SPRINGER
DOI: 10.1007/s12559-017-9504-2

Keywords

Online sequential extreme learning machine; Imbalance problem; Principal curve; Leave-one-out cross validation

Funding

  1. National Natural Science Foundation of China [61572399, U1204609]
  2. China Postdoctoral Science Foundation [2016T90944]
  3. University Science and Technology Innovation in Henan Province [15HASTIT022]
  4. University Young Core Instructor in Henan Province [2014GGJS046]
  5. Foundation of Henan Normal University for Excellent Young Teachers [14YQ007]
  6. Major Science and Technology Foundation in Guangdong Province of China [2015B010104002]
  7. Key Scientific Research Foundation of Henan Provincial University [16A520015, 15A520078]

Abstract

In real applications of cognitive computation, data with imbalanced classes are often collected sequentially. In this situation, many current machine learning algorithms, e.g., the support vector machine, yield weak classification performance, especially on the minority class. To solve this problem, a new hybrid sampling online extreme learning machine (ELM) for sequential imbalanced data is proposed in this paper. The key idea is to keep the majority and minority classes balanced while preserving the sequential distribution characteristics of the original data. The method comprises two stages. At the offline stage, we introduce the principal curve to build confidence regions of the minority and majority classes, respectively. Based on these two confidence zones, over-sampling of the minority class and under-sampling of the majority class are both conducted to generate new synthetic samples, and the initial ELM model is then established. At the online stage, we first select the most valuable of the synthetic majority-class samples according to sample importance. Afterwards, a new online fast leave-one-out cross validation (LOO CV) algorithm utilizing Cholesky decomposition is proposed to determine whether the ELM network weights should be updated. We also prove theoretically that the information loss of the proposed method is bounded above. Experimental results on seven UCI datasets and one real-world air pollutant forecasting dataset show that, compared with ELM, OS-ELM, meta-cognitive OS-ELM, and OS-ELM with the SMOTE strategy, the proposed method can simultaneously improve the classification performance of the minority and majority classes in terms of accuracy, G-mean value, and ROC curve. In conclusion, the proposed hybrid sampling online extreme learning machine can be effectively applied to the sequential data imbalance problem with better generalization performance and numerical stability.
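To make the building blocks of the abstract concrete, the following is a minimal sketch (not the authors' implementation) of a basic ELM, whose output weights are obtained by a Moore-Penrose least-squares solve over a random hidden layer, together with the G-mean metric used in the evaluation. The paper's contributions, hybrid sampling, the online update rule, and the Cholesky-based LOO CV, sit on top of this core and are not reproduced here; all function names below are illustrative.

```python
import numpy as np

def elm_train(X, T, n_hidden=50, seed=0):
    """Train a basic extreme learning machine: random, fixed input weights
    and biases, with output weights solved in closed form by least squares.
    Illustrative sketch only; the paper adds hybrid sampling and online
    sequential updates on top of this."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights
    b = rng.standard_normal(n_hidden)                # random hidden biases
    H = np.tanh(X @ W + b)                           # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                     # Moore-Penrose solution
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Forward pass through the fixed hidden layer and learned output weights."""
    return np.tanh(X @ W + b) @ beta

def g_mean(y_true, y_pred):
    """Geometric mean of the per-class accuracies (binary 0/1 labels),
    the imbalance-aware metric reported in the paper's experiments."""
    acc_pos = np.mean(y_pred[y_true == 1] == 1)
    acc_neg = np.mean(y_pred[y_true == 0] == 0)
    return np.sqrt(acc_pos * acc_neg)
```

Because both the input weights and biases are random and only `beta` is learned, training reduces to a single pseudo-inverse, which is what makes fast online variants such as OS-ELM and the LOO CV shortcut via Cholesky decomposition feasible.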
