Article

Flexible decision tree for data stream classification in the presence of concept change, noise and missing values

Journal

DATA MINING AND KNOWLEDGE DISCOVERY
Volume 19, Issue 1, Pages 95-131

Publisher

SPRINGER
DOI: 10.1007/s10618-009-0130-9

Keywords

Classification learning; Data stream classification; Decision tree learning; Fuzzy learning

In recent years, classification learning for data streams has become an important and active research topic. A major challenge posed by data streams is that their underlying concepts can change over time, which requires current classifiers to be revised accordingly and in a timely manner. To detect concept change, a common methodology is to observe the online classification accuracy: if accuracy drops below some threshold value, a concept change is deemed to have taken place. An implicit assumption behind this methodology is that any drop in classification accuracy can be interpreted as a symptom of concept change. Unfortunately, this assumption is often violated in the real world, where data streams carry noise that can also significantly reduce classification accuracy. To compound the problem, traditional noise-cleansing methods are ill-suited to data streams: they normally need to scan the data multiple times, whereas learning from data streams can afford only a single pass because of the data's high speed and huge volume. Another open problem in data stream classification is how to deal with missing values. When new instances containing missing values arrive, how a learning model should classify them and how it should update itself accordingly remain largely unexplored. To solve these problems, this paper proposes a novel classification algorithm, the flexible decision tree (FlexDT), which extends fuzzy logic to data stream classification. The advantages are three-fold. First, FlexDT offers a flexible structure to handle concept change effectively and efficiently. Second, FlexDT is robust to noise; hence it can prevent noise from interfering with classification accuracy, so that an accuracy drop can be safely attributed to concept change. Third, it deals with missing values in an elegant way.
Extensive evaluations are conducted to compare FlexDT with representative existing data stream classification algorithms using a large suite of data streams and various statistical tests. Experimental results suggest that FlexDT offers a significant benefit to data stream classification in real-world scenarios where concept change, noise and missing values coexist.
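
The accuracy-threshold methodology described in the abstract, and its sensitivity to noise, can be illustrated with a minimal Python sketch. This is not FlexDT and is not taken from the paper: the class `AccuracyDriftMonitor`, the window size of 50, the threshold of 0.75, the fixed toy classifier, and the three synthetic streams are all assumptions chosen purely for illustration.

```python
from collections import deque


class AccuracyDriftMonitor:
    """Hypothetical detector: signals when windowed accuracy falls below a threshold."""

    def __init__(self, window_size=50, threshold=0.75):
        self.window = deque(maxlen=window_size)  # most recent correct/incorrect outcomes
        self.threshold = threshold

    def update(self, predicted, actual):
        """Record one prediction outcome; return True if a change is suspected."""
        self.window.append(predicted == actual)
        if len(self.window) < self.window.maxlen:
            return False  # not enough evidence yet
        return sum(self.window) / len(self.window) < self.threshold


def classify(x):
    """A fixed toy classifier that matches the initial concept exactly."""
    return int(x >= 5)


def first_alarm(stream):
    """Index of the first instance at which the monitor fires, or None."""
    monitor = AccuracyDriftMonitor()
    for i, (x, label) in enumerate(stream):
        if monitor.update(classify(x), label):
            return i
    return None


xs = [i % 10 for i in range(400)]

# Stable concept, no noise: label is 1 exactly when x >= 5.
clean = [(x, int(x >= 5)) for x in xs]

# Genuine concept change: the labelling rule is inverted from instance 200 on.
drifting = clean[:200] + [(x, int(x < 5)) for x in xs[200:]]

# Same concept as `clean`, but every third label is corrupted (label noise).
noisy = [(x, 1 - y) if i % 3 == 0 else (x, y) for i, (x, y) in enumerate(clean)]

print(first_alarm(clean))     # None: accuracy never drops
print(first_alarm(drifting))  # 212: alarm shortly after the real change at 200
print(first_alarm(noisy))     # 49: alarm although the concept never changed
```

The third stream shows the problem the abstract raises: a detector that looks only at accuracy cannot tell label noise from a real concept change, so it raises a false alarm as soon as its window fills.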

Recommended

Article Computer Science, Artificial Intelligence

Two-tier network anomaly detection model: a machine learning approach

Hamed Haddad Pajouh, GholamHossein Dastghaibyfard, Sattar Hashemi

JOURNAL OF INTELLIGENT INFORMATION SYSTEMS (2017)

Article Computer Science, Information Systems

Modelling information diffusion based on non-dominated friends in social networks

Niloofar Mozafari, Ali Hamzeh, Sattar Hashemi

JOURNAL OF INFORMATION SCIENCE (2017)

Article Computer Science, Artificial Intelligence

Visual domain adaptation via transfer feature learning

Jafar Tahmoresnezhad, Sattar Hashemi

KNOWLEDGE AND INFORMATION SYSTEMS (2017)

Article Computer Science, Artificial Intelligence

Online Prediction via Continuous Artificial Prediction Markets

Fatemeh Jahedpari, Talal Rahwan, Sattar Hashemi, Tomasz P. Michalak, Marina De Vos, Julian Padget, Wei Lee Woon

IEEE INTELLIGENT SYSTEMS (2017)

Article Computer Science, Artificial Intelligence

Exploiting kernel-based feature weighting and instance clustering to transfer knowledge across domains

Jafar Tahmoresnezhad, Sattar Hashemi

TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES (2017)

Article Computer Science, Artificial Intelligence

Domain invariant feature extraction against evasion attack

Zeinab Khorshidpour, Jafar Tahmoresnezhad, Sattar Hashemi, Ali Hamzeh

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS (2018)

Article Computer Science, Information Systems

Know Abnormal, Find Evil: Frequent Pattern Mining for Ransomware Threat Hunting and Intelligence

Sajad Homayoun, Ali Dehghantanha, Marzieh Ahmadzadeh, Sattar Hashemi, Raouf Khayami

IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING (2020)

Article Green & Sustainable Science & Technology

A Hybrid clustering and classification technique for forecasting short-term energy consumption

Mehrnoosh Torabi, Sattar Hashemi, Mahmoud Reza Saybani, Shahaboddin Shamshirband, Amir Mosavi

ENVIRONMENTAL PROGRESS & SUSTAINABLE ENERGY (2019)

Article Computer Science, Theory & Methods

DRTHIS: Deep ransomware threat hunting and intelligence system at the fog layer

Sajad Homayoun, Ali Dehghantanha, Marzieh Ahmadzadeh, Sattar Hashemi, Raouf Khayami, Kim-Kwang Raymond Choo, David Ellis Newton

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE (2019)

Article Computer Science, Artificial Intelligence

An entropy-based distance measure for analyzing and detecting metamorphic malware

Esmaeel Radkani, Sattar Hashemi, Alireza Keshavarz-Haddad, Maryam Amir Haeri

APPLIED INTELLIGENCE (2018)

Proceedings Paper Computer Science, Artificial Intelligence

Employing deep learning and sparse representation for data classification

Seyed Mehdi Hazrati Fard, Sattar Hashemi

2017 19TH CSI INTERNATIONAL SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND SIGNAL PROCESSING (AISP) (2017)

Proceedings Paper Computer Science, Theory & Methods

A Deep Super-Vector Based Representation for Clustering

Amir Namavar Jahromi, Sattar Hashemi

2017 9TH INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE TECHNOLOGY (IKT 2017) (2017)

Article Computer Science, Artificial Intelligence

Evaluation of random forest classifier in security domain

Zeinab Khorshidpour, Sattar Hashemi, Ali Hamzeh

APPLIED INTELLIGENCE (2017)

Article Computer Science, Information Systems

Graph embedding as a new approach for unknown malware detection

Hashem Hashemi, Amin Azmoodeh, Ali Hamzeh, Sattar Hashemi

JOURNAL OF COMPUTER VIROLOGY AND HACKING TECHNIQUES (2017)

Article Engineering, Multidisciplinary

DiReT: An effective discriminative dimensionality reduction approach for multi-source transfer learning

J. Tahmoresnezhad, S. Hashemi

SCIENTIA IRANICA (2017)
