Article

Scalable Approximate FRNN-OWA Classification

Journal

IEEE TRANSACTIONS ON FUZZY SYSTEMS
Volume 28, Issue 5, Pages 929-938

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TFUZZ.2019.2949769

Keywords

Open wireless architecture; Rough sets; Training; Time complexity; Approximation algorithms; Machine learning algorithms; Fuzzy systems; Big data applications; classification algorithms; fuzzy rough sets; nearest neighbor searches; scalability

Funding

  1. Odysseus programme of the Research Foundation-Flanders (FWO)
  2. Research Foundation-Flanders (FWO) [170303/12X1619N]

Fuzzy rough nearest neighbor classification with ordered weighted averaging operators (FRNN-OWA) is an algorithm that classifies unseen instances according to their membership in the fuzzy upper and lower approximations of the decision classes. Previous research has shown that the use of OWA operators increases the robustness of this model. However, calculating membership in an approximation requires a nearest neighbor search. In practice, the query time complexity of exact nearest neighbor search algorithms in more than a handful of dimensions is near linear, which limits the scalability of FRNN-OWA. Therefore, we propose approximate FRNN-OWA, a modified model that calculates upper and lower approximations of decision classes using the approximate nearest neighbors returned by hierarchical navigable small worlds (HNSW), a recent approximate nearest neighbor search algorithm with logarithmic query time complexity at constant near-100% accuracy. We demonstrate that approximate FRNN-OWA is sufficiently robust to match the classification accuracy of exact FRNN-OWA while scaling much more efficiently. We test four parameter configurations of HNSW and evaluate their performance by measuring classification accuracy and construction and query times for samples of various sizes from three large datasets. We find that with two of the parameter configurations, approximate FRNN-OWA achieves near-identical accuracy to exact FRNN-OWA for most sample sizes, with query times that are up to several orders of magnitude faster.
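
The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every name in it is hypothetical. It assumes features scaled to [0, 1], a similarity of 1 minus the mean absolute difference, linearly decreasing OWA weights, and class scores taken as the mean of the upper and lower approximation memberships. A brute-force sort stands in for the nearest neighbor search; in the approximate model, that step would instead be served by an HNSW index.

```python
from typing import Dict, Hashable, List, Sequence


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity: 1 minus the mean absolute difference (features in [0, 1])."""
    return 1.0 - sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def linear_weights(k: int) -> List[float]:
    """Assumed OWA weights: k, k-1, ..., 1, normalised to sum to 1."""
    total = k * (k + 1) / 2
    return [(k - i) / total for i in range(k)]


def owa_soft_max(values: List[float], k: int) -> float:
    """Weighted mean of the k largest values, largest weight on the largest value."""
    top = sorted(values, reverse=True)[:k]
    return sum(w * v for w, v in zip(linear_weights(len(top)), top))


def owa_soft_min(values: List[float], k: int) -> float:
    """Weighted mean of the k smallest values, largest weight on the smallest value."""
    bottom = sorted(values)[:k]
    return sum(w * v for w, v in zip(linear_weights(len(bottom)), bottom))


def frnn_owa_scores(X, y, query, k: int = 3) -> Dict[Hashable, float]:
    """Score each class by the mean of its upper and lower approximation memberships."""
    scores = {}
    for label in set(y):
        # Upper approximation: soft maximum of similarity to instances of the class.
        same = [similarity(x, query) for x, lab in zip(X, y) if lab == label]
        # Lower approximation: soft minimum of dissimilarity to all other instances.
        other = [1.0 - similarity(x, query) for x, lab in zip(X, y) if lab != label]
        upper = owa_soft_max(same, k)
        lower = owa_soft_min(other, k) if other else 1.0
        scores[label] = (upper + lower) / 2
    return scores


def classify(X, y, query, k: int = 3):
    """Return the class with the highest score."""
    scores = frnn_owa_scores(X, y, query, k)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    X = [[0.0, 0.0], [0.1, 0.1], [0.9, 0.9], [1.0, 1.0]]
    y = ["a", "a", "b", "b"]
    print(classify(X, y, [0.05, 0.05]))
```

Only the k most similar instances inside each class and the k most similar instances outside it contribute to a score, which is why the cost of the neighbor search dominates query time in this sketch.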

Recommended

Article Computer Science, Artificial Intelligence

Fuzzy extensions of the dominance-based rough set approach

Marko Palangetic, Chris Cornelis, Salvatore Greco, Roman Slowinski

Summary: This paper extends the fuzzy dominance-based rough set approach (DRSA) and explores the application of Ordered Weighted Average (OWA) operators. The theoretical properties of hybridizing OWA operators with fuzzy DRSA are examined, and the robustness of the standard fuzzy DRSA approach is experimentally compared with the OWA approach.

INTERNATIONAL JOURNAL OF APPROXIMATE REASONING (2021)

Article Computer Science, Theory & Methods

Granular representation of OWA-based fuzzy rough sets

Marko Palangetic, Chris Cornelis, Salvatore Greco, Roman Slowinski

Summary: This paper discusses the importance of granular representations of crisp and fuzzy sets in rule induction algorithms based on rough set theory. It demonstrates that the OWA-based fuzzy rough set model, which has been successfully applied in various machine learning tasks, allows for a granular representation. The practical implications of this result for rule induction from fuzzy rough approximations are highlighted.

FUZZY SETS AND SYSTEMS (2022)

Article Computer Science, Artificial Intelligence

Average Localised Proximity: A new data descriptor with good default one-class classification performance

Oliver Urs Lenz, Daniel Peralta, Chris Cornelis

Summary: The study addresses the challenge of setting hyperparameters in one-class classification. It introduces a new data descriptor, ALP, and evaluates it on a large collection of datasets. ALP outperforms the other descriptors, making it a good default choice.

PATTERN RECOGNITION (2021)

Article Computer Science, Artificial Intelligence

SCMFTS: Scalable and Distributed Complexity Measures and Features for Univariate and Multivariate Time Series in Big Data Environments

Francisco J. Baldan, Daniel Peralta, Yvan Saeys, Jose M. Benitez

Summary: Time series data is increasingly important in Big Data environments, with a lack of tools for time series processing identified as a challenge. A new approach based on time series features has shown significant progress in addressing time series problems, and has demonstrated outstanding performance on the latest datasets.

INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS (2021)

Article Computer Science, Artificial Intelligence

Optimised one-class classification performance

Oliver Urs Lenz, Daniel Peralta, Chris Cornelis

Summary: The study provides a thorough treatment of one-class classification with hyperparameter optimisation for five data descriptors. Experimental results show that the recent Malherbe-Powell proposal optimises the hyperparameters of all data descriptors most efficiently.

MACHINE LEARNING (2022)

Article Computer Science, Artificial Intelligence

Choquet-based fuzzy rough sets

Adnan Theerens, Oliver Urs Lenz, Chris Cornelis

Summary: Fuzzy rough set theory is a useful tool for dealing with inconsistent data in machine learning applications. The ordered weighted average (OWA) based fuzzy rough sets provide a solution to the problem of sensitivity to outliers in classical fuzzy rough sets. By extending it to Choquet-based fuzzy rough sets (CFRS), more flexibility and robustness can be achieved, including seamless integration of outlier detection algorithms.

INTERNATIONAL JOURNAL OF APPROXIMATE REASONING (2022)

Article Engineering, Electrical & Electronic

Time-Series-Based Feature Selection and Clustering for Equine Activity Recognition Using Accelerometers

Timo De Waele, Adnan Shahid, Daniel Peralta, Anniek Eerdekens, Margot Deruyck, Frank A. M. Tuyttens, Eli De Poorter

Summary: To track the activities and performance of horses, inertial measurement units (IMUs) combined with machine learning algorithms are commonly used. A data-efficient algorithm is proposed that requires only 3 minutes of labeled calibration data. This algorithm achieved 95% accuracy on datasets captured with leg-mounted and neck-mounted IMUs. However, when the algorithm was calibrated on multiple horses and evaluated on unfamiliar horses, classification accuracy dropped by 15%.

IEEE SENSORS JOURNAL (2023)

Proceedings Paper Computer Science, Artificial Intelligence

A study on the calibration of fingerprint classifiers

Daniel Peralta, Lin Tang, Maxim Lippeveld, Yvan Saeys

Summary: This paper studies the problem of overconfident predictions in fingerprint classification and proposes calibration methods and a modified search strategy to address it. Experimental results show that Dirichlet calibration can improve predicted class probabilities and reduce penetration rate while maintaining a good balance in performance.

2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Attribute Reduction Using Functional Dependency Relations in Rough Set Theory

Mauricio Restrepo, Chris Cornelis

Summary: The paper introduces functional dependency relations defined on the attribute set of an information system, and establishes basic relationships between functional dependency relations, attribute reduction, and closure operators. It demonstrates that reducts of an information system can be obtained from the maximal elements of a functional dependency relation using the partial order for dependencies.

ROUGH SETS (IJCRS 2021) (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Fuzzy-Rough Nearest Neighbour Approaches for Emotion Detection in Tweets

Olha Kaminska, Chris Cornelis, Veronique Hoste

Summary: Social media provide meaningful data for tasks like sentiment analysis and emotion recognition, which are often solved using deep learning methods. Due to the fuzzy nature of textual data, using classification methods based on fuzzy rough sets is considered. An approach for the SemEval-2018 emotion detection task using FRNN-OWA models and different text embedding methods achieved competitive results against more complex deep learning methods.

ROUGH SETS (IJCRS 2021) (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Adapting Fuzzy Rough Sets for Classification with Missing Values

Oliver Urs Lenz, Daniel Peralta, Chris Cornelis

Summary: The paper proposes using interval-valued fuzzy sets to model concepts in datasets with missing values, expressing uncertainty through optimistic and pessimistic approximations. In a small experiment, this approach outperforms simple imputation methods such as mean and mode imputation on datasets with low missing value rates.

ROUGH SETS (IJCRS 2021) (2021)