Article

Outlier detection on uncertain data based on local information

KNOWLEDGE-BASED SYSTEMS (2013)

Journal

KNOWLEDGE-BASED SYSTEMS

Volume 51, Issue -, Pages 60-71

Publisher

ELSEVIER

DOI: 10.1016/j.knosys.2013.07.005

Keywords

Outlier detection; Uncertain data; Local information

Category

Computer Science, Artificial Intelligence

Abstract

Based on local information: local density and local uncertainty level, a new outlier detection algorithm is designed in this paper to calculate uncertain local outlier factor (ULOF) for each point in an uncertain dataset. In this algorithm, all concepts, definitions and formulations for conventional local outlier detection approach (LOF) are generalized to include uncertainty information. The least squares algorithm on multi-times curve fitting is used to generate an approximate probability density function of distance between two points. An iteration algorithm is proposed to evaluate K-eta-distance and a pruning strategy is adopted to reduce the size of candidate set of nearest-neighbors. The comparison between ULOF algorithm and the state-of-the-art approaches has been made. Results of several experiments on synthetic and real data sets demonstrate the effectiveness of the proposed approach. (C) 2013 Elsevier B.V. All rights reserved.
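For orientation, the conventional LOF score that the abstract says ULOF generalizes can be sketched as follows. This is a minimal illustration of the classical, certain-data LOF only, not the paper's uncertain variant: it contains no probability density fitting, no K-eta-distance, and no pruning. The function names `knn` and `lof_scores` are hypothetical, and the sketch assumes no duplicate points and breaks distance ties by index rather than keeping every tied neighbour.

```python
import math


def knn(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs."""
    dists = sorted(
        (math.dist(points[i], points[j]), j)
        for j in range(len(points))
        if j != i
    )
    # Simplification: keep exactly k neighbours, ties broken by index.
    return dists[:k]


def lof_scores(points, k):
    """Compute the classical local outlier factor of every point."""
    n = len(points)
    neighbours = [knn(points, i, k) for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neighbours]

    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(i, j) = max(k-distance(j), d(i, j)).
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neighbours[i]]
        mean_reach = sum(reach) / len(reach)
        # Assumption: no duplicate points, so mean_reach is positive.
        lrd.append(1.0 / mean_reach)

    # LOF: mean ratio of each neighbour's density to the point's own density.
    scores = []
    for i in range(n):
        ratios = [lrd[j] / lrd[i] for _, j in neighbours[i]]
        scores.append(sum(ratios) / len(ratios))
    return scores


if __name__ == "__main__":
    # Four points on a unit square plus one isolated point.
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
    for point, score in zip(data, lof_scores(data, k=2)):
        # The isolated point (5, 5) receives the largest score.
        print(point, round(score, 3))
```

Scores near 1 indicate a point whose local density matches that of its neighbours, while scores well above 1 indicate a point in a sparser region than its neighbours.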

Recommended

Article Computer Science, Information Systems

Incomplete mixed data-driven outlier detection based on local-global neighborhood information

Ran Li, Hongchang Chen, Shuxin Liu, Xing Li, Yingle Li, Biao Wang

Summary: Outlier detection is a challenging task due to the nature of ubiquitous, incomplete, redundant, noisy, and mixed data. To address this challenge, this paper proposes an ILGNI network that considers both local and global information from incomplete mixed data. The network enhances connectivity between similar objects and weakens connectivity between heterogeneous objects, allowing for efficient graph-based outlier detection. Experiments on telecom fraud datasets demonstrate that the proposed algorithm achieves enhanced outlier detection performance with low time complexity and is applicable to various types of datasets.

INFORMATION SCIENCES (2023)

Article Computer Science, Hardware & Architecture

An enhanced local outlier detection using random walk on grid information graph

Chunyan She, Shaohua Zeng

Summary: A novel local outlier detection method based on grid random walk is proposed in this work, which uses stationary distribution vector to find candidate outliers for improving efficiency. A new local outlier factor is constructed to estimate the abnormal degree of each object, and experimental results show that the proposed algorithm has better performance and lower running time compared to others.

JOURNAL OF SUPERCOMPUTING (2022)

Article Mathematics

Local Correlation Integral Approach for Anomaly Detection Using Functional Data

Jorge R. Sosa Donoso, Miguel Flores, Salvador Naya, Javier Tarrio-Saavedra

Summary: This work presents a methodology for detecting outliers in functional data that considers both their shape and magnitude. The Local Correlation Integral (LOCI) method, a multivariate anomaly detection technique, has been extended and adapted for functional data using distance calculations in Hilbert spaces. The methodology has been validated through simulation studies and application to real data, showing good performance in scenarios with inter-curve dependence, particularly when outliers are due to curve magnitudes. Results are further supported by the successful application of the methodology to a meteorological database, outperforming other competitive methods.

MATHEMATICS (2023)

Article Computer Science, Artificial Intelligence

CELOF: Effective and fast memory efficient local outlier detection in high-dimensional data streams

Liang Chen, Wei Wang, Yun Yang

Summary: This paper introduces a new algorithm called CELOF for real-time outlier detection on data streams, which effectively overcomes two main limitations of traditional algorithms. Experimental results show that the CELOF algorithm has an average improvement of 15% in accuracy and runs in less than 1% of the time of the original LOF, making it widely applicable in various practical scenarios.

APPLIED SOFT COMPUTING (2021)

Article Computer Science, Artificial Intelligence

Attribute-weighted outlier detection for mixed data based on parallel mutual information

Junli Li, Zhanfeng Liu

Summary: Outlier detection plays a crucial role in data mining. However, most existing algorithms focus on either numerical or categorical attributes and neglect the mixture of attributes commonly found in real-world data. In this study, we propose a high-dimensional and massive mixed data outlier detection algorithm called PMIOD, which incorporates attribute weighting using mutual information. We also parallelize the mutual information computation on the Spark platform to improve efficiency. Experimental results on various datasets demonstrate the superior performance of the proposed algorithm.

EXPERT SYSTEMS WITH APPLICATIONS (2024)

Article Computer Science, Artificial Intelligence

Uncertain distance-based outlier detection with arbitrarily shaped data objects

Fabrizio Angiulli, Fabio Fassetti

Summary: This paper discusses the problem of unsupervised outlier detection in large collections of data objects with uncertainty, proposes a definition of uncertain distance-based outliers, and designs the UDBOD algorithm for outlier detection. Experimental results demonstrate the effectiveness and efficiency of the algorithm in uncertain datasets.

JOURNAL OF INTELLIGENT INFORMATION SYSTEMS (2021)

Article Computer Science, Artificial Intelligence

A High-Dimensional Outlier Detection Approach Based on Local Coulomb Force

Pengyun Zhu, Chaowei Zhang, Xiaofeng Li, Jifu Zhang, Xiao Qin

Summary: Traditional outlier detection methods are not suitable for high-dimensional data analysis due to the curse of dimensionality. Inspired by Coulomb's law, a new similarity measure vector is proposed for high-dimensional data, which incorporates outlier Coulomb force and outlier Coulomb resultant force. The algorithm effectively measures similarity and differences among data objects, and provides interpretable results with the Coulomb resultant force. The algorithm is evaluated using UCI and synthetic datasets, demonstrating its effectiveness in alleviating the curse of dimensionality and producing interpretable high-dimensional outlier data.

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2023)

Article Physics, Multidisciplinary

An Ensemble Outlier Detection Method Based on Information Entropy-Weighted Subspaces for High-Dimensional Data

Zihao Li, Liumei Zhang

Summary: This paper proposes a new outlier detection algorithm called EOEH, which improves the detection performance of high-dimensional data by utilizing random subsampling and information entropy-weighted subspaces. Through experiments, it is demonstrated that EOEH algorithm outperforms popular outlier detection algorithms in terms of detection performance and runtime efficiency.

ENTROPY (2023)

Article Multidisciplinary Sciences

A simple method for unsupervised anomaly detection: An application to Web time series data

Keisuke Yoshihara, Kei Takahashi

Summary: A simple anomaly detection method for unlabeled time series data is proposed, using log-likelihood ratio estimation and density ratio estimation. The study suggests the importance of incorporating specific information into the model for time series anomaly detection.

PLOS ONE (2022)

Article Computer Science, Information Systems

Outlier detection from multiple data sources

Yang Ma, Xujun Zhao, Chaowei Zhang, Jifu Zhang, Xiao Qin

Summary: The study proposes multi-source outlier detection techniques to reliably identify outliers in multiple datasets based on the unique characteristics of multi-source outliers; attempts to classify multi-source outliers into three types and designs multiple algorithms to improve the efficiency and accuracy of outlier detection.

INFORMATION SCIENCES (2021)

Article Computer Science, Hardware & Architecture

Minimal Rare Pattern-Based Outlier Detection Approach For Uncertain Data Streams Under Monotonic Constraints

Saihua Cai, Jinfu Chen, Haibo Chen, Chi Zhang, Qian Li, Dengzhou Shi, Wei Lin

Summary: This study proposes a novel outlier detection approach, CMRP-OD, which achieves improved accuracy and efficiency by compressing pattern scale and considering more factors.

COMPUTER JOURNAL (2023)

Article Computer Science, Information Systems

Boundary-aware local Density-based outlier detection

Fatih Aydin

Summary: Outlier detection is crucial for improving the performance of machine learning algorithms, especially in data sets with a small number of points. To address this, the proposed unsupervised method utilizes the Chebyshev inequality to draw neighborhood boundaries and detects outliers by quantifying their neighborhood densities. Experimental results demonstrate the efficacy of this approach compared to state-of-the-art methods.

INFORMATION SCIENCES (2023)

Article Physics, Multidisciplinary

Outlier Detection with Reinforcement Learning for Costly to Verify Data

Michiel Nijhuis, Iman van Lelyveld

Summary: Outliers are commonly found in data, and various algorithms exist to detect them. The verification of these outliers can determine whether they are data errors or not. However, this verification process is time-consuming and the underlying issues leading to the data error can change over time. Therefore, using reinforcement learning on a statistical outlier detection approach can optimize the detection process by adjusting the coefficients of the ensemble model with every new piece of data.

ENTROPY (2023)

Article Computer Science, Artificial Intelligence

An efficient local outlier detection optimized by rough clustering

Chunyan She, Shaohua Zeng

Summary: This article discusses the significance of outlier detection in data mining and proposes a rough clustering based outlier detection method. The method divides the dataset into subsets and considers the local distribution of objects, improving the algorithm's speed and accuracy.

JOURNAL OF INTELLIGENT & FUZZY SYSTEMS (2022)

Article Automation & Control Systems

Adaptive dual control with online outlier detection for uncertain systems

Xuehui Ma, Fucai Qian, Shiliang Zhang, Li Wu, Lei Liu

Summary: This paper proposes an adaptive dual control with outlier detection mechanism to enhance parameter estimation and control performance of uncertain systems. The improved approach is verified using simulations and evaluated in a practical scenario of fermentation sterilization process.

ISA TRANSACTIONS (2022)

Article Computer Science, Artificial Intelligence

Confidence-based and sample-reweighted test-time adaptation

Hao Yang, Min Wang, Zhengfei Yu, Hang Zhang, Jinshen Jiang, Yun Zhou

Summary: In this paper, a novel method called CSTTA is proposed for test time adaptation (TTA), which utilizes confidence-based optimization and sample reweighting to better utilize sample information. Extensive experiments demonstrate the effectiveness of the proposed method.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

A novel method for generating a canonical basis for decision implications based on object-induced three-way operators

Jin Liu, Ju-Sheng Mi, Dong-Yun Niu

Summary: This article focuses on a novel method for generating a canonical basis for decision implications based on object-induced operators (OE operators). The logic of decision implication based on OE operators is described, and a method for obtaining the canonical basis for decision implications is given. The completeness, nonredundancy, and optimality of the canonical basis are proven. Additionally, a method for generating true premises based on OE operators is proposed.

KNOWLEDGE-BASED SYSTEMS (2024)

Review Computer Science, Artificial Intelligence

Efficient utilization of pre-trained models: A review of sentiment analysis via prompt learning

Kun Bu, Yuanchao Liu, Xiaolong Ju

Summary: This paper discusses the importance of sentiment analysis and pre-trained models in natural language processing, and explores the application of prompt learning. The research shows that prompt learning is more suitable for sentiment analysis tasks and can achieve good performance.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

M-EDEM: A MNN-based Empirical Decomposition Ensemble Method for improved time series forecasting

Xiangjun Cai, Dagang Li

Summary: This paper presents a new decomposition mechanism based on learned decomposition mapping. By using a neural network to learn the relationship between original time series and decomposed results, the repetitive computation overhead during rolling decomposition is relieved. Additionally, extended mapping and partial decomposition methods are proposed to alleviate boundary effects on prediction performance. Comparative studies demonstrate that the proposed method outperforms existing RDEMs in terms of operation speed and prediction accuracy.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

Privacy-preserving trust management method based on blockchain for cross-domain industrial IoT

Xu Wu, Yang Liu, Jie Tian, Yuanpeng Li

Summary: This paper proposes a blockchain-based privacy-preserving trust management architecture, which adopts federated learning to train task-specific trust models and utilizes differential privacy to protect device privacy. In addition, a game theory-based incentive mechanism and a parallel consensus protocol are proposed to improve the accuracy of trust computing and the efficiency of consensus.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

MV-ReID: 3D Multi-view Transformation Network for Occluded Person Re-Identification

Zaiyang Yu, Prayag Tiwari, Luyang Hou, Lusi Li, Weijun Li, Limin Jiang, Xin Ning

Summary: This study introduces a 3D view-based approach that effectively handles occlusions and leverages the geometric information of 3D objects. The proposed method achieves state-of-the-art results on occluded ReID tasks and exhibits competitive performance on holistic ReID tasks.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

City-scale continual neural semantic mapping with three-layer sampling and panoptic representation

Yongliang Shi, Runyi Yang, Zirui Wu, Pengfei Li, Caiyun Liu, Hao Zhao, Guyue Zhou

Summary: Neural implicit representations have gained attention due to their expressive, continuous, and compact properties. However, there is still a lack of research on city-scale continual implicit dense mapping based on sparse LiDAR input. In this study, a city-scale continual neural mapping system with a panoptic representation is developed, incorporating environment-level and instance-level modeling. A tailored three-layer sampling strategy and category-specific prior are proposed to address the challenges of representing geometric information in city-scale space and achieving high fidelity mapping of instances under incomplete observation.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

MDSSN: An end-to-end deep network on triangle mesh parameterization

Ruihan Hu, Zhi-Ri Tang, Rui Yang, Zhongjie Wang

Summary: Mesh data is crucial for 3D computer vision applications worldwide, but traditional deep learning frameworks have struggled with handling meshes. This paper proposes MDSSN, a simple mesh computation framework that models triangle meshes and represents their shape using face-based and edge-based Riemannian graphs. The framework incorporates end-to-end operators inspired by traditional deep learning frameworks, and includes dedicated modules for addressing challenges in mesh classification and segmentation tasks. Experimental results demonstrate that MDSSN outperforms other state-of-the-art approaches.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

Semi-supervised learning with missing values imputation

Buliao Huang, Yunhui Zhu, Muhammad Usman, Huanhuan Chen

Summary: This paper proposes a novel semi-supervised conditional normalizing flow (SSCFlow) algorithm that combines unsupervised imputation and supervised classification. By estimating the conditional distribution of incomplete instances, SSCFlow facilitates imputation and classification simultaneously, addressing the issue of separated tasks ignoring data distribution and label information in traditional methods.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

Emotion-and-knowledge grounded response generation in an open-domain dialogue setting

Deeksha Varshney, Asif Ekbal, Erik Cambria

Summary: This paper focuses on the neural-based interactive dialogue system that aims to engage and retain humans in long-lasting conversations. It proposes a new neural generative model that combines step-wise co-attention, self-attention-based transformer network, and an emotion classifier to control emotion and knowledge transfer during response generation. The results from quantitative, qualitative, and human evaluation show that the proposed models can generate natural and coherent sentences, capturing essential facts with significant improvement over emotional content.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

MvTS-library: An open library for deep multivariate time series forecasting

Junchen Ye, Weimiao Li, Zhixin Zhang, Tongyu Zhu, Leilei Sun, Bowen Du

Summary: Modeling multivariate time series has long been a topic of interest for scholars in various fields. This paper introduces MvTS, an open library based on Pytorch, which provides a unified framework for implementing and evaluating these models. Extensive experiments on public datasets demonstrate the effectiveness and universality of the models reproduced by MvTS.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

An adaptive hybrid mutated differential evolution feature selection method for low and high-dimensional medical datasets

Reham R. Mostafa, Ahmed M. Khedr, Zaher Al Aghbari, Imad Afyouni, Ibrahim Kamel, Naveed Ahmed

Summary: Feature selection is crucial in classification procedures, but it faces challenges in high-dimensional datasets. To overcome these challenges, this study proposes an Adaptive Hybrid-Mutated Differential Evolution method that incorporates the mechanics of the Spider Wasp Optimization algorithm and the concept of Enhanced Solution Quality. Experimental results demonstrate the effectiveness of the method in terms of accuracy and convergence speed, and it outperforms contemporary cutting-edge algorithms.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

TCM Model for improving track sequence classification in real scenarios with Multi-Feature Fusion and Transformer Block

Ti Xiang, Pin Lv, Liguo Sun, Yipu Yang, Jiuwu Hao

Summary: This paper introduces a Track Classification Model (TCM) based on marine radar, which can effectively recognize and classify shipping tracks. By using a feature extraction network with multi-feature fusion and a dataset production method to address missing labels, the classification accuracy is improved, resulting in successful engineering application in real scenarios.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

Language model as an Annotator: Unsupervised context-aware quality phrase generation

Zhihao Zhang, Yuan Zuo, Chenghua Lin, Junjie Wu

Summary: This paper proposes a novel unsupervised context-aware quality phrase mining framework called LMPhrase, which is built upon large pre-trained language models. The framework mines quality phrases as silver labels using a parameter-free probing technique on the pre-trained language model BERT, and formalizes the phrase tagging task as a sequence generation problem by fine-tuning on the Sequence to-Sequence pre-trained language model BART. The results of extensive experiments show that LMPhrase consistently outperforms existing competitors in two different granularity phrase mining tasks.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

Stochastic Gradient Descent for matrix completion: Hybrid parallelization on shared- and distributed-memory systems

Kemal Buyukkaya, M. Ozan Karsavuran, Cevdet Aykanat

Summary: The study aims to investigate the hybrid parallelization of the Stochastic Gradient Descent (SGD) algorithm for solving the matrix completion problem on a high-performance computing platform. A hybrid parallel decentralized SGD framework with asynchronous inter-process communication and a novel flexible partitioning scheme is proposed to achieve scalability up to hundreds of processors. Experimental results on real-world benchmark datasets show that the proposed algorithm achieves 6x higher throughput on sparse datasets compared to the state-of-the-art, while achieving comparable throughput on relatively dense datasets.

KNOWLEDGE-BASED SYSTEMS (2024)
