Article

Computing exact permutation p-values for association rules

INFORMATION SCIENCES (2016)

Journal

INFORMATION SCIENCES

Volume 346, Pages 146-162

Publisher

ELSEVIER SCIENCE INC

DOI: 10.1016/j.ins.2016.01.094

Keywords

Association rule mining; Statistical significance testing; Permutation testing; Exact permutation p-value

Categories

Computer Science, Information Systems

Funding

Natural Science Foundation of China [61572094, 61501389]
Fundamental Research Funds for the Central Universities of China [DUT14QY07]
Hong Kong Research Grant Council [HKBU_22302815]
Hong Kong Baptist University [FRG2/14-15/069]

Abstract

Association rule mining is an important task in the field of data mining, and many efficient algorithms have been proposed to address this problem. However, a large portion of the rules reported by these algorithms just satisfy the user-defined constraints purely by accident, and those that are not statistically meaningful should be filtered out through statistical significance testing. In the context of association rule discovery, the permutation based approach can achieve better performance than other competitive methods, although several drawbacks of this effective approach narrow its usability. In this paper, we provide an analysis of these disadvantages and propose an algorithm called Exact Permutation p-values for Association Rules (EPAR) to calculate the exact p-values of all tested rules. Experiments on different types of data sets demonstrate that EPAR can successfully alleviate the disadvantages and outperform the direct permutation-based method over several performance measures. (C) 2016 Elsevier Inc. All rights reserved.
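
To make the contrast between a direct (sampled) permutation test and an exact permutation p-value concrete, here is a minimal Python sketch for a single rule X -> Y. It is not the EPAR algorithm from the paper, whose details are not given in the abstract. It assumes the simplest setting: the test statistic is the number of transactions containing both X and Y, and the null distribution comes from permuting the Y labels across transactions, in which case the exact permutation p-value is a hypergeometric upper tail. The function names and the parameters `n`, `n_x`, `n_y`, and `n_xy` are illustrative assumptions.

```python
import random
from math import comb


def exact_permutation_p_value(n, n_x, n_y, n_xy):
    """Exact upper-tail p-value for a rule X -> Y (illustrative, not EPAR).

    n    : total number of transactions
    n_x  : transactions containing the antecedent X
    n_y  : transactions containing the consequent Y
    n_xy : transactions containing both X and Y (observed support count)

    Under random permutation of the Y labels, the joint count follows a
    hypergeometric distribution, so the tail can be summed directly.
    """
    total = comb(n, n_x)
    tail = sum(
        comb(n_y, k) * comb(n - n_y, n_x - k)
        for k in range(n_xy, min(n_x, n_y) + 1)
    )
    return tail / total


def monte_carlo_p_value(n, n_x, n_y, n_xy, num_permutations=2000, seed=0):
    """Direct permutation estimate of the same p-value by shuffling labels."""
    rng = random.Random(seed)
    labels = [1] * n_y + [0] * (n - n_y)
    hits = 0
    for _ in range(num_permutations):
        rng.shuffle(labels)
        # Treat the first n_x positions as the transactions containing X.
        if sum(labels[:n_x]) >= n_xy:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (num_permutations + 1)


if __name__ == "__main__":
    exact = exact_permutation_p_value(n=10, n_x=5, n_y=5, n_xy=4)
    sampled = monte_carlo_p_value(n=10, n_x=5, n_y=5, n_xy=4)
    print(f"exact={exact:.5f} sampled={sampled:.5f}")
```

The sampled estimate varies with the seed and the number of permutations and cannot resolve p-values much smaller than one over the number of permutations, whereas the closed-form tail does not depend on sampling. Those are the kinds of drawbacks of a direct permutation approach that an exact computation avoids in this simplified setting.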

Recommended

Article Multidisciplinary Sciences

PCAtest: testing the statistical significance of Principal Component Analysis in R

Arley Camargo

Summary: Principal Component Analysis (PCA) is a widely used statistical method for ordination and dimensionality reduction of multivariate datasets. In this article, the author introduces the importance of PCA and presents the PCAtest package, which implements permutation-based statistical tests to evaluate the significance of PCA and the contributions of variables to the PC axes. The author encourages R users to routinely apply PCAtest for testing the significance of their PCA before interpreting PC axes and utilizing PC scores in subsequent analyses.

PEERJ (2022)

Article Clinical Neurology

Misinterpretations of Null Hypothesis Significance Testing Results Near the P-Value Threshold in the Neurosurgical Literature

Najib E. El Tecle, Jorge F. Urquiaga, Samuel T. Griffin, Georgios Alexopoulos, Tarek Y. El Ahmadieh, Salah G. Aoun, Tobias A. Mattei

Summary: The study revealed that misinterpretations of null hypothesis significance testing results near the P-value threshold are present in at least 1% of neurosurgical literature. While most statistical errors may be unintentional, additional measures should be implemented to prevent the future adoption of such undesirable methodological practices among researchers.

WORLD NEUROSURGERY (2022)

Article Computer Science, Interdisciplinary Applications

Exact inference around ordinal measures of association is often not exact

Alan D. Hutson, Han Yu

Summary: In this paper, we extend the permutation test approach based on the Pearson correlation coefficient to ordinal measures of association, building upon the work of DiCiccio and Romano (2017). We investigate commonly used ordinal measures, such as the Spearman correlation, Kendall's tau-b, and gamma, and find that asymptotically correct tests perform well for moderate to large sample sizes. Our findings align with previous research, indicating that exact permutation tests based on ordinal measures of association are often not exact.

COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE (2023)

Review Computer Science, Artificial Intelligence

A comprehensive review of visualization methods for association rule mining: Taxonomy, challenges, open problems and future ideas

Iztok Fister Jr, Iztok Fister, Dusan Fister, Vili Podgorelec, Sancho Salcedo-Sanz

Summary: Association rule mining aims to search for relationships between attributes in transaction databases. The process involves pre-processing techniques, rule mining, and post-processing with visualization. This review paper provides a literature review and analysis of techniques, applications, and future research directions in association rule mining and visualization.

EXPERT SYSTEMS WITH APPLICATIONS (2023)

Editorial Material Oncology

Assessing risk factors with information beyond P value thresholds: Statistical significance does not equal clinical importance

Mary E. Putt

Summary: The statistical significance of a risk factor is influenced by sample size and the distributions of outcome and predictor variables. Paying closer attention to confidence intervals and visual displays can lead to a more comprehensive understanding of data analysis results.

CANCER (2021)

Article Computer Science, Information Systems

An annotated association mining approach for extracting and visualizing interesting clinical events

Aashara Shrestha, Dimitrios Zikos, Leonidas Fegaras

Summary: This work aims to derive interesting clinical events using association rule mining based on a user-annotated order of clinical features. The plugin algorithm scans the database to calculate the support of item sequences in line with the user-annotated feature order. It generates rules efficiently and organizes them into meaningful hierarchies to unfold interesting clinical events.

INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS (2021)

Article Computer Science, Artificial Intelligence

Association rule mining using fuzzy logic and whale optimization algorithm

S. Sharmila, S. Vijayarani

Summary: Association rule mining is a well-known data mining scheme used to discover commonly co-occurred itemsets, with frequent item recognition and association rule generation being key steps. Various algorithms have been developed by researchers to generate association rules, with fuzzy logic incorporated for uncovering recurrent itemsets and interesting fuzzy association rules. Dimensionality reduction techniques are proposed to effectively identify significant transactions and items from databases, while the efficiency of the proposed algorithm is compared with other optimization techniques for frequent item identification and rule generation.

SOFT COMPUTING (2021)

Article Automation & Control Systems

Classification Technique and its Combination with Clustering and Association Rule Mining in Educational Data Mining-A survey

Sunita M. Dol, Pradip M. Jawandhiya

Summary: Educational data mining (EDM) applies data mining techniques in the field of education to classify, analyze, and predict students' academic performance, dropout rate, and instructors' performance. This review article analyzes 142 research articles from 2010 to 2020 and discusses the current developments in EDM in 2021 and 2022. It presents the use of classification techniques, clustering algorithms, association rule algorithms, regression techniques, and ensemble techniques in EDM. The article also compares different classification techniques and identifies research gaps for future improvement in the teaching-learning process.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2023)

Article Medicine, General & Internal

Misinterpretations of Significance Testing Results Near the P-Value Threshold in the Urologic Literature

Pranay R. Manda, Manish Kuchakulla, Gabrielle Hochu, Pranav Mudiam, Arjun Watane, Ali Syed, Armin Ghomeshi, Ranjith Ramasamy

Summary: This study evaluated abstracts from 15 urology journals published between 2000 and 2021 and found a common statistical mistake of misconstruing non-significant data as trending toward significance. The word "trend" was used 572 times to describe such non-statistically significant data. There was a statistically significant difference in the error rates between different journals, and there was a moderate correlation between the number of articles published and the frequency of misuses of the word "trend".

CUREUS JOURNAL OF MEDICAL SCIENCE (2023)

Article Engineering, Industrial

Pattern investigation of total loss maritime accidents based on association rule mining

He Lan, Xiaoxue Ma, Laihao Ma, Weiliang Qiao

Summary: Total loss of a ship is the most serious consequence of maritime accidents, causing massive property losses, human casualties, and environmental pollution. This study investigates significant patterns in total loss accidents using association rule technique and finds that ship age and accident type are key indicators.

RELIABILITY ENGINEERING & SYSTEM SAFETY (2023)

Article Green & Sustainable Science & Technology

Association Rule Mining-Based Generalized Growth Mode Selection: Maximizing the Value of Retired Mechanical Parts

Yuyao Guo, Lei Wang, Zelin Zhang, Jianhua Cao, Xuhui Xia

Summary: Due to the inability to restore the original performance, retired mechanical products are often replaced and discarded or recycled, resulting in energy waste and decreased residual value. The generalized growth remanufacturing model (GGRM) offers a solution to enhance residual value by incorporating a wider range of growth modes. However, suitable methods for selecting growth modes in GGRM are limited. Therefore, we propose a growth mode selection method based on association rule mining and conduct a case study to demonstrate its feasibility, efficiency, and accuracy.

SUSTAINABILITY (2023)

Article Computer Science, Artificial Intelligence

Fast Top-K association rule mining using rule generation property pruning

Xiangyu Liu, Xinzheng Niu, Philippe Fournier-Viger

Summary: This study introduces a new algorithm called FTARM, which efficiently finds the top-k association rules using Rule Generation Property Pruning and a novel candidate pruning property, leading to significant reductions in association rule mining time and memory usage. FTARM exhibits good scalability and can benefit various applications.

APPLIED INTELLIGENCE (2021)

Article Meteorology & Atmospheric Sciences

Testing for Trends on a Regional Scale: Beyond Local Significance

Radan Huth, Martin Dubrovsky

Summary: This study focuses on the statistical significance of trends in climate elements defined at a regional scale, comparing different detection methods. The sign test and extended Mann-Kendall test perform slightly better under low autocorrelation conditions, while all tests show similar performance under high autocorrelation conditions.

JOURNAL OF CLIMATE (2021)

Article Engineering, Environmental

Analysing the effects of culture parameters on wastewater treatment capability of microalgae through association rule mining

Vishal Singh, Vishal Mishra

Summary: Association rule mining was used in this study to identify specific conditions for enhancing microalgae growth in wastewater, including CO2 content, light intensity, initial inoculum level, and N/P ratio. The general rules derived from this mining process showed that optimizing these parameters can increase biomass productivity and nutrient removal efficiency. These findings are important for future experimental design and large-scale implementation of microalgae-based wastewater treatment process.

JOURNAL OF ENVIRONMENTAL CHEMICAL ENGINEERING (2022)

Article Computer Science, Artificial Intelligence

MICAR: nonlinear association rule mining based on maximal information coefficient

Maidi Liu, Zhiwei Yang, Yong Guo, Jiang Jiang, Kewei Yang

Summary: Association rule mining (ARM) is an important research topic in data mining and knowledge discovery. This paper proposes a nonlinear ARM method called MICAR based on the maximal information coefficient (MIC), which can effectively extract high-quality positive and negative association rules, especially nonlinear association rules.

KNOWLEDGE AND INFORMATION SYSTEMS (2022)

Article Biochemistry & Molecular Biology

A tissue-specific collaborative mixed model for jointly analyzing multiple tissues in transcriptome-wide association studies

Xingjie Shi, Xiaoran Chai, Yi Yang, Qing Cheng, Yuling Jiao, Haoyue Chen, Jian Huang, Can Yang, Jin Liu

NUCLEIC ACIDS RESEARCH (2020)

Article Genetics & Heredity

Accurate genetic and environmental covariance estimation with composite likelihood in genome-wide association studies

Boran Gao, Can Yang, Jin Liu, Xiang Zhou

Summary: The new computational method GECKO improves the accuracy of estimating genetic and environmental covariances in GWAS, revealing shared genetic and environmental structures between traits and aiding in the investigation of causal relationships. Compared to traditional methods, GECKO provides more accurate estimates and identifies significant genetic and environmental covariances, demonstrating a twofold power gain in analyzing trait pairs.

PLOS GENETICS (2021)

Article Biochemical Research Methods

XPXP: improving polygenic prediction by cross-population and cross-phenotype analysis

Jiashun Xiao, Mingxuan Cai, Xianghong Hu, Xiang Wan, Gang Chen, Can Yang

Summary: This article presents a cross-population and cross-phenotype method for constructing accurate polygenic risk scores (PRSs) in under-represented populations. By leveraging datasets from European populations and genetically correlated phenotypes, this method improves the accuracy of PRSs in non-European populations and enhances disease prediction and prevention in personalized medicine.

BIOINFORMATICS (2022)

Article Biochemical Research Methods

Significance-Based Essential Protein Discovery

Yan Liu, Hao Liang, Quan Zou, Zengyou He

Summary: The identification of essential proteins is an important problem in bioinformatics. Existing methods have limitations in providing context-free and easily interpretable quantifications of centrality values, specifying proper thresholds, and controlling the quality of reported essential proteins. To overcome these limitations, this study formulates the essential protein discovery problem as a multiple hypothesis testing problem and presents a significance-based method named SigEP. Experimental results demonstrate that SigEP outperforms competing algorithms.

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (2022)

Article Biochemical Research Methods

Essential Protein Recognition via Community Significance

Yan Liu, Wenfang Chen, Zengyou He

Summary: The study introduces a new significance-based essential protein recognition method named EPCS, which outperforms current state-of-the-art essential protein identification methods and the only significance-based method SigEP.

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (2021)

Article Multidisciplinary Sciences

On the statistical significance of communities from weighted graphs

Zengyou He, Wenfang Chen, Xiaoqi Wei, Yan Liu

Summary: Community detection is a fundamental procedure in analyzing network data and the definition of a community remains a topic of debate. This study presents a new formulation for testing the realness of communities in weighted networks by modeling edge-weights as censored observations. By conducting Logrank tests on internal and external weight sets, the method outperforms existing evaluation metrics in individual community validation.

SCIENTIFIC REPORTS (2021)

Article Mathematical & Computational Biology

scPI: A Scalable Framework for Probabilistic Inference in Single-Cell RNA-Sequencing Data Analysis

Jingsi Ming, Jia Zhao, Can Yang

Summary: The technique of single-cell RNA-sequencing has allowed researchers to explore the cellular heterogeneity of complex tissues. In this study, a scalable framework called scPI was proposed to analyze scRNA-seq data. The scPI framework utilizes amortized variational inference and a nonlinear neural network to infer the low-dimensional representations of the data. Through analysis of real datasets, it was demonstrated that scPI can effectively handle various probabilistic models for scRNA-seq data in terms of scalability, missing value imputation, and cell type clustering.

STATISTICS IN BIOSCIENCES (2023)

Article Multidisciplinary Sciences

Mendelian randomization for causal inference accounting for pleiotropy and sample structure using genome-wide summary statistics

Xianghong Hu, Jia Zhao, Zhixiang Lin, Yang Wang, Heng Peng, Hongyu Zhao, Xiang Wan, Can Yang

Summary: Mendelian randomization (MR) is a valuable tool for inferring causal relationships among traits using summary statistics from GWASs, but existing methods often rely on strong assumptions leading to false-positive findings. Research has shown that considering pleiotropy and sample structure is crucial for reducing confounding effects.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2022)

Letter Biochemistry & Molecular Biology

Organoid-based single-cell spatiotemporal gene expression landscape of human embryonic development and hematopoiesis

Yiming Chao, Yang Xiang, Jiashun Xiao, Weizhong Zheng, Mo R. Ebrahimkhani, Can Yang, Angela Ruohao Wu, Pentao Liu, Yuanhua Huang, Ryohichi Sugimura

SIGNAL TRANSDUCTION AND TARGETED THERAPY (2023)

Article Biochemical Research Methods

PALM: a powerful and adaptive latent model for prioritizing risk variants with functional annotations

Xinyi Yu, Jiashun Xiao, Mingxuan Cai, Yuling Jiao, Xiang Wan, Jin Liu, Can Yang

Summary: The findings from genome-wide association studies have greatly helped us understand the genetic basis of human complex traits and diseases. However, several major challenges still need to be addressed, including the unknown biological functions of most GWAS hits and the identification of genetic risk variants with weak effects. To overcome these challenges, we propose a powerful and adaptive latent model (PALM) that integrates functional annotations with GWAS summary statistics.

BIOINFORMATICS (2023)

Article Biochemical Research Methods

stVAE deconvolves cell-type composition in large-scale cellular resolution spatial transcriptomics

Chen Li, Ting-Fung Chan, Can Yang, Zhixiang Lin

Summary: The study introduces a method called stVAE, based on the variational autoencoder framework, to deconvolve the cell-type composition of cellular resolution spatial transcriptomic datasets. It accurately identifies spatial patterns of cell types and their relative proportions across spots.

BIOINFORMATICS (2023)

Article Materials Science, Multidisciplinary

Ultralong mean free path phonons in HKUST-1 and their scattering by water adsorbates

Hongzhao Fan, Can Yang, Yanguang Zhou

Summary: Metal-organic frameworks (MOFs) have shown potential in energy storage and thermal management. By studying HKUST-1, a typical MOF, we found that its thermal conductivity is strongly size dependent, but decreases when water molecules are adsorbed. We also discovered two thermal energy exchange pathways in HKUST-1 with water molecules, and the thermal conductivity varies with the quantity of adsorbates due to the competition between these pathways.

PHYSICAL REVIEW B (2022)

Article Computer Science, Artificial Intelligence

Detecting Statistically Significant Communities

Zengyou He, Hao Liang, Zheng Chen, Can Zhao, Yan Liu

Summary: Community detection is a key data analysis problem, and many algorithms have been proposed. However, most work does not consider statistical significance. This article presents a tight upper bound on the p-value of a single community and a local search method for detecting statistically significant communities. Experimental results show its comparability with other methods.

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2022)

Article Computer Science, Artificial Intelligence

A graph-traversal approach to identify influential nodes in a network

Yan Liu, Xiaoqi Wei, Wenfang Chen, Lianyu Hu, Zengyou He

Summary: This method utilizes a breadth-first search tree to generate a curve for calculating the influence score of nodes, demonstrating superiority over widely used centrality measures in various network domains.

PATTERNS (2021)

Article Computer Science, Information Systems

Instance-Based Classification Through Hypothesis Testing

Zengyou He, Chaohua Sheng, Yan Liu, Quan Zou

Summary: This paper presents a generic framework that formulates the binary classification problem as a two-sample testing problem, which is based on instances and hypothesis testing. Experimental results show that the method achieves performance comparable to classic classifiers and outperforms existing testing-based classifiers.

IEEE ACCESS (2021)

Article Computer Science, Information Systems

A consensus model considers managing manipulative and overconfident behaviours in large-scale group decision-making

Xia Liang, Jie Guo, Peide Liu

Summary: This paper investigates a novel consensus model based on social networks to manage manipulative and overconfident behaviors in large-scale group decision-making. By proposing a novel clustering model and improved methods, the consensus reaching is effectively facilitated. The feedback mechanism and management approach are employed to handle decision makers' behaviors. Simulation experiments and comparative analysis demonstrate the effectiveness of the model.

INFORMATION SCIENCES (2024)

Article Computer Science, Information Systems

CGN: Class gradient network for the construction of adversarial samples

Xiang Li, Haiwang Guo, Xinyang Deng, Wen Jiang

Summary: This paper proposes a method based on class gradient networks for generating high-quality adversarial samples. By introducing a high-level class gradient matrix and combining classification loss and perturbation loss, the method demonstrates superiority in the transferability of adversarial samples on targeted attacks.

INFORMATION SCIENCES (2024)

Article Computer Science, Information Systems

Distinguishing latent interaction types from implicit feedbacks for recommendation

Lingyun Lu, Bang Wang, Zizhuo Zhang, Shenghao Liu

Summary: Many recommendation algorithms only rely on implicit feedbacks due to privacy concerns. However, the encoding of interaction types is often ignored. This paper proposes a relation-aware neural model that classifies implicit feedbacks by encoding edges, thereby enhancing recommendation performance.

INFORMATION SCIENCES (2024)

Article Computer Science, Information Systems

Proximity-based density description with regularized reconstruction algorithm for anomaly detection

Jaehong Yu, Hyungrok Do

Summary: This study discusses unsupervised anomaly detection using one-class classification, which determines whether a new instance belongs to the target class by constructing a decision boundary. The proposed method uses a proximity-based density description and a regularized reconstruction algorithm to overcome the limitations of existing one-class classification methods. Experimental results demonstrate the superior performance of the proposed algorithm.

INFORMATION SCIENCES (2024)

Article Computer Science, Information Systems

Non-iterative border-peeling clustering algorithm based on swap strategy

Hui Tu, Shifei Ding, Xiao Xu, Haiwei Hou, Chao Li, Ling Ding

Summary: Border-Peeling algorithm is a density-based clustering algorithm, but its complexity and issues on unbalanced datasets restrict its application. This paper proposes a non-iterative border-peeling clustering algorithm, which improves the clustering performance by distinguishing and associating core points and border points.

INFORMATION SCIENCES (2024)

Article Computer Science, Information Systems

A two-stage denoising framework for zero-shot learning with noisy labels

Long Tang, Pan Zhao, Zhigeng Pan, Xingxing Duan, Panos M. Pardalos

Summary: In this work, a two-stage denoising framework (TSDF) is proposed for zero-shot learning (ZSL) to address the issue of noisy labels. The framework includes a tailored loss function to remove suspected noisy-label instances and a ramp-style loss function to reduce the negative impact of remaining noisy labels. In addition, a dynamic screening strategy (DSS) is developed to efficiently handle the nonconvexity of the ramp-style loss.

INFORMATION SCIENCES (2024)

Article Computer Science, Information Systems

Selection of a viable blockchain service provider for data management within the internet of medical things: An MCDM approach to Indian healthcare

Raghunathan Krishankumar, Sundararajan Dhruva, Kattur S. Ravichandran, Samarjit Kar

Summary: Health 4.0 is gaining global attention for better healthcare through digital technologies. This study proposes a new decision-making framework for selecting viable blockchain service providers in the Internet of Medical Things (IoMT). The framework addresses the limitations in previous studies and demonstrates its applicability in the Indian healthcare sector. The results show the top ranking BSPs, the importance of various criteria, and the effectiveness of the developed model.

INFORMATION SCIENCES (2024)

Article Computer Science, Information Systems

Q-learning with heterogeneous update strategy

Tao Tan, Hong Xie, Liang Feng

Summary: This paper proposes a heterogeneous update idea and designs HetUp Q-learning algorithm to enlarge the normalized gap by overestimating the Q-value corresponding to the optimal action and underestimating the Q-value corresponding to the other actions. To address the limitation, a softmax strategy is applied to estimate the optimal action, resulting in HetUpSoft Q-learning and HetUpSoft DQN. Extensive experimental results show significant improvements over SOTA baselines.

INFORMATION SCIENCES (2024)

Article Computer Science, Information Systems

Dyformer: A dynamic transformer-based architecture for multivariate time series classification

Chao Yang, Xianzhi Wang, Lina Yao, Guodong Long, Guandong Xu

Summary: This paper proposes a dynamic transformer-based architecture called Dyformer for multivariate time series classification. Dyformer captures multi-scale features through hierarchical pooling and adaptive learning strategies, and improves model performance by introducing feature-map-wise attention mechanisms and a joint loss function.

INFORMATION SCIENCES (2024)

Article Computer Science, Information Systems

ESSENT: an arithmetic optimization algorithm with enhanced scatter search strategy for automated test case generation

Xiguang Li, Baolu Feng, Yunhe Sun, Ammar Hawbani, Saeed Hammod Alsamhi, Liang Zhao

Summary: This paper proposes an enhanced scatter search strategy, using opposition-based learning, to solve the problem of automated test case generation based on path coverage (ATCG-PC). The proposed ESSENT algorithm selects the path with the lowest path entropy among the uncovered paths as the target path and generates new test cases to cover the target path by modifying the dimensions of existing test cases. Experimental results show that the ESSENT algorithm outperforms other state-of-the-art algorithms, achieving maximum path coverage with fewer test cases.

INFORMATION SCIENCES (2024)

Article Computer Science, Information Systems

An attention based approach for automated account linkage in federated identity management

Shirin Dabbaghi Varnosfaderani, Piotr Kasprzak, Aytaj Badirova, Ralph Krimmel, Christof Pohl, Ramin Yahyapour

Summary: Linking digital accounts belonging to the same user is crucial for security, user satisfaction, and next-generation service development. However, research on account linkage is mainly focused on social networks, and there is a lack of studies in other domains. To address this, we propose SmartSSO, a framework that automates the account linkage process by analyzing user routines and behavior during login processes. Our experiments on a large dataset show that SmartSSO achieves over 98% accuracy in hit-precision.

INFORMATION SCIENCES (2024)

Article Computer Science, Information Systems

A memetic algorithm with fuzzy-based population control for the joint order batching and picker routing problem

Renchao Wu, Jianjun He, Xin Li, Zuguo Chen

Summary: This paper proposes a memetic algorithm with fuzzy-based population control (MA-FPC) to solve the joint order batching and picker routing problem (JOBPRP). The algorithm incorporates batch exchange crossover and a two-level local improvement procedure. Experimental results show that MA-FPC outperforms existing algorithms in terms of solution quality.

INFORMATION SCIENCES (2024)

Article Computer Science, Information Systems

Refining one-class representation: A unified transformer for unsupervised time-series anomaly detection

Guoxiang Zhong, Fagui Liu, Jun Jiang, Bin Wang, C. L. Philip Chen

Summary: In this study, we propose the AMFormer framework to address the problem of mixed normal and anomaly samples in deep unsupervised time-series anomaly detection. By refining the one-class representation and introducing the masked operation mechanism and cost sensitive learning theory, our approach significantly improves anomaly detection performance.

INFORMATION SCIENCES (2024)

Article Computer Science, Information Systems

A data-driven optimisation method for a class of problems with redundant variables and indefinite objective functions

Jin Zhou, Kang Zhou, Gexiang Zhang, Ferrante Neri, Wangyang Shen, Weiping Jin

Summary: In this paper, the authors focus on the issue of multi-objective optimisation problems with redundant variables and indefinite objective functions (MOPRVIF) in practical problem-solving. They propose a dual data-driven method for solving this problem, which consists of eliminating redundant variables, constructing objective functions, selecting evolution operators, and using a multi-objective evolutionary algorithm. The experiments conducted on two different problem domains demonstrate the effectiveness, practicality, and scalability of the proposed method.

INFORMATION SCIENCES (2024)

Article Computer Science, Information Systems

A Monte Carlo fuzzy logistic regression framework against imbalance and separation

Georgios Charizanos, Haydar Demirhan, Duygu Icen

Summary: This article proposes a new fuzzy logistic regression framework that addresses the problems of separation and imbalance while maintaining the interpretability of classical logistic regression. By fuzzifying binary variables and classifying subjects based on a fuzzy threshold, the framework demonstrates superior performance on imbalanced datasets.

INFORMATION SCIENCES (2024)
