Article
Computer Science, Artificial Intelligence
Zitai Wang, Qianqian Xu, Zhiyong Yang, Yuan He, Xiaochun Cao, Qingming Huang
Summary: Existing literature on top-$k$ optimization mainly focuses on the optimization method of the top-$k$ objective, but neglects the limitations of the metric itself. To address this issue, a novel metric named partial Area Under the top-$k$ Curve (AUTKC) is proposed, which has better discrimination ability and does not allow irrelevant labels to appear in the top list. Experimental results on benchmark datasets validate the effectiveness of the proposed framework.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)
Article
Environmental Sciences
Junying Cheng, Xiaoai Dai, Zekun Wang, Jingzhong Li, Ge Qu, Weile Li, Jinxing She, Youlin Wang
Summary: This study analyzed landslide susceptibility in the Three Gorges Reservoir region of the Yangtze River using machine learning models. The results identified five categories of influencing factors and showed that the SVM model performed best in terms of generalization ability and robustness, making it suitable for real-time assessment of regional landslide susceptibility.
Article
Statistics & Probability
M. Mahdizadeh, Ehsan Zamanzade
Summary: The article introduces a method for estimating the effectiveness of a biomarker under a multistage ranked set sampling design and develops a nonparametric estimator based on kernel density estimation. Simulation studies show that this estimator can be more efficient than its counterpart under simple random sampling.
STATISTICAL PAPERS
(2021)
Article
Computer Science, Information Systems
Hong Zhang, Yifan Zhang
Summary: In this study, an improved Sparrow Search Algorithm Support Vector Machine (ISSA-SVM) algorithm is proposed to optimize the SVM kernel parameters. Experimental results show that ISSA converges faster, searches more accurately, and escapes local extrema more easily than the SSA, GWO, and WOA algorithms. ISSA also shows better robustness and stronger competitiveness. On the coal gangue dataset, the classification accuracy of the ISSA-SVM algorithm is improved by 7.09% and 4.25% compared with SVM and SSA-SVM, respectively. Meanwhile, the classification time is reduced by 20.15% and 13.74% compared with SVM and SSA-SVM, respectively.
Article
Multidisciplinary Sciences
Wen Bo Liu, Sheng Nan Liang, Xi Wen Qin
Summary: In this paper, a dimension-reduction algorithm based on WKPCA is proposed to improve the classification performance of gene expression data by combining multiple kernel functions, constructing t-class kernel functions, and using various machine learning methods. The results show that WKPCA can effectively enhance the classification performance compared to other dimension reduction techniques.
Article
Pharmacology & Pharmacy
Lanyan Fang, Ramana Uppoor, Mingjiang Xu, Satish Sharan, Hao Zhu, Nilufer Tampal, Bing Li, Lei Zhang, Robert Lionberger, Liang Zhao
Summary: Assessing relative bioavailability (BA) or bioequivalence (BE) between two products based solely on peak drug concentration and total exposure may be insufficient, leading regulatory agencies to recommend the use of partial AUC (pAUC) as an additional exposure measure. Key aspects outlined in the FDA guidance for generic drug development are the focus on quantifying exposures over specific time intervals to support the determination of relative BA or BE and the rationale behind using pAUCs.
CLINICAL PHARMACOLOGY & THERAPEUTICS
(2021)
Article
Computer Science, Artificial Intelligence
Pablo A. Jaskowiak, Ivan G. Costa, Ricardo J. G. B. Campello
Summary: This paper explores the use of AUC as a performance measure in the unsupervised learning domain, specifically in cluster analysis. It discusses the use of AUC as an internal/relative measure of clustering quality, referred to as AUCC, and shows that AUCC has an expected value under a null model of random clustering solutions. It also reveals that AUCC is a linear transformation of the Gamma criterion.
DATA MINING AND KNOWLEDGE DISCOVERY
(2022)
Article
Health Care Sciences & Services
Yingdong Feng, Lili Tian
Summary: This paper examines the potentially misleading effects of commonly used pooling strategies in biomarker evaluation and proposes a new diagnostic framework and accuracy measures for such settings. An ovarian cancer dataset is analyzed to demonstrate these concepts.
STATISTICAL METHODS IN MEDICAL RESEARCH
(2021)
Article
Pharmacology & Pharmacy
Andre J. Jackson, Henry C. Foehl
Summary: This study compared the performance of pAUC metrics in establishing bioequivalence using the standard crossover design versus a replicated design, and examined the relationship between pAUC metrics and pharmacodynamic metrics. The results showed that pAUC metrics were sensitive to changes in drug ratios and were highly correlated with pharmacodynamic metrics.
Article
Mathematical & Computational Biology
Dingding Hu, Meng Yuan, Tao Yu, Pengfei Li
Summary: This article mathematically interprets the application of the receiver operating characteristic (ROC) curve in medical research and proposes a method to estimate the ROC curve and associated summary statistics. The performance of the method is compared with competitive methods through numerical studies and illustrated with a real-data example.
STATISTICS IN MEDICINE
(2023)
Article
Computer Science, Software Engineering
Krzysztof Gajowniczek, Tomasz Zabkowski
Summary: This paper introduces a novel R package, ImbTreeAUC, for building binary and multiclass decision trees using the area under the ROC curve, which can handle imbalanced data and support cost-sensitive and weight-sensitive learning.
Article
Veterinary Sciences
Jinji Pang, Wangqian Ju, Michael Welch, Phillip Gauger, Peng Liu, Qijing Zhang, Chong Wang
Summary: Developing and evaluating novel diagnostic assays are crucial in diagnostic research. The ROC curve and AUC are commonly used to evaluate the performance of diagnostic assays. This paper proposes two novel methods, cluster bootstrapping and hierarchical bootstrapping, to calculate the confidence interval of the AUC for correlated diagnostic test data. Simulation studies show that these methods have higher coverage probabilities compared to the traditional method when there are intra-subject correlations.
FRONTIERS IN VETERINARY SCIENCE
(2023)
Article
Mathematics, Applied
Alexej Gossmann, Aria Pezeshk, Yu-Ping Wang, Berkman Sahiner
Summary: The performance evaluation of constantly evolving machine learning algorithms, especially in high-risk fields such as medicine, faces new challenges. Reusing the same test dataset can lead to overfitting and overly optimistic conclusions about algorithm performance. A modified holdout mechanism shows potential in reducing overfitting issues.
SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE
(2021)
Article
Construction & Building Technology
Yifan Huang, Yu Lei, Xuedong Luo, Chao Fu
Summary: Concrete made from rice husk ash (RHA) is stronger and more durable than normal concrete, and can help reduce greenhouse gas emissions. In this study, the compressive strength of RHA concrete is predicted using support vector regression (SVR) combined with three optimization algorithms. The results show that all three optimization algorithms improve the prediction performance of SVR, with the firefly algorithm (FA) achieving the best results.
CASE STUDIES IN CONSTRUCTION MATERIALS
(2023)
Article
Mathematical & Computational Biology
Yougui Wu
Summary: This study explores the optimal two-phase sampling design for evaluating the performance of an ordinal test in classifying disease status. Simulation results show that two-phase sampling under optimal probabilities can substantially reduce the variance of the AUC estimator, with oversampling of subjects at low and high ordinal levels. Compared to proportional allocation, this approach improves efficiency.
STATISTICS IN MEDICINE
(2021)