Article
Computer Science, Information Systems
Roberto Carlos Morales-Hernandez, Joaquin Gutierrez Jaguey, David Becerra-Alonso
Summary: The classification of scientific articles aligned to Sustainable Development Goals is crucial for research institutions and universities. This study applies Natural Language Processing techniques to classify articles according to the 17 Sustainable Development Goals, comparing the performance of different text classification models.
Article
Computer Science, Information Systems
Jie Xiong, Li Yu, Xi Niu, Youfang Leng
Summary: This paper proposes XRR, a novel two-stage framework for extreme multi-label text classification (XMTC), to address the drawbacks of existing methods. In the retrieving stage, two retrieval strategies extract candidates from the massive label set. In the ranking stage, a deep ranking model built on a pre-trained transformer distinguishes true labels from the candidates. Extensive experiments show that XRR outperforms state-of-the-art methods on five widely used multi-label datasets.
INFORMATION SCIENCES
(2023)
Article
Computer Science, Information Systems
Qing Wang, Jia Zhu, Hongji Shu, Kwame Omono Asamoah, Jianyang Shi, Cong Zhou
Summary: Extreme multi-label text classification (XMTC) is an emerging and essential task in natural language processing. This paper proposes a novel guide network (GUDN) with a label reinforcement strategy based on label semantics to help fine-tune pre-trained models for classification. Experimental results demonstrate that GUDN outperforms state-of-the-art methods on Eurlex-4k and achieves competitive results on other popular datasets. A further experiment finds that meaningless tokens can harm the classification accuracy of Transformer-based models.
JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES
(2023)
Article
Chemistry, Analytical
Wenfu Liu, Jianmin Pang, Qiming Du, Nan Li, Shudan Yang
Summary: Short text representation is a basic and key task in natural language processing. The traditional approach of merging a bag-of-words model with a topic model may lead to ambiguous semantic information and sparse topic information. The proposed method instead fuses word embeddings with extended topic information to highlight important semantic information and improve short text representation. Test results verify the effectiveness of the approach.
Article
Computer Science, Artificial Intelligence
Wenfu Liu, Jianmin Pang, Nan Li, Xin Zhou, Feng Yue
Summary: Single-label classification is insufficient for many text classification tasks, which makes multi-label text classification important in NLP. This paper introduces a tALBERT-CNN method that uses an LDA topic model and the ALBERT model to extract semantic features and train a multi-label classifier, outperforming existing algorithms on standard datasets.
INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS
(2021)
Article
Computer Science, Artificial Intelligence
Yinglong Ma, Xiaofeng Liu, Lijiao Zhao, Yue Liang, Peng Zhang, Beihong Jin
Summary: This paper introduces a hierarchical multi-label text classification (HMTC) method based on hybrid embedding, which combines graph embedding and word embedding. The method takes a level-by-level HMTC approach, and extensive experiments on five large-scale real-world datasets show that it is competitive in classification accuracy.
EXPERT SYSTEMS WITH APPLICATIONS
(2022)
Article
Computer Science, Artificial Intelligence
Boyan Wang, Xuegang Hu, Peipei Li, Philip S. Yu
Summary: The paper proposes a unified framework, Hierarchical Cognitive Structure Learning Model (HCSM), for handling hierarchical multi-label text classification (HMLTC) tasks. This model comprehensively utilizes partial new knowledge and global hierarchical label structure, demonstrating superior performance in experimental results on four benchmark datasets.
KNOWLEDGE-BASED SYSTEMS
(2021)
Article
Computer Science, Artificial Intelligence
Xinyi Zhang, Jiahao Xu, Charlie Soh, Lihui Chen
Summary: Hierarchical multi-label text classification (HMTC) has gained popularity in real-world applications. The proposed LA-HCN model outperforms other state-of-the-art neural network algorithms in HMTC, providing explainability by visualizing learned attention and extracting meaningful information corresponding to different labels.
EXPERT SYSTEMS WITH APPLICATIONS
(2022)
Article
Computer Science, Artificial Intelligence
Yuefeng Liang, Cho-Jui Hsieh, Thomas C. M. Lee
Summary: Extreme multi-label classification aims to learn a classifier that annotates instances with relevant labels from a large label set. Existing approaches have high computational costs; the proposed Block-wise Partitioning (BP) pretreatment reduces prediction time without sacrificing prediction accuracy. Experimental results on benchmark data sets demonstrate the effectiveness of the BP pretreatment.
DATA MINING AND KNOWLEDGE DISCOVERY
(2023)
Article
Computer Science, Artificial Intelligence
Rui Song, Zelong Liu, Xingbing Chen, Haining An, Zhiqi Zhang, Xiaoguang Wang, Hao Xu
Summary: Multi-label text classification is a focus of research due to its practical significance. This paper introduces the Label Prompt Multi-label Text Classification model (LP-MTC), which integrates labels into a pre-trained language model and optimizes it with Masked Language Models. This approach effectively captures correlations among labels and improves model performance. Empirical experiments demonstrate its effectiveness, with LP-MTC achieving an average 3.4% improvement in micro-F1 compared to BERT on four public datasets.
APPLIED INTELLIGENCE
(2023)
Article
Computer Science, Artificial Intelligence
Yaoqiang Xiao, Yi Li, Jin Yuan, Songrui Guo, Yi Xiao, Zhiyong Li
Summary: This paper introduces a history-based attention mechanism for the multi-label text classification task. By taking historical information into account, it explores information representations for label prediction more effectively and consistently outperforms competitive approaches.
KNOWLEDGE-BASED SYSTEMS
(2021)
Article
Computer Science, Artificial Intelligence
Huiting Liu, Geng Chen, Peipei Li, Peng Zhao, Xindong Wu
Summary: This paper proposes LELC, a multi-label text classification algorithm based on multi-layer attention and label correlation. It uses a BiGRU network for feature extraction and matrix factorization to reduce the dimension of the label space. Deep Canonical Correlation Analysis (DCCA) then couples the features with the latent space, achieving results comparable to existing methods.
Article
Computer Science, Artificial Intelligence
Katarzyna Poczeta, Miroslaw Plaza, Tomasz Michno, Maria Krechowicz, Michal Zawadzki
Summary: This paper presents a system for multi-label classification of text data in Call/Contact Centre (CC) systems. The proposed approach allows automatic routing of content to agents with different competences, which is an innovation and an advantage over existing solutions. It combines vectorization methods, dimensionality reduction methods, and a classifier based on artificial neural networks. The effectiveness of the developed method was evaluated using data from real CC systems and the Stack Overflow database, and compared with other classification methods.
APPLIED SOFT COMPUTING
(2023)
Article
Computer Science, Artificial Intelligence
Bin-Bin Jia, Min-Ling Zhang
Summary: This paper investigates a new classification framework called Multi-Dimensional Multi-Label (MDML) classification, which models objects with rich semantics by encompassing heterogeneous label spaces and multi-label annotations. A learning method named CLIM is proposed to learn from MDML training examples. CLIM induces base multi-label predictive models with respect to each label space and uses thresholding predictions to augment the original feature space and yield stacked multi-label predictive models. Experiments on real-world MDML data sets validate the effectiveness of CLIM.
PATTERN RECOGNITION
(2023)
Article
Computer Science, Artificial Intelligence
Hosein Azarbonyad, Mostafa Dehghani, Maarten Marx, Jaap Kamps
Summary: Efficiently exploiting various sources of information has a significant impact on the performance of Multi-Label Text Classification systems. Integrating all features using a learning to rank approach can result in improved performance. Additionally, the titles of documents are found to be more informative than other sources, and leveraging co-occurrence information of classes can enhance document classification accuracy.
NATURAL LANGUAGE ENGINEERING
(2021)
Article
Computer Science, Artificial Intelligence
Ce Gao, Jiangtao Ren
Article
Computer Science, Artificial Intelligence
Jiangtao Ren, Jiawei Long, Zhikang Xu
DECISION SUPPORT SYSTEMS
(2019)
Article
Computer Science, Information Systems
Chaotao Chen, Run Zhuo, Jiangtao Ren
INFORMATION SCIENCES
(2019)
Article
Computer Science, Interdisciplinary Applications
Mingwang Yin, Chengjie Mou, Kaineng Xiong, Jiangtao Ren
JOURNAL OF BIOMEDICAL INFORMATICS
(2019)
Article
Computer Science, Artificial Intelligence
Jiawei Long, Zhaopeng Chen, Weibing He, Taiyu Wu, Jiangtao Ren
APPLIED SOFT COMPUTING
(2020)
Article
Computer Science, Artificial Intelligence
Qianlong Wang, Jiangtao Ren
Summary: The article proposes a model for improving social media short text abstractive summarization by focusing on summary-aware attention. Experimental results show that the model achieved significant improvements on a popular Chinese social media dataset.
Article
Computer Science, Interdisciplinary Applications
Zhaoning Li, Jiangtao Ren
JOURNAL OF BIOMEDICAL INFORMATICS
(2020)
Article
Computer Science, Artificial Intelligence
Chengjie Mou, Jiangtao Ren
ARTIFICIAL INTELLIGENCE IN MEDICINE
(2020)
Article
Computer Science, Information Systems
Jiangtao Ren, Naiyin Liu, Xiaojing Wu
INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS
(2020)
Article
Computer Science, Artificial Intelligence
Naiyin Liu, Qianlong Wang, Jiangtao Ren
Summary: A Label-Embedding Bi-directional Attentive model is proposed in this paper to enhance the performance of BERT's text classification framework, and experimental results show notable improvements over baselines and state-of-the-art models on five datasets.
NEURAL PROCESSING LETTERS
(2021)
Article
Engineering, Biomedical
Nanya Chen, Jiangtao Ren
Summary: The use of Electronic Health Records (EHR) in medical artificial intelligence is a significant research field. However, the data quality of EHR is hindered by its primary purpose of recording patient disease information rather than research. This paper proposes an EHR data quality evaluation approach based on clinical evidence and a deep text matching model. Experimental results show that this approach can effectively distinguish high-quality EHR from low-quality EHR.
Proceedings Paper
Computer Science, Artificial Intelligence
Qianlong Wang, Jiangtao Ren
ECAI 2020: 24TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE
(2020)
Article
Computer Science, Artificial Intelligence
Ziheng Chen, Chaojie Lai, Jiangtao Ren
APPLIED SOFT COMPUTING
(2020)
Proceedings Paper
Computer Science, Artificial Intelligence
Hongli Wang, Jiangtao Ren
2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019)
(2019)