Article
Computer Science, Information Systems
Kurniawati Azizah, Wisnu Jatmiko
Summary: This study proposes a novel training strategy and speech synthesis model to address the issues of data scarcity in low-resource languages and unsatisfactory performance in zero-shot speaker adaptation. Through the use of multi-stage transfer learning and explicit style control, the proposed model successfully improves the intelligibility of synthesized speech and speaker similarity.
Article
Computer Science, Artificial Intelligence
Keqiuyin Li, Jie Lu, Hua Zuo, Guangquan Zhang
Summary: Transfer learning techniques leverage knowledge from similar domains to tackle tasks in a target domain. The proposed method in this article simultaneously learns similarities and diversities of domains to improve the transferability of latent features, aiming at enhancing the performance of the final target predictor.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2022)
Article
Computer Science, Information Systems
Sang-woo Lee, Ryong Lee, Min-seok Seo, Jong-chan Park, Hyeon-cheol Noh, Jin-gi Ju, Rae-young Jang, Gun-woo Lee, Myung-seok Choi, Dong-geol Choi
Summary: Multi-task learning (MTL) is an efficient method to tackle multiple tasks with a single model, but recent approaches struggle to outperform single-task learning. This study validates the effectiveness of MTL in low-data conditions and proposes a feature filtering module with minimal overhead. Empirical results demonstrate that MTL can significantly enhance performance under low-data conditions for all tasks.
Article
Engineering, Electrical & Electronic
Dinghao Fan, Hengjie Lu, Shugong Xu, Shan Cao
Summary: This study introduces an end-to-end multi-task learning framework that utilizes depth modality to enhance the accuracy of gesture recognition. Experimental results demonstrate that the proposed method outperforms existing gesture recognition frameworks on three public datasets, and that it also improves accuracy when applied to other 2D CNN-based frameworks.
IEEE SENSORS JOURNAL
(2021)
Article
Acoustics
Mao-Kui He, Jun Du, Qing-Feng Liu, Chin-Hui Lee
Summary: This paper proposes a neural speaker diarization (NSD) network architecture that improves speaker separation through multiple key components. The proposed method outperforms other techniques in realistic operating scenarios.
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING
(2023)
Article
Computer Science, Artificial Intelligence
Yuzhe Ma, Xufeng Yao, Ran Chen, Ruiyu Li, Xiaoyong Shen, Bei Yu
Summary: Domain adaptation is a promising method to reduce data labeling costs in the era of deep learning. This study integrates partial domain adaptation and model compression into a unified training process. By minimizing a differentiable soft-weighted maximum mean discrepancy, it reduces cross-domain distribution divergence and compresses overparameterized models using gradient statistics.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2022)
Article
Computer Science, Information Systems
Yida Zhu, Haiyong Luo, Song Guo, Fang Zhao
Summary: Human activity recognition (HAR) based on wearable sensors is a popular research topic. Obtaining labeled human activity data for different body-worn positions is expensive and labor-intensive, leading to poor performance of HAR models on different body positions. In this article, we propose a deep multiscale transfer learning (DMSTL) model for accurate HAR with low labeling cost. The model includes an unsupervised source selection method, a multiscale spatial-temporal Net (MSSTNet) for comprehensive multimodal representations, and category-level adaptation and domain-level adversarial modules for learning domain-invariant features. Experimental results on three public HAR datasets show that DMSTL outperforms other baselines.
IEEE INTERNET OF THINGS JOURNAL
(2023)
Article
Ecology
Ali Seydi Keceli, Aydin Kaya, Cagatay Catal, Bedir Tekinerdogan
Summary: The manual prediction of plant species and diseases is costly and time-consuming, and expertise may not always be available. Automated approaches, such as machine learning and deep learning, are being used to overcome these challenges. This study proposes a novel multi-task learning approach that combines plant species and disease prediction tasks using shared representations. The results show that this approach improves efficiency and learning speed.
ECOLOGICAL INFORMATICS
(2022)
Article
Biology
Weibai Pan, Ying An, Yuxia Guan, Jianxin Wang
Summary: This paper proposes a multi-task channel attention network (MCA-net) for myocardial infarction (MI) detection and location using 12-lead electrocardiograms (ECGs). By integrating features from different leads and introducing a multi-task framework, the MCA-net outperforms state-of-the-art methods in terms of accuracy. It effectively assists cardiologists in diagnosing and locating MI.
COMPUTERS IN BIOLOGY AND MEDICINE
(2022)
Article
Computer Science, Artificial Intelligence
Lu Wen, Jianghong Xiao, Jie Zeng, Chen Zu, Xi Wu, Jiliu Zhou, Xingchen Peng, Yan Wang
Summary: Recently, deep learning has made significant progress in automating radiation therapy planning and improving its quality and efficiency. However, this progress requires a large amount of clinical data. For low-incidence cancers like cervical cancer, where limited data is available, current data-hungry deep models fail to achieve satisfactory performance. In this paper, we propose a transfer learning approach to transfer knowledge from rectal cancer to cervical cancer for dose map prediction. Our method uses a two-phase paradigm and two specialized modules to overcome the negative transfer problem and achieve strong performance.
PATTERN RECOGNITION
(2023)
Article
Computer Science, Interdisciplinary Applications
Riaan Zoetmulder, Efstratios Gavves, Matthan Caan, Henk Marquering
Summary: This study evaluates the influence of different source task and domain combinations on the performance of transfer learning-based medical image segmentation tasks. The results show that CNNs pre-trained on a segmentation task on the same domain as the target tasks have higher or similar segmentation accuracy. In addition, pre-training CNNs on ImageNet does not necessarily result in higher lesion detection rates.
COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE
(2022)
Article
Computer Science, Artificial Intelligence
Ram Krishn Mishra, Siddhaling Urolagin, J. Angel Arul Jothi, Pramod Gaur
Summary: Image processing is a technique used to apply various operations to images to improve them or extract information, with facial recognition being a prominent application. This study examines the accuracy of categorizing human facial expressions using deep learning and transfer learning methods, proposing a deep hybrid learning approach that combines multiple deep learning models.
IMAGE AND VISION COMPUTING
(2022)
Article
Computer Science, Artificial Intelligence
Zhongfeng Kang, Bo Yang, Mads Nielsen, Lihui Deng, Shantian Yang
Summary: Online transfer learning (OTL) is a method for handling transfer learning tasks where target domain data arrives in an online manner. However, existing OTL algorithms rely on shallow models and use only the latest instances. To overcome these limitations, this paper proposes a buffered online transfer learning (BOTL) algorithm that utilizes deep learning models and incorporates previously arrived instances.
Article
Computer Science, Artificial Intelligence
S. Nazmi Diker, C. Okan Sakar
Summary: This paper introduces a new application for text-to-SQL studies where users can create database models from natural language, and presents the first dataset for this task. The authors propose a framework consisting of three modular components to predict column data types and constraints, establish foreign key relationships between tables, and generate CREATE queries. They evaluate various baseline models and demonstrate the importance of contextualized word representations in classifying column data types and constraints. The experiments also show that a multi-task BERT model effectively addresses the training time and model size issues.
KNOWLEDGE-BASED SYSTEMS
(2023)
Article
Computer Science, Information Systems
Xiaoxi He, Xu Wang, Zimu Zhou, Jiahang Wu, Zheng Yang, Lothar Thiele
Summary: Future mobile devices are expected to have the ability to perceive, understand, and react to the world independently using deep neural networks. However, these models need to be compressed to fit in mobile storage and memory. This work proposes Multi-Task Zipping (MTZ), a framework that automatically merges pre-trained deep neural networks to reduce redundancy across multiple models. MTZ achieves this through layer-wise neuron sharing and weight updating schemes, which result in minimal changes to the error function. Evaluations show that MTZ effectively merges networks with minimal increase in test errors and significantly reduces the number of iterations required for retraining. It also improves the latency for switching between different tasks on memory-constrained devices.
IEEE TRANSACTIONS ON MOBILE COMPUTING
(2023)
Article
Engineering, Electrical & Electronic
Bo Wu, Kehuang Li, Fengpei Ge, Zhen Huang, Minglei Yang, Sabato Marco Siniscalchi, Chin-Hui Lee
IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING
(2017)
Article
Computer Science, Information Systems
Ju Lin, Wei Li, Yingming Gao, Yanlu Xie, Nancy F. Chen, Sabato Marco Siniscalchi, Jinsong Zhang, Chin-Hui Lee
JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY
(2018)
Article
Acoustics
Jun Qi, Jun Du, Sabato Marco Siniscalchi, Chin-Hui Lee
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING
(2019)
Article
Acoustics
Wei Li, Nancy F. Chen, Sabato Marco Siniscalchi, Chin-Hui Lee
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING
(2019)
Article
Computer Science, Artificial Intelligence
Tassadaq Hussain, Sabato Marco Siniscalchi, Hsiao-Lan Sharon Wang, Yu Tsao, Valerio Mario Salerno, Wen-Hung Liao
IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS
(2020)
Article
Computer Science, Artificial Intelligence
Vincenzo Conti, Leonardo Rundo, Carmelo Militello, Valerio Mario Salerno, Salvatore Vitabile, Sabato Marco Siniscalchi
Summary: Recent developments in information technologies require robust and reliable authentication systems. This study proposes a novel multimodal biometric system that combines iris and retina traits. Tests on different combinations of biometric databases showed that the multimodal retina-iris approach outperformed unimodal systems, indicating its potential as an authentication framework based on multiple static biometric traits.
Article
Acoustics
Ivan Kukanov, Trung Ngo Trong, Ville Hautamaki, Sabato Marco Siniscalchi, Valerio Mario Salerno, Kong Aik Lee
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING
(2020)
Article
Engineering, Electrical & Electronic
Jun Qi, Jun Du, Sabato Marco Siniscalchi, Xiaoli Ma, Chin-Hui Lee
IEEE SIGNAL PROCESSING LETTERS
(2020)
Proceedings Paper
Computer Science, Software Engineering
Tassadaq Hussain, Yu Tsao, Hsin-Min Wang, Jia-Ching Wang, Sabato Marco Siniscalchi, Wen-Hung Liao
2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC)
(2019)
Proceedings Paper
Acoustics
Wei Li, Sicheng Wang, Ming Lei, Sabato Marco Siniscalchi, Chin-Hui Lee
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)
(2019)
Proceedings Paper
Acoustics
Zhen Huang, Xiaodan Zhuang, Daben Liu, Xiaoqiang Xiao, Yuchen Zhang, Sabato Marco Siniscalchi
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)
(2019)
Proceedings Paper
Acoustics
Wei Li, Nancy F. Chen, Sabato Marco Siniscalchi, Chin-Hui Lee
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)
(2018)
Proceedings Paper
Acoustics
Sicheng Wang, Kehuang Li, Zhen Huang, Sabato Marco Siniscalchi, Chin-Hui Lee
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)
(2017)
Proceedings Paper
Engineering, Electrical & Electronic
Bo Wu, Kehuang Li, Zhen Huang, Sabato Marco Siniscalchi, Minglei Yang, Chin-Hui Lee
2017 HANDS-FREE SPEECH COMMUNICATIONS AND MICROPHONE ARRAYS (HSCMA 2017)
(2017)
Article
Engineering, Electrical & Electronic
Jun Qi, Jun Du, Sabato Marco Siniscalchi, Xiaoli Ma, Chin-Hui Lee
IEEE TRANSACTIONS ON SIGNAL PROCESSING
(2020)
Article
Computer Science, Artificial Intelligence
Rui Lv, Dingheng Wang, Jiangbin Zheng, Zhao-Xu Yang
Summary: In this paper, the authors investigate tensor decomposition for neural network compression. They analyze the convergence and precision of tensor mapping theory and, based on the Lottery Ticket Hypothesis, validate the rationality of tensor mapping and its superiority over traditional tensor approximation. They propose an efficient method called 3D-KCPNet to compress 3D convolutional neural networks using the Kronecker canonical polyadic (KCP) tensor decomposition. Experimental results show that 3D-KCPNet achieves higher accuracy than the original baseline model and the corresponding tensor approximation model.
Article
Computer Science, Artificial Intelligence
Xiangkun He, Zhongxu Hu, Haohan Yang, Chen Lv
Summary: In this paper, a novel constrained multi-objective reinforcement learning algorithm is proposed for personalized end-to-end robotic control with continuous actions. The approach trains a single model using constraint design and a comprehensive index to achieve optimal policies based on user-specified preferences.
Article
Computer Science, Artificial Intelligence
Zhijian Zhuo, Bilian Chen, Shenbao Yu, Langcai Cao
Summary: In this paper, a novel method called Expansion with Contraction Method for Overlapping Community Detection (ECOCD) is proposed, which utilizes non-negative matrix factorization to obtain disjoint communities and applies expansion and contraction processes to adjust the degree of overlap. ECOCD is applicable to various networks with different properties and achieves high-quality overlapping community detection.
Article
Computer Science, Artificial Intelligence
Yizhe Zhu, Chunhui Zhang, Jialin Gao, Xin Sun, Zihan Rui, Xi Zhou
Summary: In this work, the authors propose a Contrastive Spatio-Temporal Distilling (CSTD) approach to improve the detection of highly compressed deepfake videos. The approach leverages spatial-frequency cues and temporal-contrastive alignment to fully exploit spatiotemporal inconsistency information.
Review
Computer Science, Artificial Intelligence
Laijin Meng, Xinghao Jiang, Tanfeng Sun
Summary: This paper provides a review of coverless steganographic algorithms, including the development process, known contributions, and general issues in image and video algorithms. It also discusses the security of coverless steganography from theoretical analysis to actual investigation for the first time.
Article
Computer Science, Artificial Intelligence
Yajie Bao, Tianwei Xing, Xun Chen
Summary: Visual question answering requires processing multi-modal information and effective reasoning. Neural-symbolic learning is a promising method, but current approaches lack uncertainty handling and can only provide a single answer. To address this, we propose a confidence-based neural-symbolic approach that evaluates neural network inferences and conducts reasoning based on confidence.
Article
Computer Science, Artificial Intelligence
Anh H. Vo, Bao T. Nguyen
Summary: Interior style classification is an interesting problem with potential applications in both commercial and academic domains. This project proposes a method named ISC-DeIT, which combines data-efficient image transformer architectures and knowledge distillation, to address the interior style classification problem. Experimental results demonstrate a significant improvement in predictive accuracy compared to other state-of-the-art methods.
Article
Computer Science, Artificial Intelligence
Shashank Kotyan, Danilo Vasconcellos Vargas
Summary: This article introduces a novel augmentation technique called Dynamic Scanning Augmentation to improve the accuracy and robustness of Vision Transformer (ViT). The technique leverages dynamic input sequences to adaptively focus on different patches, resulting in significant changes in ViT's attention mechanism. Experimental results demonstrate that ViT with Dynamic Scanning Augmentation outperforms the baseline ViT in both robustness to adversarial attacks and accuracy on natural images.
Article
Computer Science, Artificial Intelligence
Hiba Alqasir, Damien Muselet, Christophe Ducottet
Summary: The article proposes a solution to improve the learning process of a classification network by providing shape priors, reducing the need for annotated data. The solution is tested on cross-domain digit classification tasks and a video surveillance application.
Article
Computer Science, Artificial Intelligence
Dexiu Ma, Mei Liu, Mingsheng Shang
Summary: This paper proposes a method using neural dynamics solvers to solve infinity-norm optimization problems. Two improved solvers are constructed and their effectiveness and superiority are demonstrated through theoretical analysis and simulation experiments.
Article
Computer Science, Artificial Intelligence
Francesco Gregoretti, Giovanni Pezzulo, Domenico Maisto
Summary: Active Inference is a computational framework that uses probabilistic inference and variational free energy minimization to describe perception, planning, and action. cpp-AIF is a header-only C++ library that provides a powerful tool for implementing Active Inference for Partially Observable Markov Decision Processes through multi-core computing. It is cross-platform and improves performance, memory management, and usability compared to existing software.
Article
Computer Science, Artificial Intelligence
Zelin Ying, Dawei Cheng, Cen Chen, Xiang Li, Peng Zhu, Yifeng Luo, Yuqi Liang
Summary: This paper proposes a novel stock market trends prediction framework called SMART, which includes a self-supervised stock technical data sequence embedding model, S3E. Trained with multiple self-supervised auxiliary tasks, the model encodes stock technical data sequences into embeddings and uses the learned sequence embeddings to predict stock market trends. Extensive experiments on the China A-Shares and NASDAQ markets demonstrate the effectiveness of the model in stock market trends prediction, and its effectiveness is further validated in real-world applications at a leading financial service provider in China.
Article
Computer Science, Artificial Intelligence
Hao Li, Hao Jiang, Dongsheng Ye, Qiang Wang, Liang Du, Yuanyuan Zeng, Liu Yuan, Yingxue Wang, C. Chen
Summary: DHGAT, a dynamic hyperbolic graph attention network, utilizes hyperbolic metric properties to embed dynamic graphs. It employs a spatiotemporal self-attention mechanism and weighted node representations, resulting in excellent performance in link prediction tasks.
Article
Computer Science, Artificial Intelligence
Jiehui Huang, Zhenchao Tang, Xuedong He, Jun Zhou, Defeng Zhou, Calvin Yu-Chian Chen
Summary: This study proposes a progressive learning multi-scale feature blending model for image deraining tasks. The model utilizes detail dilation and texture extraction to improve the restoration of rainy images. Experimental results show that the model achieves near state-of-the-art performance in rain removal tasks and produces more realistic derained images.
Article
Computer Science, Artificial Intelligence
Lizhi Liu, Zilin Gao, Yinhe Wang, Yongfu Li
Summary: This paper proposes a novel discrete-time interconnected model for describing complex dynamical networks. The model consists of node and edge subsystems, which capture the dynamic characteristics of both nodes and edges. By designing control strategies and coupling modes, the stabilization and synchronization of the network are achieved. Simulation results demonstrate the effectiveness of the proposed methods.