Article
Computer Science, Artificial Intelligence
Hao Pan, Jun Huang
Summary: This paper proposes a novel semantic-enhanced discriminative embedding learning method to improve the discriminative ability of cross-modal retrieval models. The method consists of three modules: attention-guided erasing, large-scale negative sampling, and weighted InfoNCE loss. Experimental results demonstrate the effectiveness of integrating these modules into existing models.
INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL
(2022)
Article
Engineering, Civil
Rushi Lan, Yu Tan, Xiaoqin Wang, Zhenbing Liu, Xiaonan Luo
Summary: This paper proposes a novel supervised hashing method, LGDH, which simultaneously preserves the comprehensive manifold structure and discriminative balanced codes in the Hamming space. By utilizing local category distribution and label-guided matrix construction, LGDH improves the discriminative power and balance of hash codes. Extensive experiments show that LGDH outperforms other methods in cross-modal tasks.
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS
(2022)
Article
Computer Science, Artificial Intelligence
Ge Song, Xiaoyang Tan, Jun Zhao, Ming Yang
Summary: RMSH is designed for more accurate multi-label cross-modal retrieval, addressing modality discrepancies and noise through fine-grained similarity of rich semantics and a robust margin-adaptive triplet loss. The effective bounds derived from an information coding-theoretic analysis enable the method to achieve state-of-the-art performance on multiple benchmarks.
PATTERN RECOGNITION
(2021)
Article
Computer Science, Artificial Intelligence
Kranti Kumar Parida, Gaurav Sharma
Summary: This paper addresses the problem of learning a shared representation space for cross-modal retrieval. The authors propose Discriminative Semantic Transitive Consistency, which ensures that data points are still classified correctly after being transferred to another modality. They also enforce a traditional distance-minimizing constraint and incorporate semantic cycle-consistency. Empirical studies show improved performance, and qualitative results support the proposals.
COMPUTER VISION AND IMAGE UNDERSTANDING
(2022)
Article
Computer Science, Artificial Intelligence
Xitao Zou, Song Wu, Erwin M. Bakker, Xinzhi Wang
Summary: In this paper, a novel multi-label enhancement based self-supervised deep cross-modal hashing approach is proposed to capture semantic affinity more accurately and avoid noise in modalities, achieving state-of-the-art performance in cross-modal hashing retrieval applications.
Article
Computer Science, Artificial Intelligence
XianHua Zeng, Ke Xu, YiCai Xie
Summary: With the rapid development of big data and the Internet, cross-modal retrieval has become a popular research topic. Cross-modal hashing is an important research direction in cross-modal retrieval, and recent unsupervised methods have achieved great results. However, narrowing the heterogeneous gap between different modalities and generating more discriminative hash codes remain the main challenges. In this paper, we propose a novel unsupervised cross-modal hashing method called Pseudo-label Driven Deep Hashing to address these challenges. Experimental results demonstrate the superiority of our method compared to several unsupervised cross-modal hashing methods.
INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS
(2023)
Article
Engineering, Electrical & Electronic
Chunpu Sun, Huaxiang Zhang, Li Liu, Dongmei Liu, Lin Wang
Summary: In this paper, a novel Multi-label Adversarial Fine-grained Cross-modal Retrieval Based on Transformer (MLAT) method is proposed to bridge the semantic gap and eliminate modality-specific features. The method constructs a semantic consistency enhanced module and a multi-stage adversarial learning module to optimize feature representations.
SIGNAL PROCESSING-IMAGE COMMUNICATION
(2023)
Article
Engineering, Electrical & Electronic
Xitao Zou, Xinzhi Wang, Erwin M. Bakker, Song Wu
Summary: This paper introduces a deep cross-modal hashing method based on multi-label semantics preservation, aiming to improve the accuracy of hashing retrieval by leveraging multiple labels of training data. Experimental results demonstrate that the proposed method outperforms prominent baselines and achieves state-of-the-art performance in cross-modal hashing retrieval.
SIGNAL PROCESSING-IMAGE COMMUNICATION
(2021)
Article
Computer Science, Artificial Intelligence
Xiaohan Yang, Zhen Wang, Wenhao Liu, Xinyi Chang, Nannan Wu
Summary: In recent years, researchers have been using hashing algorithms to improve the efficiency of large-scale cross-modal retrieval by mapping features into binary codes. However, existing cross-modal hashing algorithms often overlook the multi-label information by focusing only on single class labels. To address this issue, we propose DAMCH, a deep adversarial multi-label cross-modal hashing algorithm that considers both multi-label and deep features. Our algorithm preserves the Hamming neighbor relationship and ensures the same semantic information in binary features as in the original label. Additionally, our algorithm minimizes information loss during feature mapping and ensures consistent feature distribution across modalities. Experimental results show that DAMCH outperforms state-of-the-art methods.
INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL
(2023)
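Several of the hashing entries in this list rest on the same retrieval step: items from each modality are mapped to binary codes, and candidates are ranked by Hamming distance to the query code. The following is a minimal sketch of that ranking step only, not of DAMCH or any other listed method. The function names and the 8-bit codes are hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: List[int]) -> List[Tuple[int, int]]:
    """Return (index, distance) pairs sorted by distance to the query code.

    Python's sort is stable, so ties keep their database order.
    """
    scored = [(i, hamming_distance(query, code)) for i, code in enumerate(database)]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical 8-bit codes: one "image" query against three "text" codes.
query_code = 0b10110010
text_codes = [0b10110011, 0b01001101, 0b10100010]
ranking = rank_by_hamming(query_code, text_codes)
```

Here the first and third text codes each differ from the query in a single bit, while the second differs in all eight, so it is ranked last.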
Article
Computer Science, Artificial Intelligence
Bo Liu, Zhiyong Che, Kejian Song, Yanshan Xiao
Summary: The paper introduces ADML, a multi-label learning method that combines analytical discrimination dictionary learning with sparse representation for multi-label classification.
APPLIED INTELLIGENCE
(2022)
Article
Computer Science, Artificial Intelligence
Peng Hu, Xi Peng, Hongyuan Zhu, Jie Lin, Liangli Zhen, Wei Wang, Dezhong Peng
Summary: Cross-modal retrieval is a challenging task that aims to bridge the heterogeneous gap between different modalities. To address this challenge, the authors propose the Cross-modal Discriminant Adversarial Network (CAN), which comprises modality-specific generators, modality-specific discriminators, and a Cross-modal Discriminant Mechanism (CDM).
PATTERN RECOGNITION
(2021)
Article
Computer Science, Information Systems
Yiling Wu, Shuhui Wang, Guoli Song, Qingming Huang
Summary: This paper proposes a cross-modal retrieval method that employs augmented adversarial training to align data from different modalities by incorporating more semantic relevant and irrelevant sample pairs, which improves the alignment effectiveness. Extensive experiments demonstrate the promising power of the approach compared with state-of-the-art methods.
IEEE TRANSACTIONS ON MULTIMEDIA
(2021)
Article
Computer Science, Artificial Intelligence
Jun Long, Longzhi Sun, Lin Guo, Liujie Hua, Zhan Yang
Summary: Hashing technologies are widely used for efficient retrieval and storage in information retrieval tasks. However, most current supervised learning methods only utilize labels to construct a binary similarity matrix, ignoring the rich semantic information contained in the labels. This paper proposes a flexible two-step label embedding hashing method called LESGH, which effectively leverages label information and addresses the time consumption and scalability issues for large-scale data.
Article
Computer Science, Information Systems
Yicai Xie, Xianhua Zeng, Tinghua Wang, Yun Yi
Summary: A novel method called ODHUC is proposed for both uni-modal and cross-modal retrieval. It adopts an online deep hashing approach to continuously learn hash codes by sampling and updating the model. ODHUC also avoids forgetting old knowledge through knowledge distillation. Experimental results demonstrate that ODHUC outperforms other methods.
INFORMATION SCIENCES
(2022)
Article
Computer Science, Artificial Intelligence
Donglin Zhang, Xiao-Jun Wu, Jun Yu
Summary: This study proposes DSPH, a novel cross-modal hashing approach that generates more discriminative hash codes by preserving intra- and inter-modality structure and improving local geometric consistency. Extensive experimental results demonstrate that the proposed algorithm outperforms several state-of-the-art cross-media retrieval methods.
PATTERN ANALYSIS AND APPLICATIONS
(2021)
Article
Computer Science, Artificial Intelligence
Kaihao Zhang, Dongxu Li, Wenhan Luo, Wenqi Ren, Wei Liu
Summary: Video deraining is an important task in computer vision. In this paper, we propose a new end-to-end video deraining framework called ESTINet, which utilizes deep residual networks and convolutional long short-term memory to enhance the quality and speed of video deraining.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)
Article
Computer Science, Artificial Intelligence
Xue Yang, Junchi Yan, Wenlong Liao, Xiaokang Yang, Jin Tang, Tao He
Summary: In this paper, we introduce the idea of denoising to object detection, enhancing the ability to detect small and cluttered objects. We also address the boundary problem caused by rotation variation by adding an IoU constant factor to the smooth L1 loss. Combining these features, our proposed detector, SCRDet++, shows effectiveness in extensive experiments on various datasets.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)
Article
Engineering, Electrical & Electronic
Han Huang, Li Shen, Chaoyang He, Weisheng Dong, Wei Liu
Summary: This paper proposes a novel differentiable Neural Architecture Search (NAS) approach for searching lightweight single image super-resolution models. Experimental results show that the proposed method achieves state-of-the-art performance in terms of PSNR, SSIM, and model complexity.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
(2023)
Editorial Material
Computer Science, Artificial Intelligence
Junchi Yan, Hongteng Xu, Liangda Li, Mehrdad Farajtabar, Xiaokang Yang
Summary: This article discusses two common forms of temporal data: synchronized temporal data and asynchronous event data. Previous approaches often convert event data into time series data, but it is more meaningful to directly establish models based on raw event data, especially for time-sensitive tasks. The theme of this article is the development of spatio-temporal point processes and their related applications, which treat an event as a point in the spatio-temporal space and capture the instantaneous happening rate of events and their potential dependency. Use cases include future event prediction and causality estimation.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Article
Engineering, Electrical & Electronic
Rong-Cheng Tu, Jie Jiang, Qinghong Lin, Chengfei Cai, Shangxuan Tian, Hongfa Wang, Wei Liu
Summary: In this paper, the authors propose a novel unsupervised cross-modal hashing method (UCHM) that utilizes a modality-interaction-enabled similarity generator and a bit-selection module to improve the retrieval performance of unlabeled cross-modal data.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
(2023)
Article
Computer Science, Artificial Intelligence
Xiaolong Yang, Xiaohong Jia, Dihong Gong, Dong-Ming Yan, Zhifeng Li, Wei Liu
Summary: This paper proposes a completely integrated embedded end-to-end Lie algebra residual architecture (LARNeXt) for achieving pose robust face recognition. By exploring the impact of face rotation on the deep feature generation process of convolutional neural networks (CNNs), the authors design critical subnets to estimate pose efficiently and control the strength of the residual component. Extensive experimental evaluations demonstrate the superiority of their method over state-of-the-art ones.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)
Article
Computer Science, Artificial Intelligence
Zhongpai Gao, Junchi Yan, Guangtao Zhai, Juyong Zhang, Xiaokang Yang
Summary: This article introduces a local structure-aware anisotropic convolutional operation (LSA-Conv) for 3-D shapes, which learns adaptive weighting matrices for each template's node and performs shared anisotropic filters. Comprehensive experiments demonstrate that the model achieves significant improvement in 3-D shape reconstruction.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Article
Computer Science, Artificial Intelligence
Tianqi Zhang, Qitian Wu, Junchi Yan
Summary: In this article, the authors study why high-order graph convolution schemes are able to learn structure-aware representations. They propose a new adaptive feature combination method, inspired by the squeeze-and-excitation module, that effectively captures the interdistance relationship and achieves a significant performance gain.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Article
Automation & Control Systems
Wenxuan Guo, Hui-Ling Zhen, Xijun Li, Wanqian Luo, Mingxuan Yuan, Yaohui Jin, Junchi Yan
Summary: This paper reviews recent literature on using machine learning techniques to solve the Boolean satisfiability problem (SAT), a classic NP-complete problem. The rapid advancements in the field of machine learning over the past decade have inspired many researchers to apply machine learning methods for SAT solving. The paper examines the evolution of ML SAT solvers, from naive classifiers with handcrafted features to emerging end-to-end SAT solvers, as well as recent progress in combining existing conflict-driven clause learning (CDCL) and local search solvers with machine learning methods. The paper concludes by addressing the limitations of current works and suggesting possible future directions. The collected paper list is available at https://github.com/ThinklabSJTU/awesome-ml4co.
MACHINE INTELLIGENCE RESEARCH
(2023)
Article
Computer Science, Artificial Intelligence
Longyuan Li, Junchi Yan, Yunhao Zhang, Jihai Zhang, Jie Bao, Yaohui Jin, Xiaokang Yang
Summary: This paper proposes the RNN-ODE collaborative model for joint modeling and forecasting of heterogeneous time-series and event sequence data, combining Bayesian and deep learning techniques for interpretability. Experimental results demonstrate the competitive forecasting performance of both time-series and event sequences compared to existing methods.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
(2023)
Article
Computer Science, Artificial Intelligence
Li Shen, Congliang Chen, Fangyu Zou, Zequn Jie, Ju Sun, Wei Liu
Summary: Integrating adaptive learning rate and momentum techniques into stochastic gradient descent has led to various efficient adaptive stochastic algorithms such as AdaGrad, RMSProp, Adam, and AccAdaGrad. This paper proposes a weighted AdaGrad algorithm called AdaUSM that incorporates a unified momentum scheme and a novel weighted adaptive learning rate. It shows that AdaUSM achieves $\mathcal{O}(\log(T)/\sqrt{T})$ convergence rate in the nonconvex stochastic setting with polynomially growing weights. Furthermore, it provides a new perspective for understanding Adam and RMSProp by showing that their adaptive learning rates correspond to exponentially growing weights in AdaUSM. Comparative experiments on deep learning models and datasets are also conducted.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Article
Computer Science, Artificial Intelligence
Haiwen Diao, Ying Zhang, Wei Liu, Xiang Ruan, Huchuan Lu
Summary: Researchers have developed two efficient regulators, namely the Recurrent Correspondence Regulator (RCR) and the Recurrent Aggregation Regulator (RAR), which enhance the flexibility of correspondence and emphasize important alignments in image-text matching. These regulators can be easily incorporated into existing frameworks and have shown significant improvements in various models.
IEEE TRANSACTIONS ON IMAGE PROCESSING
(2023)
Article
Computer Science, Artificial Intelligence
Xue Yang, Gefan Zhang, Xiaojiang Yang, Yue Zhou, Wentao Wang, Jin Tang, Tao He, Junchi Yan
Summary: Existing detection methods that parameterize a rotated object by a bounding box and rotation angle have limitations for high-precision rotation detection. We propose modeling rotated objects as Gaussian distributions, with a regression loss based on the Kullback-Leibler Divergence so that the loss aligns with the detection performance metric. Our approach resolves the boundary discontinuity and square-like problems, and uses an efficient Gaussian metric-based label assignment strategy for improved performance. Analysis of the gradients of the BBox parameters shows that they have an interpretable physical meaning, which explains the effectiveness of our approach. An extension to 3D with tailored algorithm design further enhances performance, as demonstrated on various datasets.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)
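The Gaussian modeling described in the entry above can be illustrated with a small worked computation. This is a sketch under stated assumptions, not the authors' implementation: it assumes a box (cx, cy, w, h, theta) maps to a 2-D Gaussian with mean (cx, cy) and covariance R diag(w²/4, h²/4) Rᵀ, and it evaluates only the raw closed-form KL divergence between two such Gaussians. Any further transformation a detector applies to turn this divergence into a loss is omitted. The names `box_to_gaussian` and `kl_divergence` are hypothetical.

```python
import math
from typing import Tuple

# (cx, cy, w, h, theta) with theta in radians
Box = Tuple[float, float, float, float, float]


def box_to_gaussian(box: Box):
    """Map a rotated box to (mean, covariance) of a 2-D Gaussian.

    Assumed parameterization: covariance = R diag(w^2/4, h^2/4) R^T,
    returned as the three distinct entries (sxx, sxy, syy).
    """
    cx, cy, w, h, theta = box
    a, b = w * w / 4.0, h * h / 4.0
    c, s = math.cos(theta), math.sin(theta)
    sxx = a * c * c + b * s * s
    sxy = (a - b) * c * s
    syy = a * s * s + b * c * c
    return (cx, cy), (sxx, sxy, syy)


def kl_divergence(p: Box, q: Box) -> float:
    """Closed-form KL(N_p || N_q) for the Gaussians of two rotated boxes."""
    (mpx, mpy), (pxx, pxy, pyy) = box_to_gaussian(p)
    (mqx, mqy), (qxx, qxy, qyy) = box_to_gaussian(q)
    det_p = pxx * pyy - pxy * pxy
    det_q = qxx * qyy - qxy * qxy
    # Inverse of the 2x2 covariance of q.
    ixx, ixy, iyy = qyy / det_q, -qxy / det_q, qxx / det_q
    # trace(Sigma_q^{-1} Sigma_p)
    trace = ixx * pxx + 2.0 * ixy * pxy + iyy * pyy
    # Mahalanobis term (mu_q - mu_p)^T Sigma_q^{-1} (mu_q - mu_p)
    dx, dy = mqx - mpx, mqy - mpy
    maha = ixx * dx * dx + 2.0 * ixy * dx * dy + iyy * dy * dy
    return 0.5 * (trace + maha - 2.0 + math.log(det_q / det_p))


# The same rectangle described two ways: (w, h, theta) and (h, w, theta + pi/2).
same_box_kl = kl_divergence((0.0, 0.0, 4.0, 2.0, 0.0),
                            (0.0, 0.0, 2.0, 4.0, math.pi / 2))
# A one-unit shift along x between otherwise identical boxes.
shifted_kl = kl_divergence((0.0, 0.0, 4.0, 2.0, 0.0),
                           (1.0, 0.0, 4.0, 2.0, 0.0))
```

Under this parameterization, swapping width and height while adding π/2 to the angle yields the same covariance, so the divergence between the two descriptions is zero up to floating-point error. That is one way to see why a Gaussian representation sidesteps the boundary discontinuity of angle-based regression.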
Article
Computer Science, Artificial Intelligence
Hanchi Huang, Li Shen, Deheng Ye, Wei Liu
Summary: A novel master-slave architecture is proposed to solve the top-K combinatorial multiarmed bandits problem with nonlinear bandit feedback and diversity constraints. It significantly outperforms existing algorithms in recommendation tasks.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Article
Computer Science, Artificial Intelligence
Zhe Xu, Kun Wei, Erkun Yang, Cheng Deng, Wei Liu
Summary: This paper proposes a method called Bilateral Relation Distillation (BRD) to address the problem of weakly supervised temporal action localization. The method learns representations by jointly modeling category-level and sequence-level relations, and captures category-level relations through correlation alignment and category-aware contrast. Additionally, it utilizes a gradient-based feature augmentation method to model relations among segments at the sequence-level.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)