Article

Sharable and Individual Multi-View Metric Learning

Journal

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2017.2749576

Keywords

Metric learning; deep learning; multi-view learning; face verification; kinship verification; person re-identification

Funding

  1. National Natural Science Foundation of China [61672306]

This paper presents a sharable and individual multi-view metric learning (MvML) approach for visual recognition. Conventional metric learning methods learn a distance metric on either a single type of feature representation or a concatenated representation of multiple types of features. In contrast, the proposed MvML jointly learns an optimal combination of multiple distance metrics on multi-view representations: it learns an individual distance metric for each view to retain that view's specific properties, and a shared representation for the different views in a unified latent subspace to preserve their common properties. The objective function of MvML is formulated in the large margin learning framework via pairwise constraints, under which the distance of each similar pair is smaller than that of each dissimilar pair by a margin. Moreover, to exploit the nonlinear structure of data points, we extend MvML to a sharable and individual multi-view deep metric learning (MvDML) method, which uses a neural network architecture to seek multiple nonlinear transformations. Experimental results on face verification, kinship verification, and person re-identification show the effectiveness of the proposed sharable and individual multi-view metric learning methods.
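The large-margin pairwise constraint described in the abstract can be illustrated with a minimal sketch. This is not the paper's actual objective or optimization procedure: the function names, the per-view weights, the use of one individual and one shared linear projection per view, and the plain hinge penalty are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]
MultiView = Sequence[Vector]  # one feature vector per view


def project(matrix: Matrix, x: Vector) -> List[float]:
    """Apply a linear projection (matrix times vector)."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def multi_view_distance(x: MultiView, y: MultiView,
                        individual: Sequence[Matrix],
                        shared: Sequence[Matrix],
                        weights: Sequence[float]) -> float:
    """Weighted sum over views of an individual-metric distance plus a
    distance in a shared latent subspace (hypothetical formulation)."""
    total = 0.0
    for k, (xk, yk) in enumerate(zip(x, y)):
        d_ind = sq_dist(project(individual[k], xk), project(individual[k], yk))
        d_shr = sq_dist(project(shared[k], xk), project(shared[k], yk))
        total += weights[k] * (d_ind + d_shr)
    return total


def large_margin_loss(similar: Sequence[Tuple[MultiView, MultiView]],
                      dissimilar: Sequence[Tuple[MultiView, MultiView]],
                      dist: Callable[[MultiView, MultiView], float],
                      margin: float = 1.0) -> float:
    """Hinge penalty whenever a similar pair is not closer than a
    dissimilar pair by at least `margin`."""
    loss = 0.0
    for xs, ys in similar:
        for xd, yd in dissimilar:
            loss += max(0.0, margin + dist(xs, ys) - dist(xd, yd))
    return loss


# Toy usage: two views, 2-D features, identity projections (assumed values).
identity = [[1.0, 0.0], [0.0, 1.0]]
dist = lambda a, b: multi_view_distance(
    a, b, [identity, identity], [identity, identity], [0.5, 0.5])

anchor = ([0.0, 0.0], [0.0, 0.0])
same = ([0.0, 0.0], [0.0, 0.0])
other = ([1.0, 0.0], [0.0, 1.0])

loss_small_margin = large_margin_loss([(anchor, same)], [(anchor, other)], dist, margin=1.0)
loss_large_margin = large_margin_loss([(anchor, same)], [(anchor, other)], dist, margin=3.0)
```

In this toy setting the dissimilar pair is already farther apart than the similar pair by more than a margin of 1, so the constraint is satisfied and contributes no loss; raising the margin to 3 makes the same pair violate the constraint. In an actual learning method the projections and view weights would be optimized to reduce this loss rather than fixed.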

Recommended

Article Computer Science, Artificial Intelligence

Egocentric Action Recognition by Automatic Relation Modeling

Haoxin Li, Wei-Shi Zheng, Jianguo Zhang, Haifeng Hu, Jiwen Lu, Jian-Huang Lai

Summary: This study proposes a weakly supervised model for egocentric action recognition, which automatically localizes interactors and establishes explicit relation models for recognition without using annotations or prior knowledge. Extensive experiments on egocentric video datasets demonstrate the effectiveness of the proposed method.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Artificial Intelligence

Deep Metric Learning With Adaptively Composite Dynamic Constraints

Wenzhao Zheng, Jiwen Lu, Jie Zhou

Summary: This paper proposes a deep metric learning method called DML-DC, which utilizes adaptively generated dynamic constraints for image retrieval and clustering. The method employs a learnable constraint generator to produce dynamic constraints and trains the metric towards better generalization. It formulates the deep metric learning objective under a proxy collection, pair sampling, tuple construction, and tuple weighting paradigm.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Artificial Intelligence

Learning Deep Binary Descriptors via Bitwise Interaction Mining

Ziwei Wang, Han Xiao, Yueqi Duan, Jie Zhou, Jiwen Lu

Summary: In this paper, we propose a GraphBit method for learning unsupervised deep binary descriptors to efficiently represent images. The method reduces the uncertainty of binary codes by maximizing the mutual information with input and related bits, allowing reliable binarization of ambiguous bits. Additionally, a differentiable search method called GraphBit+ is introduced to mine bitwise interaction in continuous space, reducing the computational cost of reinforcement learning. To address the issue of inaccurate instructions from fixed bitwise interaction, the unsupervised binary descriptor learning method D-GraphBit is proposed, which utilizes a graph convolutional network to reason the optimal bitwise interaction for each input sample.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Robotics

Planning Irregular Object Packing via Hierarchical Reinforcement Learning

Sichao Huang, Ziwei Wang, Jie Zhou, Jiwen Lu

Summary: Object packing by autonomous robots is a significant challenge in warehouses and the logistics industry. This paper proposes a deep hierarchical reinforcement learning approach to simultaneously plan the packing sequence and placement for irregular objects. The approach uses two networks, a top manager network to infer the packing sequence and a bottom worker network to predict the placement position and orientation, which are trained hierarchically in a self-supervised Q-Learning framework.

IEEE ROBOTICS AND AUTOMATION LETTERS (2023)

Article Computer Science, Artificial Intelligence

Content-Aware Warping for View Synthesis

Mantang Guo, Junhui Hou, Jing Jin, Hui Liu, Huanqiang Zeng, Jiwen Lu

Summary: This paper proposes a content-aware warping method that adaptively learns the interpolation weights for pixels from their contextual information via a lightweight neural network. Based on this learnable warping module, a new end-to-end learning-based framework is proposed for novel view synthesis, which includes two additional modules to address occlusion and spatial correlation issues. Experimental results demonstrate that the proposed method outperforms state-of-the-art methods both quantitatively and visually.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Artificial Intelligence

Depth-Guided Optimization of Neural Radiance Fields for Indoor Multi-View Stereo

Yi Wei, Shaohui Liu, Jie Zhou, Jiwen Lu

Summary: In this work, a new multi-view depth estimation method called NerfingMVS is presented, which combines conventional reconstruction and learning-based priors with neural radiance fields (NeRF). It directly optimizes over implicit volumes, eliminating the need for pixel matching in indoor scenes. The key is using learning-based priors to guide the optimization process of NeRF. The proposed method achieves state-of-the-art performance and improves rendering quality on both seen and novel views.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Artificial Intelligence

GFNet: Global Filter Networks for Visual Recognition

Yongming Rao, Wenliang Zhao, Zheng Zhu, Jie Zhou, Jiwen Lu

Summary: We present GFNet, a conceptually simple yet computationally efficient architecture that learns long-term spatial dependencies in the frequency domain. GFNet outperforms Transformer-based models and CNNs in terms of efficiency, generalization ability, and robustness. We provide a series of isotropic and hierarchical models based on GFNet design.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Artificial Intelligence

LRRNet: A Novel Representation Learning Guided Fusion Network for Infrared and Visible Images

Hui Li, Tianyang Xu, Xiao-Jun Wu, Jiwen Lu, Josef Kittler

Summary: Deep learning based fusion methods have achieved promising performance in image fusion tasks due to the importance of network architecture. However, designing fusion networks is still a challenging task. In this paper, the fusion task is mathematically formulated and a connection between the optimal solution and network architecture is established. This leads to the proposal of a lightweight fusion network based on a learnable representation approach.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Artificial Intelligence

Dynamic Spatial Sparsification for Efficient Vision Transformers and Convolutional Neural Networks

Yongming Rao, Zuyan Liu, Wenliang Zhao, Jie Zhou, Jiwen Lu

Summary: In this paper, a new approach for model acceleration by exploiting spatial sparsity in visual data is presented. A dynamic token sparsification framework is proposed, which prunes redundant tokens progressively and dynamically based on the input to accelerate vision Transformers. The framework extends to hierarchical models and more complex dense prediction tasks, offering a new and more effective dimension for model acceleration. Promising results are achieved on various architectures and visual tasks, demonstrating the effectiveness of the proposed framework.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Automation & Control Systems

Adaptively Weighted k-Tuple Metric Network for Kinship Verification

Sheng Huang, Jingkai Lin, Luwen Huangfu, Yun Xing, Junlin Hu, Daniel Dajun Zeng

Summary: Facial image-based kinship verification is a rapidly growing field in computer vision and biometrics. This study proposes a novel deep learning model called AWk-TMN that leverages high-order cross-pair features to enhance the performance of kinship verification.

IEEE TRANSACTIONS ON CYBERNETICS (2023)

Article Computer Science, Artificial Intelligence

Discrepancy-Aware Meta-Learning for Zero-Shot Face Manipulation Detection

Bingyao Yu, Xiu Li, Wanhua Li, Jie Zhou, Jiwen Lu

Summary: In this paper, a discrepancy-aware meta-learning approach for zero-shot face manipulation detection is proposed. The approach aims to learn a discriminative model that maximizes generalization to unseen face manipulation attacks with the guidance of the discrepancy map. Unlike existing methods, the detection of face manipulation is defined as a zero-shot problem, where algorithmic solutions are presented for known face manipulation attacks. The learning process is formulated as meta-learning and zero-shot face manipulation tasks are generated to learn diversified attack meta-knowledge. The discrepancy map is utilized to focus the model on generalized optimization directions during meta-learning, and a center loss is incorporated to better guide the model in exploring more effective meta-knowledge. Experimental results on widely used face manipulation datasets demonstrate competitive performance under the zero-shot setting.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2023)

Article Computer Science, Information Systems

Seeing Through Darkness: Visual Localization at Night via Weakly Supervised Learning of Domain Invariant Features

Bin Fan, Yuzhu Yang, Wensen Feng, Fuchao Wu, Jiwen Lu, Hongmin Liu

Summary: This paper proposes an adversarial learning based solution to extract robust local features and descriptions across day-night images. By training a discriminator to distinguish day and night images and adjusting the feature extraction network to fool the discriminator, the network can extract domain invariant keypoints and descriptors. Compared to existing methods, this approach only requires additional easily captured night images to improve the domain invariance of learned features.

IEEE TRANSACTIONS ON MULTIMEDIA (2023)

Article Computer Science, Artificial Intelligence

Quantformer: Learning Extremely Low-Precision Vision Transformers

Ziwei Wang, Changyuan Wang, Xiuwei Xu, Jie Zhou, Jiwen Lu

Summary: In this article, the authors propose Quantformer, a type of extremely low-precision vision transformer for efficient inference. They address the limitations of conventional network quantization methods by considering the properties of transformer architectures and implementing capacity-aware distribution and group-wise discretization strategies. Experimental results show that Quantformer outperforms state-of-the-art methods in image classification and object detection across various vision transformer architectures. The authors also integrate Quantformer with mixed-precision quantization to further enhance performance.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Theory & Methods

Estimating Fingerprint Pose via Dense Voting

Yongjie Duan, Jianjiang Feng, Jiwen Lu, Jie Zhou

Summary: In this study, a fusion of voting strategy and deep network is proposed to estimate fingerprint center and direction. Experimental results show that this approach can achieve consistent fingerprint pose estimations, improve performance of fingerprint indexing and verification, and be robust to different sensing technologies and impression types.

IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY (2023)

Article Computer Science, Artificial Intelligence

STAR-FC: Structure-Aware Face Clustering on Ultra-Large-Scale Graphs

Shuai Shen, Wanhua Li, Zheng Zhu, Jie Zhou, Jiwen Lu

Summary: This paper proposes a new face clustering method, called STructure-AwaRe Face Clustering (STAR-FC), which addresses the dilemma of large-scale training and efficient inference by designing a structure-preserving subgraph sampling strategy and a novel hierarchical GCN training paradigm. During inference, the STAR-FC performs efficient full-graph clustering with two steps: graph parsing and graph refinement, and introduces the concept of node intimacy to mine the local structural information. The experimental results demonstrate that this method achieves superior performance and efficiency.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)