Journal
PATTERN RECOGNITION
Volume 111, Issue -, Pages -
Publisher: ELSEVIER SCI LTD
DOI: 10.1016/j.patcog.2020.107659
Keywords
Knowledge distillation; Data augmentation; Generative adversarial nets; Divergent examples; Image classification
Funding
- National Key Research and Development Program of China [2018YFB0804205]
- NSFC [61802104, 61932009, 61725203, 61732008]
The paper introduces Adversarial Co-distillation Networks (ACNs), a novel approach that enhances dark knowledge by generating extra divergent examples; extensive experiments demonstrate the method's effectiveness.
Knowledge distillation is an effective way to transfer the knowledge from a pre-trained teacher model to a student model. Co-distillation, as an online variant of distillation, further accelerates the training process and paves a new way to explore the dark knowledge by training n models in parallel. In this paper, we explore the divergent examples, which can make the classifiers have different predictions and thus induce the dark knowledge, and we propose a novel approach named Adversarial Co-distillation Networks (ACNs) to enhance the dark knowledge by generating extra divergent examples. Note that we do not involve any extra dataset, and we only utilize the standard training set to train the entire framework. ACNs are end-to-end frameworks composed of two parts: an adversarial phase consisting of Generative Adversarial Networks (GANs) to generate the divergent examples and a co-distillation phase consisting of multiple classifiers to learn the divergent examples. These two phases are learned in an iterative and adversarial way. To guarantee the quality of the divergent examples and the stability of ACNs, we further design Weakly Residual Connection module and Restricted Adversarial Search module to assist in the training process. Extensive experiments with various deep architectures on different datasets well demonstrate the effectiveness of our approach. (C) 2020 Elsevier Ltd. All rights reserved.
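To make the distillation objective behind this setup concrete, the sketch below implements the standard temperature-scaled soft-label loss from Hinton-style knowledge distillation in plain Python. This is a generic illustration, not the paper's exact ACN objective: the function names, logit values, and temperature are illustrative assumptions. In a co-distillation setting, each peer network would add such a term treating the other peer's predictions as the "teacher".

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of raw logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) between two discrete probability distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Soft-label loss: the student mimics the teacher's softened distribution.

    The T^2 factor keeps gradient magnitudes comparable across temperatures,
    as in the original knowledge-distillation formulation.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return (temperature ** 2) * kl_divergence(p_teacher, p_student)

# Illustrative logits for a 3-class problem (values are made up).
teacher = [2.0, 1.0, 0.1]
student = [1.5, 1.2, 0.3]
print(distillation_loss(student, teacher))  # small positive value
```

A "divergent example" in the paper's sense is precisely an input on which this loss is large, i.e. the peer classifiers disagree; the adversarial phase of ACNs searches for such inputs with a GAN rather than sampling them from the training set.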