Article
Computer Science, Artificial Intelligence
Jianping Gou, Baosheng Yu, Stephen J. Maybank, Dacheng Tao
Summary: This paper provides a comprehensive survey of knowledge distillation, covering knowledge categories, training schemes, teacher-student architectures, distillation algorithms, performance comparisons, and applications. It also briefly reviews challenges in knowledge distillation and discusses future research directions.
INTERNATIONAL JOURNAL OF COMPUTER VISION
(2021)
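The records in this list repeatedly invoke Hinton-style knowledge distillation with temperature-scaled soft targets. As a point of reference, a minimal pure-Python sketch of that classic loss (function names are illustrative, not taken from the survey):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between softened teacher and student distributions.

    Scaled by T^2 so gradients stay comparable across temperatures,
    as in the standard soft-target formulation.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return (temperature ** 2) * kl
```

In practice this term is combined with the ordinary cross-entropy on hard labels; the sketch shows only the distillation component.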
Article
Computer Science, Artificial Intelligence
Seunghyun Lee, Byung Cheol Song
Summary: Filter pruning is a representative technique for lightweighting CNNs. To broaden the usability of CNNs, filter pruning itself needs to be made lightweight. To this end, a coarse-to-fine neural architecture search (NAS) algorithm and a CKT-based fine-tuning structure are proposed.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Article
Computer Science, Artificial Intelligence
Seunghyun Lee, Byung Cheol Song
Summary: Knowledge distillation is a method to improve the performance of a student network by transferring knowledge from a teacher network. The proposed method transfers knowledge independently of the spatial shape of the teacher's feature map using singular value decomposition. Additionally, a multitask learning method is presented to effectively adjust the teacher's constraints to the student's learning speed. Experimental results show significant improvements on different datasets.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2022)
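The summary above names singular value decomposition as the tool that makes the transferred feature knowledge independent of the feature map's spatial shape. A hedged NumPy sketch of that idea (function and variable names are hypothetical, and the paper's exact formulation may differ):

```python
import numpy as np

def svd_feature_knowledge(feature_map, k=4):
    """Compress a C x H x W feature map into its top-k left singular
    vectors, so the extracted 'knowledge' has shape (C, k) regardless
    of the spatial resolution H x W."""
    c, h, w = feature_map.shape
    flat = feature_map.reshape(c, h * w)           # channels x spatial
    u, s, vt = np.linalg.svd(flat, full_matrices=False)
    return u[:, :k]                                # shape-independent summary

# Teacher and student feature maps with different spatial sizes
teacher = np.random.randn(8, 16, 16)
student = np.random.randn(8, 8, 8)
kt = svd_feature_knowledge(teacher)
ks = svd_feature_knowledge(student)
# Both summaries are (8, 4), so a distillation loss can compare them
# even though the original spatial shapes differ.
```

A distillation loss could then penalize the distance between the two summaries; the paper's multitask weighting of that constraint is not reproduced here.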
Article
Energy & Fuels
Hui Song, Nameer Al Khafaf, Ammar Kamoona, Samaneh Sadat Sajjadi, Ali Moradi Amani, Mahdi Jalili, Xinghuo Yu, Peter McTaggart
Summary: With the growing importance of renewable energy, predicting photovoltaic (PV) power generation has become crucial for power management and optimization. This paper proposes a multitasking prediction approach using recurrent neural networks (RNNs) to improve the accuracy of PV power generation prediction across different customer categories. The proposed multitasking RNN (MT-RNN) framework transfers knowledge among tasks, achieving superior performance compared to individual deep neural network (DNN) models.
Article
Computer Science, Information Systems
Chao Tan, Jie Liu
Summary: Knowledge distillation is a method to train a lightweight network by transferring class probability knowledge from a cumbersome teacher network. Several approaches have been proposed to transfer the teacher's knowledge at the feature map level.
INFORMATION SCIENCES
(2023)
Article
Computer Science, Information Systems
Chao Tan, Jie Liu
Summary: Knowledge distillation is effective for transferring knowledge, but the existing training strategy for online knowledge distillation may limit diversity among peer networks. A new strategy called KDEP is introduced to address this issue and improve the overall performance of online knowledge distillation.
INFORMATION SCIENCES
(2022)
Article
Computer Science, Artificial Intelligence
Elizabeth Irenne Yuwono, Dian Tjondonegoro, Golam Sorwar, Alireza Alaei
Summary: This paper investigates the scalability of incremental deep learning for visual recognition, specifically for fast object detection. The experimental results show that incremental learning with knowledge transfer and distillation can save storage requirements compared to training-at-once, but it increases computational time. Adjusting key parameters plays an important role in balancing the accuracy of new and old classes and reducing computational cost.
APPLIED SOFT COMPUTING
(2022)
Article
Computer Science, Artificial Intelligence
Chao Tan, Jie Liu, Xiang Zhang
Summary: Knowledge distillation is a network compression technique in which a teacher network guides a student network to mimic its behavior. This study explores how to train a good teacher, proposing an inter-class correlation regularization. Experimental results show that the method performs well on image classification tasks.
KNOWLEDGE-BASED SYSTEMS
(2021)
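The inter-class correlation regularization above is only named, not specified. One plausible ingredient is a correlation score between two classes' mean soft-probability vectors, which a regularizer could then encourage or suppress. A minimal pure-Python sketch under that reading (all names hypothetical):

```python
import math

def class_correlation(mean_probs_a, mean_probs_b):
    """Pearson correlation between two classes' mean soft-probability
    vectors (e.g., the teacher's average softened output over all
    training samples of each class)."""
    n = len(mean_probs_a)
    ma = sum(mean_probs_a) / n
    mb = sum(mean_probs_b) / n
    cov = sum((a - ma) * (b - mb)
              for a, b in zip(mean_probs_a, mean_probs_b))
    norm_a = math.sqrt(sum((a - ma) ** 2 for a in mean_probs_a))
    norm_b = math.sqrt(sum((b - mb) ** 2 for b in mean_probs_b))
    return cov / (norm_a * norm_b)
```

Identical class profiles give a correlation of 1, while classes the teacher treats as opposites score negatively; a regularization term would be built from such scores.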
Article
Computer Science, Artificial Intelligence
Ke You, Lieyun Ding, Quanli Dou, Yutian Jiang, Zhangang Wu, Cheng Zhou
Summary: Bulldozers are crucial in earthwork construction, and improving their intelligence is significant for the industry. This study proposes a hybrid method that imitates expert knowledge using modified deep convolutional neural networks and an observation dataset. The method successfully solves the observation-based expert-knowledge imitation problem.
ADVANCED ENGINEERING INFORMATICS
(2022)
Article
Multidisciplinary Sciences
Catherine F. Higham, Adrian Bedford
Summary: We demonstrate the feasibility of using a classically learned deep neural network as an energy-based model on a quantum annealer to exploit fast sampling times. We propose solutions for the challenges of high-resolution image classification on a quantum processing unit (QPU): the required number of model states and the binary nature of these states. By transferring a pretrained convolutional neural network to the QPU, we show the potential for a classification speedup of at least one order of magnitude.
SCIENTIFIC REPORTS
(2023)
Article
Biology
Yuzhen Qin, Li Sun, Hui Chen, Wenming Yang, Wei-Qiang Zhang, Jintao Fei, Guijin Wang
Summary: The aim of this study is to improve the diagnostic capabilities of single-lead ECG for multi-label disease classification by transferring disease knowledge from multi-lead ECG to a single-lead ECG interpretation model using a teacher-student approach. The study presents a new method called Contrastive Lead-information Transferring (CLT) and modifies Knowledge Distillation into Multi-label disease Knowledge Distillation (MKD) to facilitate the transfer of disease information between different views of ECG. The experiments demonstrate significant improvements in diagnostic performance for single-lead ECG.
COMPUTERS IN BIOLOGY AND MEDICINE
(2023)
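The MKD modification above adapts knowledge distillation to multi-label ECG diagnosis, where softmax-based soft targets do not apply because several diseases can be present at once. A hedged sketch of what a per-label (sigmoid) distillation loss could look like; the paper's exact form may differ, and all names here are illustrative:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def multilabel_kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Per-label binary cross-entropy between the teacher's softened
    sigmoid outputs and the student's, averaged over labels."""
    loss = 0.0
    for zs, zt in zip(student_logits, teacher_logits):
        p = sigmoid(zt / temperature)   # teacher soft target for this label
        q = sigmoid(zs / temperature)   # student prediction for this label
        loss -= p * math.log(q) + (1 - p) * math.log(1 - q)
    return loss / len(student_logits)
```

Because each label is distilled independently, the teacher can signal the simultaneous presence of multiple diseases, which a single softmax distribution cannot.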
Article
Business
Kwok Tai Chui, Varsha Arya, Shahab S. Band, Mobeen Alhalabi, Ryan Wen Liu, Hao Ran Chi
Summary: Open datasets provide researchers with authentic data for conducting research. Transfer learning algorithms enable the extraction of innovation and knowledge from homogeneous datasets of different domains, facilitating the use of machine learning models. This study proposes a multiple incremental transfer learning approach to achieve optimal results in the target model.
JOURNAL OF INNOVATION & KNOWLEDGE
(2023)
Article
Computer Science, Artificial Intelligence
Chao Tan, Jie Liu
Summary: Knowledge distillation (KD) transfers knowledge from a complex network to a lightweight one. This work selects teachers based on the standard deviation of their secondary soft probabilities, and employs pretraining under dual supervision together with an asymmetrical transformation function to enhance the dispersion of the teachers' secondary soft probabilities.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2022)
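"Secondary soft probabilities" in the record above can be read as the teacher's softened probabilities over the non-target classes, renormalized after removing the target class; their standard deviation then measures how dispersed the teacher's dark knowledge is. A minimal pure-Python sketch under that reading (function names hypothetical):

```python
import math
import statistics

def secondary_soft_probs(logits, target_idx, temperature=4.0):
    """Softened probabilities over the non-target classes,
    renormalized after dropping the target class."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    rest = [p for i, p in enumerate(probs) if i != target_idx]
    rest_total = sum(rest)
    return [p / rest_total for p in rest]

def dispersion(logits, target_idx, temperature=4.0):
    """Standard deviation of the secondary soft probabilities —
    the teacher-selection criterion described in the record."""
    return statistics.pstdev(secondary_soft_probs(logits, target_idx,
                                                  temperature))
```

A teacher that assigns uniform probability to all wrong classes scores zero dispersion; one that clearly ranks the wrong classes scores higher and would be preferred under this criterion.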
Article
Computer Science, Information Systems
Zeyi Tao, Qi Xia, Songqing Cheng, Qun Li
Summary: In recent years, deep neural networks have excelled at practical learning tasks, but deploying them on resource-limited devices is challenging. Knowledge distillation transfers knowledge from a well-trained model to a smaller one, reducing computational cost. A novel neuron manifold distillation method is proposed to improve the accuracy-speed trade-off, and a confident prediction mechanism is introduced to enhance the reliability of cloud-based learning systems.
IEEE TRANSACTIONS ON CLOUD COMPUTING
(2023)
Article
Computer Science, Artificial Intelligence
Dimitrios Boursinos, Xenofon Koutsoukos
Summary: This paper presents a method for computing probability intervals in real-time, which assigns pseudo-labels to unlabeled input data to improve efficiency. Empirical evaluation shows that the proposed method improves accuracy and calibration in image classification and botnet attack detection in IoT applications.
PATTERN RECOGNITION
(2023)