期刊
NEURAL NETWORKS
卷 148, 期 -, 页码 155-165出版社
PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.neunet.2022.01.012
关键词
ResNets; DenseNets; Shallow subnetwork first; Low-degree term first; Taylor expansion
资金
- National Natural Science Foundation of China [61976216, 61672522]
This paper proposes a novel argument of shallow subnetwork first (SSF) which explains the working mechanism of ResNet and its variants. The experiments show that shallow subnetworks are trained firstly and play important roles in the neural networks' performance. It also reveals the reason why DenseNets outperform ResNets is due to the shallower subnetworks playing vital roles.
To explain the working mechanism of ResNet and its variants, this paper proposes a novel argument of shallow subnetwork first (SSF), essentially low-degree term first (LDTF), which also applies to the whole neural network family. A neural network with shortcut connections behaves as an ensemble of a number of subnetworks of differing depths. Among the subnetworks, the shallow subnetworks are trained firstly, having great effects on the performance of the neural network. The shallow subnetworks roughly correspond to low-degree polynomials, while the deep subnetworks are opposite. Based on Taylor expansion, SSF is consistent with LDTF. ResNet is in line with Taylor expansion: shallow subnetworks are trained firstly to keep low-degree terms, avoiding overfitting; deep subnetworks try to maintain high-degree terms, ensuring high description capacity. Experiments on ResNets and DenseNets show that shallow subnetworks are trained firstly and play important roles in the training of the networks. The experiments also reveal the reason why DenseNets outperform ResNets: The subnetworks playing vital roles in the training of the former are shallower than those in the training of the latter. Furthermore, LDTF can also be used to explain the working mechanism of other ResNet variants (SE-ResNets and SK-ResNets), and the common phenomena occurring in many neural networks. (C)& nbsp;& nbsp;2022 Elsevier Ltd. All rights reserved.
作者
我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。
推荐
暂无数据