Article

Xcel-RAM: Accelerating Binary Neural Networks in High-Throughput SRAM Compute Arrays

Publisher

Institute of Electrical and Electronics Engineers Inc. (IEEE)
DOI: 10.1109/TCSI.2019.2907488

Keywords

In-memory computing; SRAM; binary convolution; binary neural networks; deep-CNNs

Funding

  1. C-BRIC, one of six centers in JUMP, a Semiconductor Research Corporation (SRC) program sponsored by DARPA
  2. National Science Foundation
  3. Intel Corporation
  4. Vannevar Bush Faculty Fellowship

Abstract

Deep neural networks are a biologically inspired class of algorithms that have recently demonstrated state-of-the-art accuracy in large-scale classification and recognition tasks. Hardware acceleration of deep networks is of paramount importance to ensure their ubiquitous presence in future computing platforms. Indeed, a major landmark enabling efficient hardware accelerators for deep networks is the recent advance from the machine learning community demonstrating the viability of aggressively scaled deep binary networks. In this paper, we demonstrate how deep binary networks can be accelerated in modified von Neumann machines by enabling binary convolutions within the static random access memory (SRAM) arrays. In general, binary convolutions consist of bit-wise exclusive-NOR (XNOR) operations followed by a population count (popcount). We present two proposals: one based on a charge-sharing approach to perform vector XNOR and approximate popcount, and another based on bit-wise XNOR followed by a digital bit-tree adder for accurate popcount. We highlight the various trade-offs in terms of circuit complexity, speed-up, and classification accuracy for both approaches. Key techniques presented in the manuscript include the use of a low-precision, low-overhead analog-to-digital converter (ADC) to achieve a fairly accurate popcount in the charge-sharing scheme, and the sectioning of the SRAM array by adding switches onto the read-bitlines, thereby achieving improved parallelism. Our results on the benchmark image classification datasets CIFAR-10 and SVHN with a binarized neural network architecture show energy improvements of up to 6.1x and 2.3x for the two proposals, compared to conventional SRAM banks. In terms of latency, improvements of up to 15.8x and 8.1x were achieved for the two respective proposals.
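For illustration only (not from the paper), the XNOR-popcount primitive that the proposed SRAM arrays accelerate can be sketched in a few lines of Python, assuming activations and weights in {-1, +1} are packed into integer bit-vectors; the function names binary_dot and adc_popcount, and the 3-bit ADC model, are hypothetical stand-ins for the circuits described above:

    def binary_dot(a_bits, w_bits, n):
        """Dot product of two n-element {-1, +1} vectors packed as n-bit
        integers (bit = 1 encodes +1, bit = 0 encodes -1)."""
        mask = (1 << n) - 1
        xnor = ~(a_bits ^ w_bits) & mask   # 1 wherever the two bits agree
        popcount = bin(xnor).count("1")    # number of agreeing positions
        # agreements contribute +1 to the dot product, disagreements -1:
        return 2 * popcount - n

    def adc_popcount(popcount, n, adc_bits=3):
        """Hypothetical model of the charge-sharing scheme's low-precision
        ADC: the analog popcount is quantized to 2**adc_bits levels,
        trading a small accuracy loss for low circuit overhead."""
        levels = (1 << adc_bits) - 1
        return round(popcount / n * levels) * n // levels

    # Two 4-element vectors agreeing in exactly two positions: dot product 0.
    assert binary_dot(0b1011, 0b1101, 4) == 0

In the charge-sharing proposal this popcount is obtained in the analog domain and digitized by the ADC, whereas in the digital proposal the bit-tree adder computes it exactly.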
