4.7 Article

ERA-LSTM: An Efficient ReRAM-Based Architecture for Long Short-Term Memory

Journal

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPDS.2019.2962806

Keywords

Long short-term memory (LSTM); resistive random-access memory (ReRAM); processing in memory (PIM); approximate computing; accelerator

Funding

  1. Beijing Academy of Artificial Intelligence (BAAI)
  2. Beijing Innovation Center for Future Chips, Tsinghua University
  3. Science and Technology Innovation Special Zone Project, China
  4. Tsinghua University Initiative Scientific Research Program [2018Z05JDX005]
  5. China Postdoctoral Science Foundation [2019M650030]


Processing-in-memory (PIM) architecture based on resistive random-access memory (ReRAM) crossbars is a promising solution to the memory bottleneck that long short-term memory (LSTM) faces. Based on a dataflow analysis of the LSTM computing paradigm, this article proposes using ReRAM-based analog approximate computing to perform the LSTM-specific element-wise computation. Combined with the dot-product computation implemented on ReRAM crossbars, this yields a new LSTM processing tile that significantly reduces the demand for analog-to-digital converters (ADCs), which account for the major part of the power consumption of existing designs. Next, we elaborate on a mapping scheme that efficiently deploys large-scale LSTMs onto multiple processing tiles. Finally, we propose an architecture enhancement that supports crossbar-friendly LSTM pruning to further improve efficiency. We call the overall design ERA-LSTM. Our evaluation shows that it outperforms two state-of-the-art FPGA-based LSTM accelerators by 103.6 and 35.9 times, respectively; compared with a state-of-the-art ReRAM-based LSTM accelerator with digital element-wise computation, it is 6.1 times more efficient. Moreover, our experiments demonstrate that the impact of hardware constraints and approximation errors on inference accuracy can be effectively reduced by the proposed fine-tuning scheme and by optimizing the design of the approximator.
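To make the two kinds of computation named in the abstract concrete, the following is a minimal software sketch of one standard LSTM time step, split into a dot-product stage and an element-wise stage. It is not the ERA-LSTM hardware design: the function names (`matvec`, `lstm_step`), the gate ordering inside `W`, and the use of exact `sigmoid` and `tanh` in place of an analog approximator are all assumptions made for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # Dot-product stage: the matrix-vector product that, in a ReRAM-based
    # design, would be mapped onto crossbars. Here it is plain arithmetic.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def lstm_step(W, b, x, h_prev, c_prev):
    """One standard LSTM step (illustrative layout, not the paper's mapping).

    W has 4*n rows of length len(x) + n, stacked in the assumed order
    [input gate, forget gate, cell candidate, output gate]; b has 4*n entries.
    """
    n = len(h_prev)
    z = [zi + bi for zi, bi in zip(matvec(W, x + h_prev), b)]

    # Element-wise stage: activations, products, and sums per hidden unit.
    i = [sigmoid(v) for v in z[0:n]]
    f = [sigmoid(v) for v in z[n:2 * n]]
    g = [math.tanh(v) for v in z[2 * n:3 * n]]
    o = [sigmoid(v) for v in z[3 * n:4 * n]]
    c = [f[k] * c_prev[k] + i[k] * g[k] for k in range(n)]
    h = [o[k] * math.tanh(c[k]) for k in range(n)]
    return h, c

# Usage: hidden size 2, input size 1, all-zero weights and biases.
n, m = 2, 1
W = [[0.0] * (m + n) for _ in range(4 * n)]
b = [0.0] * (4 * n)
h, c = lstm_step(W, b, [1.0], [0.0, 0.0], [1.0, -1.0])
```

With zero weights every gate evaluates to 0.5 and the candidate to 0, so the new cell state is simply half the previous one. The split between `matvec` and the per-unit loop mirrors the distinction the abstract draws between dot-product and element-wise computation.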

