4.7 Article

CU Partition Mode Decision for HEVC Hardwired Intra Encoder Using Convolution Neural Network

Journal

IEEE TRANSACTIONS ON IMAGE PROCESSING
Volume 25, Issue 11, Pages 5088-5103

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIP.2016.2601264

Keywords

HEVC; fast CU/PU mode decision; CNN; VLSI; intra encoding

Funding

  1. Huawei Technologies, National Science and Technology Major Project [2016YFB0200505]
  2. National Natural Science Foundation of China [61325003]

The intensive computation of High Efficiency Video Coding (HEVC) poses challenges for hardwired encoders in terms of hardware overhead and power dissipation. Moreover, the constraints of hardwired encoder design seriously degrade the efficiency of software-oriented fast coding unit (CU) partition mode decision algorithms. A fast algorithm is considered VLSI friendly when it has the following properties. First, it reduces the maximum complexity of encoding a coding tree unit (CTU). Second, it does not degrade the parallelism of the hardwired encoder. Third, its processing engine has low hardware and power overhead. In this paper, we devise a convolutional neural network (CNN) based fast algorithm that removes at least two CU partition modes in each CTU from full rate-distortion optimization (RDO) processing, thereby reducing the encoder's hardware complexity. Because our algorithm does not depend on correlations among CU depths or spatially neighboring CUs, it is friendly to parallel processing and does not disturb the rhythm of the RDO pipeline. Experiments show that an average of 61.1% of intra-encoding time was saved, while the Bjøntegaard delta bit-rate (BD-rate) increase was 2.67%. Capitalizing on the optimal arithmetic representation, we developed a high-speed [714 MHz in the worst conditions (125 °C, 0.9 V)] and low-cost (42.5k gates) accelerator for our fast algorithm using TSMC 65-nm CMOS technology. One accelerator can support real-time HD1080p encoding at 55 frames/s. The corresponding power dissipation was 16.2 mW at 714 MHz. Finally, our accelerator scales well: four accelerators fulfill the throughput requirements of UltraHD-4K at 55 frames/s.
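The scaling claim at the end of the abstract (one accelerator for HD1080p, four for UltraHD-4K, both at 55 frames/s) can be checked with simple arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes 64x64 CTUs, assumes that HD1080p means 1920x1080 and UltraHD-4K means 3840x2160, and uses hypothetical helper names (ctus_per_frame, cycles_per_ctu). Only the 714 MHz clock and the 55 frames/s rate come from the abstract.

```python
import math

CTU_SIZE = 64      # assumption: 64x64 CTUs (the largest size HEVC allows)
CLOCK_HZ = 714e6   # worst-case accelerator clock quoted in the abstract
FPS = 55           # frame rate quoted in the abstract


def ctus_per_frame(width, height, ctu=CTU_SIZE):
    """Number of CTUs covering one frame; partial CTUs at the edges count as whole."""
    return math.ceil(width / ctu) * math.ceil(height / ctu)


def cycles_per_ctu(width, height, fps=FPS, clock_hz=CLOCK_HZ, accelerators=1):
    """Clock cycles available per CTU when the CTUs are shared evenly
    among `accelerators` identical units (an idealized assumption)."""
    ctu_rate = ctus_per_frame(width, height) * fps  # CTUs per second
    return clock_hz * accelerators / ctu_rate


# Assumed resolutions: HD1080p = 1920x1080, UltraHD-4K = 3840x2160.
hd_ctus = ctus_per_frame(1920, 1080)
uhd_ctus = ctus_per_frame(3840, 2160)

hd_budget = cycles_per_ctu(1920, 1080, accelerators=1)
uhd_budget = cycles_per_ctu(3840, 2160, accelerators=4)

print(f"CTUs per frame: HD1080p={hd_ctus}, UltraHD-4K={uhd_ctus}")
print(f"Cycles per CTU: HD1080p x1={hd_budget:.0f}, UltraHD-4K x4={uhd_budget:.0f}")
```

Under these assumptions a 4K frame holds four times as many CTUs as a 1080p frame, so four accelerators running in parallel leave each unit the same per-CTU cycle budget as a single accelerator has at 1080p. This matches the abstract's scaling figure, although the paper's actual scheduling may differ.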

Recommended

Article Computer Science, Information Systems

Modeling IPv6 adoption from biological evolution

Dujuan Gu, Jinhe Su, Yibo Xue, Dongsheng Wang, Jun Li, Ze Luo, Baoping Yan

COMPUTER COMMUNICATIONS (2020)

Article Automation & Control Systems

Multi-camera visual SLAM for off-road navigation

Yi Yang, Di Tang, Dongsheng Wang, Wenjie Song, Junbo Wang, Mengyin Fu

ROBOTICS AND AUTONOMOUS SYSTEMS (2020)

Article Optics

54 W nanosecond Yb-doped all-fiber amplifier at ultra-high repetition rate (tens of MHz) based on mode-locked fiber oscillator and single-mode fiber stretcher

Min Yang, Pingxue Li, Shun Li, Wenhao Xiong, Kaixuan Wang, Chuanfei Yao, Dongsheng Wang

Summary: This all-fiber amplifier utilizes a passively mode-locked fiber oscillator and an 8 km single-mode fiber stretcher to deliver nanosecond pulses at an ultra-high repetition rate.

LASER PHYSICS (2021)

Article Environmental Sciences

Effects of chemical modification on physicochemical properties and adsorption behavior of sludge-based activated carbon

Chunxu Wu, Lanfeng Li, Hao Zhou, Jing Ai, Hongtao Zhang, Jialin Tao, Dongsheng Wang, Weijun Zhang

Summary: This study investigated how well sludge-based activated carbon (SBC) adsorbs dissolved organic matter (DOM) from sewage, and how different chemical modifiers affect the structure of the synthesized SBC. Chemical activation significantly improved the adsorption capacity of MSBC for humic acids (HA) and aromatic proteins (APN), highlighting the importance of surface functional groups to the capacity of MSBC to remove DOM from sewage. The study also examined the molecular weight of the residual DOM in sewage, showing that the effectiveness of each chemical modification varies with the molecular weight of the organic matter.

JOURNAL OF ENVIRONMENTAL SCIENCES (2021)

Article Computer Science, Hardware & Architecture

VoltJockey: A New Dynamic Voltage Scaling-Based Fault Injection Attack on Intel SGX

Pengfei Qiu, Dongsheng Wang, Yongqiang Lyu, Ruidong Tian, Chunlu Wang, Gang Qu

Summary: The study introduces an attack method to break SGX by inducing voltage-oriented hardware faults, which can be controlled completely by software without requiring any security vulnerabilities in the software. By providing transient low voltage to the processor through a controller module and injecting transient faults into the program running in the enclave, the attack is successfully executed to steal the key.

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS (2021)

Article Engineering, Electrical & Electronic

DynaComm: Accelerating Distributed CNN Training Between Edges and Clouds Through Dynamic Communication Scheduling

Shangming Cai, Dongsheng Wang, Haixia Wang, Yongqiang Lyu, Guangquan Xu, Xi Zheng, Athanasios V. Vasilakos

Summary: Edge deep learning is an emerging topic where edge devices collaboratively train a shared model. However, distributed training over edge networks is time-consuming due to transmission procedures. To address this, we propose DynaComm, a scheduler that optimizes communication and computation overlap for efficient training at the network edge.

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS (2022)

Article Computer Science, Information Systems

Learning with joint cross-document information via multi-task learning for named entity recognition

Dongsheng Wang, Hongjie Fan, Junfei Liu

Summary: The study introduces a cross-document NER model that improves entity recognition accuracy by establishing internal relationships and calculating cross-document representations. By adding a multi-classification auxiliary task and employing multi-objective optimization, the model performance and effectiveness are enhanced.

INFORMATION SCIENCES (2021)

Article Chemistry, Physical

Electrical impedance spectroscopy as a potential tool to investigate the structure and size of aggregates during water and wastewater treatment

Daxin Zhang, Yili Wang, Junyi Li, Xiaoyang Fan, Enrui Li, Shuoxun Dong, Weiwen Yin, Dongsheng Wang, Baoyou Shi

Summary: This study proposed an electrical impedance spectroscopy (EIS) method and constructed a generalized framework to associate macroscale electrical properties with microscopic structure and size-related characteristics of aggregates. The models extracted via EIS were capable of describing the self similarity of aggregates and capturing the fractal and size information. The EIS method exhibited a wide range of applications in water and wastewater treatment.

JOURNAL OF COLLOID AND INTERFACE SCIENCE (2022)

Article Mathematical & Computational Biology

Software Defect Prediction Based on Hybrid Swarm Intelligence and Deep Learning

Zhen Li, Tong Li, YuMei Wu, Liu Yang, Hong Miao, DongSheng Wang

Summary: This paper implements software defect prediction based on deep learning, combining the particle swarm algorithm and the wolf swarm algorithm for optimization, resulting in better performance indicators. By using a hybrid algorithm in the search for model hyperparameter optimization, the model's performance is enhanced.

COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE (2021)

Article Computer Science, Information Systems

A Reliability-Enhanced Differential Sensing Amplifier for Hybrid CMOS/MTJ Logic Circuits

Chengzhi Wang, Tianqi Yang, Min Han, Dongsheng Wang

Summary: Recently, hybrid logic circuits based on magnetic tunnel junctions (MTJs) have been investigated to reduce standby power. However, these circuits face reliability issues due to the limited TMR ratio of the MTJ and process variation at deep sub-micrometer technology nodes. This paper proposes a novel differential sensing amplifier (DSA) that achieves a large sensing margin by incorporating two PMOS transistors, and demonstrates its functionality and performance through simulations.

ELECTRONICS (2023)

Proceedings Paper Computer Science, Hardware & Architecture

SSB-Tree: Making Persistent Memory B+-Trees Crash-Consistent and Concurrent by Lazy-Box

Tongliang Li, Haixia Wang, Airan Shao, Dongsheng Wang

Summary: This paper discusses the challenges faced by Persistent Memory (PM) B+-tree designs and introduces a new variant called Side-to-Side B+Tree (SSB-Tree) that utilizes Lazy-Box technology to reduce consistency cost and achieve efficient concurrency protocol and instant recovery with a single 8-byte write. The experimental results show that SSB-Tree outperforms other state-of-the-art PM B+-trees in terms of throughput in various benchmark tests.

2022 IEEE 36TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2022) (2022)

Proceedings Paper Computer Science, Hardware & Architecture

Privacy-Preserving Public Auditing for Shared Data in Mobile Cloud Storage

Xia'nan Zhao, Dongsheng Wang

Summary: In this paper, a new identity privacy preserving public auditing scheme is proposed to ensure the integrity of group data in mobile cloud storage services. By introducing a third party public auditor and utilizing the Diffie-Hellman Key Exchange protocol, the scheme supports efficient group dynamics while protecting user privacy.

2022 IEEE/ACM 7TH SYMPOSIUM ON EDGE COMPUTING (SEC 2022) (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Collusion-Tolerant Data Aggregation Method for Smart Grid

Liyuan Cao, Yingwen Chen, Kaiyu Cai, Dongsheng Wang, Yuchuan Luo, Guangtao Xue

Summary: In the smart grid, smart meters record and transmit real-time electricity consumption data to the control center. To protect user privacy, researchers propose privacy-preserving data aggregation schemes based on aggregate values instead of raw data. However, these schemes overlook the problem of collusion. This paper proposes a collusion-tolerant and privacy-preserving data aggregation scheme for the smart grid, which effectively protects the privacy of customer usage data even under collusions.

WIRELESS ALGORITHMS, SYSTEMS, AND APPLICATIONS (WASA 2022), PT I (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Drug-Drug Interaction Extraction via Attentive Capsule Network with an Improved Sliding-Margin Loss

Dongsheng Wang, Hongjie Fan, Junfei Liu

Summary: This paper introduces a new approach for DDI extraction using sequence features, dependency characteristics, capsule network, and an improved loss function, which effectively enhances the performance of DDI extraction.

DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2021), PT II (2021)

Article Computer Science, Information Systems

Software Defects Prediction Based on Hybrid Particle Swarm Optimization and Sparrow Search Algorithm

Liu Yang, Zhen Li, Dongsheng Wang, Hong Miao, Zhaobin Wang

Summary: The paper focuses on software quality, software failure prediction, and software reliability model parameter estimation, proposing a hybrid algorithm (SSA-PSO) for software defect prediction. Experimental results show that the hybrid algorithm converges faster and produces more stable and accurate results, addressing the issues present in traditional single algorithms.

IEEE ACCESS (2021)
