Article

Lattice-to-sequence attentional Neural Machine Translation models

Journal

NEUROCOMPUTING
Volume 284, Pages 138-147

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2018.01.010

Keywords

Neural Machine Translation; Word lattice; Recurrent Neural Network; Gated Recurrent Unit

Funding

  1. Natural Science Foundation of China [61573294, 61672440]
  2. Ph.D. Programs Foundation of Ministry of Education of China [20130121110040]
  3. Foundation of the State Language Commission of China [WT135-10, YB135-49]
  4. Natural Science Foundation of Fujian Province [2016J05161]
  5. Fund of Research Project of Tibet Autonomous Region of China [Z2014A18G2-13]
  6. National Key Technology R&D Program [2012BAH14F03]

Dominant Neural Machine Translation (NMT) models usually rely on word-level modeling to embed input sentences into a semantic space. However, word-level modeling may not be optimal for the NMT encoder, especially for languages where tokenization is often ambiguous. On the one hand, tokenization errors may negatively affect the encoder. On the other hand, the optimal tokenization granularity for NMT is unclear. In this paper, we propose lattice-to-sequence attentional NMT models, which generalize standard Recurrent Neural Network (RNN) encoders to lattice topology. Specifically, they take as input a word lattice that compactly encodes many tokenization alternatives, and learn to generate the hidden state for the current step from multiple inputs and hidden states in previous steps. Compared with the standard RNN encoder, the proposed encoders not only alleviate the negative impact of tokenization errors but are also more expressive and flexible in encoding the meaning of input sentences. Experimental results on both Chinese-English and Japanese-English translation demonstrate the effectiveness of our models. © 2018 Elsevier B.V. All rights reserved.
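
The idea of computing one hidden state from several incoming lattice edges can be illustrated with a minimal sketch. It is not the paper's model: it assumes a plain tanh recurrence instead of a Gated Recurrent Unit, mean pooling over incoming edges, random untrained weights, and a hypothetical three-character input "abc" with made-up tokenization alternatives. All function and variable names are illustrative only.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (untrained; for illustration only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode_lattice(num_positions, edges, embed, W, U, hidden_size):
    """Encode a word lattice given as edges (start, end, token), start < end.

    Assumption: each edge yields a candidate state tanh(W x + U h[start]),
    and the state at position `end` is the mean of its candidates.
    """
    h = [[0.0] * hidden_size for _ in range(num_positions + 1)]
    incoming = {}
    for start, end, token in edges:
        incoming.setdefault(end, []).append((start, token))
    # Visiting end positions in increasing order guarantees that h[start]
    # is already computed, because every edge has start < end.
    for end in range(1, num_positions + 1):
        candidates = []
        for start, token in incoming.get(end, []):
            wx = matvec(W, embed[token])
            uh = matvec(U, h[start])
            candidates.append([math.tanh(a + b) for a, b in zip(wx, uh)])
        if candidates:
            h[end] = [sum(col) / len(candidates) for col in zip(*candidates)]
    return h


rng = random.Random(0)
embed_size, hidden_size = 4, 3
# Hypothetical lattice over "abc": single characters plus two longer tokens.
edges = [(0, 1, "a"), (1, 2, "b"), (2, 3, "c"), (0, 2, "ab"), (1, 3, "bc")]
embed = {tok: [rng.uniform(-1, 1) for _ in range(embed_size)]
         for _, _, tok in edges}
W = make_matrix(hidden_size, embed_size, rng)
U = make_matrix(hidden_size, hidden_size, rng)
states = encode_lattice(3, edges, embed, W, U, hidden_size)
print(len(states), "position states of size", hidden_size)
```

In this sketch, position 2 receives two candidates (from "b" after "a" and from "ab" directly), so its state blends both tokenizations instead of committing to one segmentation.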

Recommended

Article Computer Science, Artificial Intelligence

Boosting implicit discourse relation recognition with connective-based word embeddings

Changxing Wu, Jinsong Su, Yidong Chen, Xiaodong Shi

NEUROCOMPUTING (2019)

Article Computer Science, Artificial Intelligence

Multi-perspective neural architecture for recommendation system

Han Xiao, Yidong Chen, Xiaodong Shi, Ge Xu

NEURAL NETWORKS (2019)

Article Computer Science, Interdisciplinary Applications

How many preprints have actually been printed and why: a case study of computer science preprints on arXiv

Jialiang Lin, Yao Yu, Yu Zhou, Zhiyang Zhou, Xiaodong Shi

SCIENTOMETRICS (2020)

Article Mathematical & Computational Biology

An Improved Sign Language Translation Model with Explainable Adaptations for Processing Long Sign Sentences

Jiangbin Zheng, Zheng Zhao, Min Chen, Jing Chen, Chong Wu, Yidong Chen, Xiaodong Shi, Yiqi Tong

COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE (2020)

Article Computer Science, Artificial Intelligence

Knowledge Graph Embedding Based on Multi-View Clustering Framework

Han Xiao, Yidong Chen, Xiaodong Shi

Summary: Knowledge representation is a critical issue in knowledge engineering and artificial intelligence, with knowledge embedding methods playing an important role. This paper introduces a semantic model based on multi-view clustering for generating semantic representations of knowledge elements and improving entity retrieval. Extensive experiments demonstrate substantial improvements of this model against baselines on various knowledge graph tasks.

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2021)

Article Materials Science, Multidisciplinary

Effect of metal type on the energy absorption of fiber metal laminates under low-velocity impact

Yong Chen, Liming Chen, Qiong Huang, Zhigang Zhang

Summary: In FMLs, replacing aluminum with magnesium leads to faster perforation and energy dissipation, but also reduces delamination damage at the metal-composite interface.

MECHANICS OF ADVANCED MATERIALS AND STRUCTURES (2021)

Article Computer Science, Artificial Intelligence

Enhancing Neural Sign Language Translation by highlighting the facial expression information

Jiangbin Zheng, Yidong Chen, Chong Wu, Xiaodong Shi, Suhail Muhammad Kamal

Summary: Neural Sign Language Translation (SLT) models often overlook non-manual features such as facial expressions, leading to translation errors. This paper proposes two novel schemes to enhance the performance of traditional SLT models with a focus on facial expression information. Experimental results show significant improvements in translation accuracy.

NEUROCOMPUTING (2021)

Article Computer Science, Artificial Intelligence

Automatic Analysis of Available Source Code of Top Artificial Intelligence Conference Papers

Jialiang Lin, Yingmin Wang, Yao Yu, Yu Zhou, Yidong Chen, Xiaodong Shi

Summary: Source code is crucial for researchers to reproduce and replicate the results of AI papers. To address the labor-intensive and time-consuming task of manual collection, researchers propose a method to automatically identify and extract source code from papers. They find that 20.5% of top AI conference papers have available source code, but 8.1% of the source code repositories are no longer accessible.

INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING (2022)

Article Computer Science, Interdisciplinary Applications

A downsampling method enables robust clustering and integration of single-cell transcriptome data

Jun Ren, Quan Zhang, Ying Zhou, Yudi Hu, Xuejing Lyu, Hongkun Fang, Jing Yang, Rongshan Yu, Xiaodong Shi, Qiyuan Li

Summary: Research shows that the proposed MURPXMBD algorithm can reduce noise in single-cell RNA sequencing data, improve the quality and accuracy of clustering algorithms, help discover new cell types, and enhance the performance of dataset integration algorithms.

JOURNAL OF BIOMEDICAL INFORMATICS (2022)

Article Computer Science, Interdisciplinary Applications

Detecting and analyzing missing citations to published scientific entities

Jialiang Lin, Yao Yu, Jiaxin Song, Xiaodong Shi

Summary: Proper citation is crucial for academic writing to accumulate knowledge and maintain academic integrity. This study proposes a method called Citation Recommendation for Published Scientific Entity (CRPSE), which utilizes cooccurrences between published scientific entities and in-text citations from previous researchers to effectively recommend source papers. A statistical analysis of missing citations in prestigious computer science conferences in 2020 reveals that 475 published scientific entities in computer science and mathematics lack proper citations. It is found that many entities mentioned without citations are well-accepted research results.

SCIENTOMETRICS (2022)

Proceedings Paper Computer Science, Artificial Intelligence

FCH-TTS: Fast, Controllable and High-quality Non-Autoregressive Text-to-Speech Synthesis

Xun Zhou, Zhiyang Zhou, Xiaodong Shi

Summary: Inspired by the success of FastSpeech, this paper proposes FCH-TTS, a fast, controllable, and universal neural text-to-speech model that can generate high-quality spectrograms. Unlike FastSpeech, FCH-TTS uses a simpler attention-based soft alignment mechanism to improve its adaptability to different languages. It also introduces a fusion module to better model speaker features and ensure the desired timbre. Experimental results demonstrate that FCH-TTS achieves the fastest inference speed and the best speech quality compared to baseline models.

2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Continuous Prompt Enhanced Biomedical Entity Normalization

Zhaohong Lai, Biao Fu, Shangfei Wei, Xiaodong Shi

Summary: This paper proposes a framework called Prompt-BEN that enhances biomedical entity normalization using continuous prompts. The method fine-tunes only a few parameters and utilizes embeddings with continuous prefix prompts to capture semantic similarity. It also designs a contrastive loss with a synonym marginalized strategy for the BEN task. Experimental results demonstrate that the method achieves competitive or even greater linking accuracy compared to state-of-the-art fine-tuning-based models while having about 600 times fewer tuned parameters.

NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2022, PT II (2022)

Proceedings Paper Computer Science, Artificial Intelligence

A Document-Level Machine Translation Quality Estimation Model Based on Centering Theory

Yidong Chen, Enjun Zhong, Yiqi Tong, Yanru Qiu, Xiaodong Shi

Summary: This paper introduces a novel document-level machine translation Quality Estimation (QE) model based on Centering Theory (CT), and releases an open-source Chinese-English corpus for document-level machine translation QE. Experimental results demonstrate that the proposed model outperforms the baseline model significantly.

MACHINE TRANSLATION, CCMT 2021 (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Generating Diverse Back-Translations via Constraint Random Decoding

Yiqi Tong, Yidong Chen, Guocheng Zhang, Jiangbin Zheng, Hongkang Zhu, Xiaodong Shi

Summary: Back-translation is an effective data augmentation method for improving the performance of Neural Machine Translation (NMT). By proposing a constraint random decoding method and using an evolution decoding algorithm, more diverse synthetic sentences can be generated while maintaining quality.

MACHINE TRANSLATION, CCMT 2021 (2021)

Review Computer Science, Information Systems

Technical Approaches to Chinese Sign Language Processing: A Review

Suhail Muhammad Kamal, Yidong Chen, Shaozi Li, Xiaodong Shi, Jiangbin Zheng

IEEE ACCESS (2019)

Article Computer Science, Artificial Intelligence

3D-KCPNet: Efficient 3DCNNs based on tensor mapping theory

Rui Lv, Dingheng Wang, Jiangbin Zheng, Zhao-Xu Yang

Summary: In this paper, the authors investigate tensor decomposition for neural network compression. They analyze the convergence and precision of tensor mapping theory, validate the rationality of tensor mapping and its superiority over traditional tensor approximation based on the Lottery Ticket Hypothesis. They propose an efficient method called 3D-KCPNet to compress 3D convolutional neural networks using the Kronecker canonical polyadic (KCP) tensor decomposition. Experimental results show that 3D-KCPNet achieves higher accuracy compared to the original baseline model and the corresponding tensor approximation model.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Personalized robotic control via constrained multi-objective reinforcement learning

Xiangkun He, Zhongxu Hu, Haohan Yang, Chen Lv

Summary: In this paper, a novel constrained multi-objective reinforcement learning algorithm is proposed for personalized end-to-end robotic control with continuous actions. The approach trains a single model using constraint design and a comprehensive index to achieve optimal policies based on user-specified preferences.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Overlapping community detection using expansion with contraction

Zhijian Zhuo, Bilian Chen, Shenbao Yu, Langcai Cao

Summary: In this paper, a novel method called Expansion with Contraction Method for Overlapping Community Detection (ECOCD) is proposed, which utilizes non-negative matrix factorization to obtain disjoint communities and applies expansion and contraction processes to adjust the degree of overlap. ECOCD is applicable to various networks with different properties and achieves high-quality overlapping community detection.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

High-compressed deepfake video detection with contrastive spatiotemporal distillation

Yizhe Zhu, Chunhui Zhang, Jialin Gao, Xin Sun, Zihan Rui, Xi Zhou

Summary: In this work, the authors propose a Contrastive Spatio-Temporal Distilling (CSTD) approach to improve the detection of high-compressed deepfake videos. The approach leverages spatial-frequency cues and temporal-contrastive alignment to fully exploit spatiotemporal inconsistency information.

NEUROCOMPUTING (2024)

Review Computer Science, Artificial Intelligence

A review of coverless steganography

Laijin Meng, Xinghao Jiang, Tanfeng Sun

Summary: This paper provides a review of coverless steganographic algorithms, including the development process, known contributions, and general issues in image and video algorithms. It also discusses the security of coverless steganography from theoretical analysis to actual investigation for the first time.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Confidence-based interactable neural-symbolic visual question answering

Yajie Bao, Tianwei Xing, Xun Chen

Summary: Visual question answering requires processing multi-modal information and effective reasoning. Neural-symbolic learning is a promising method, but current approaches lack uncertainty handling and can only provide a single answer. To address this, we propose a confidence based neural-symbolic approach that evaluates NN inferences and conducts reasoning based on confidence.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

A framework-based transformer and knowledge distillation for interior style classification

Anh H. Vo, Bao T. Nguyen

Summary: Interior style classification is an interesting problem with potential applications in both commercial and academic domains. This project proposes a method named ISC-DeIT, which combines data-efficient image transformer architectures and knowledge distillation, to address the interior style classification problem. Experimental results demonstrate a significant improvement in predictive accuracy compared to other state-of-the-art methods.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Improving robustness for vision transformer with a simple dynamic scanning augmentation

Shashank Kotyan, Danilo Vasconcellos Vargas

Summary: This article introduces a novel augmentation technique called Dynamic Scanning Augmentation to improve the accuracy and robustness of Vision Transformer (ViT). The technique leverages dynamic input sequences to adaptively focus on different patches, resulting in significant changes in ViT's attention mechanism. Experimental results demonstrate that Dynamic Scanning Augmentation outperforms ViT in terms of both robustness to adversarial attacks and accuracy against natural images.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Introducing shape priors in Siamese networks for image classification

Hiba Alqasir, Damien Muselet, Christophe Ducottet

Summary: The article proposes a solution to improve the learning process of a classification network by providing shape priors, reducing the need for annotated data. The solution is tested on cross-domain digit classification tasks and a video surveillance application.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Neural dynamics solver for time-dependent infinity-norm optimization based on ACP framework with robot application

Dexiu Ma, Mei Liu, Mingsheng Shang

Summary: This paper proposes a method using neural dynamics solvers to solve infinity-norm optimization problems. Two improved solvers are constructed and their effectiveness and superiority are demonstrated through theoretical analysis and simulation experiments.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

cpp-AIF: A multi-core C++ implementation of Active Inference for Partially Observable Markov Decision Processes

Francesco Gregoretti, Giovanni Pezzulo, Domenico Maisto

Summary: Active Inference is a computational framework that uses probabilistic inference and variational free energy minimization to describe perception, planning, and action. cpp-AIF is a header-only C++ library that provides a powerful tool for implementing Active Inference for Partially Observable Markov Decision Processes through multi-core computing. It is cross-platform and improves performance, memory management, and usability compared to existing software.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Predicting stock market trends with self-supervised learning

Zelin Ying, Dawei Cheng, Cen Chen, Xiang Li, Peng Zhu, Yifeng Luo, Yuqi Liang

Summary: This paper proposes a novel stock market trends prediction framework called SMART, which includes a self-supervised stock technical data sequence embedding model S3E. By training with multiple self-supervised auxiliary tasks, the model encodes stock technical data sequences into embeddings and uses the learned sequence embeddings for predicting stock market trends. Extensive experiments on China A-Shares market and NASDAQ market prove the high effectiveness of our model in stock market trends prediction, and its effectiveness is further validated in real-world applications in a leading financial service provider in China.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

DHGAT: Hyperbolic representation learning on dynamic graphs via attention networks

Hao Li, Hao Jiang, Dongsheng Ye, Qiang Wang, Liang Du, Yuanyuan Zeng, Liu Yuan, Yingxue Wang, C. Chen

Summary: DHGAT, a dynamic hyperbolic graph attention network, utilizes hyperbolic metric properties to embed dynamic graphs. It employs a spatiotemporal self-attention mechanism and weighted node representations, resulting in excellent performance in link prediction tasks.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Progressive network based on detail scaling and texture extraction: A more general framework for image deraining

Jiehui Huang, Zhenchao Tang, Xuedong He, Jun Zhou, Defeng Zhou, Calvin Yu-Chian Chen

Summary: This study proposes a progressive learning multi-scale feature blending model for image deraining tasks. The model utilizes detail dilation and texture extraction to improve the restoration of rainy images. Experimental results show that the model achieves near state-of-the-art performance in rain removal tasks and exhibits better rain removal realism.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Stabilization and synchronization control for discrete-time complex networks via the auxiliary role of edges subsystem

Lizhi Liu, Zilin Gao, Yinhe Wang, Yongfu Li

Summary: This paper proposes a novel discrete-time interconnected model for depicting complex dynamical networks. The model consists of nodes and edges subsystems, which consider the dynamic characteristic of both nodes and edges. By designing control strategies and coupling modes, the stabilization and synchronization of the network are achieved. Simulation results demonstrate the effectiveness of the proposed methods.

NEUROCOMPUTING (2024)