Article

Domain-specific machine translation with recurrent neural network for software localization

Journal

EMPIRICAL SOFTWARE ENGINEERING
Volume 24, Issue 6, Pages 3514-3545

Publisher

SPRINGER
DOI: 10.1007/s10664-019-09702-z

Keywords

Software localization; Neural machine translation; Mobile apps

Software localization is the process of adapting a software product to the linguistic, cultural and technical requirements of a target market. It allows software companies to access foreign markets that would otherwise be difficult to penetrate. Many studies have been carried out to locate need-to-translate strings in software and to adapt the UI layout after the text is translated into the new language. However, no work has been done on the most important and time-consuming step of the software localization process, i.e., the translation of software text. Because of some unique characteristics of software text, such as application-specific meanings, context-sensitive translation, and domain-specific rare words, general machine translation tools such as Google Translate cannot properly address the linguistic and technical nuances of translating software text for software localization. In this paper, we propose a neural-network-based translation model specifically designed and trained for mobile application text translation. We collect large-scale human-translated bilingual sentence pairs from different Android applications crawled from the Google Play store. We customize the original RNN encoder-decoder neural machine translation model by adding categorical information to address the domain-specific rare word problem, which is a common phenomenon in software text. We evaluate our approach on translating the text of test Android applications using both BLEU score and exact match rate. The results show that our method outperforms the general machine translation tool, Google Translate, and generates more acceptable translations for software localization with less need for human revision. Our approach is language independent, and we show its generality between English and the other five official languages of the United Nations (UN).
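
The exact match rate mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact definition, so this assumes the metric is the fraction of machine translations that are identical to the human reference after collapsing whitespace. The function names `normalize` and `exact_match_rate` are hypothetical and not taken from the paper.

```python
from typing import Sequence


def normalize(text: str) -> str:
    """Collapse runs of whitespace and trim the ends (an assumed normalization)."""
    return " ".join(text.split())


def exact_match_rate(hypotheses: Sequence[str], references: Sequence[str]) -> float:
    """Return the fraction of hypotheses identical to their reference translation.

    Assumption: one reference per hypothesis, compared after whitespace
    normalization only (case and punctuation must match).
    """
    if len(hypotheses) != len(references):
        raise ValueError("hypotheses and references must have the same length")
    if not hypotheses:
        return 0.0
    matches = sum(
        normalize(hyp) == normalize(ref)
        for hyp, ref in zip(hypotheses, references)
    )
    return matches / len(hypotheses)


if __name__ == "__main__":
    # Illustrative strings only; not data from the paper.
    machine_output = ["Settings", "Sign  in", "Log out"]
    human_reference = ["Settings", "Sign in", "Sign out"]
    print(exact_match_rate(machine_output, human_reference))
```

Short UI strings such as button labels are often either fully right or unusable, which is one reason an all-or-nothing metric like this can complement an n-gram overlap score such as BLEU.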

Recommended

Article Computer Science, Software Engineering

Nighthawk: Fully Automated Localizing UI Display Issues via Visual Understanding

Zhe Liu, Chunyang Chen, Junjie Wang, Yuekai Huang, Jun Hu, Qing Wang

Summary: The Graphical User Interface (GUI) serves as a visual link between a software application and its users, enabling interaction between them. However, the complexity of GUIs poses challenges to their implementation, as display issues often occur due to software or hardware compatibility. To address this, a fully automated approach called Nighthawk is proposed, which uses deep learning to detect and locate display issues in GUI screenshots for developers to fix. Additionally, a heuristic-based training data auto-generation method is introduced to generate labeled training data. Evaluation shows that Nighthawk achieves high precision and recall in detecting UI display issues and successfully uncovers previously-undetected issues in popular Android apps.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2023)

Article Computer Science, Software Engineering

How secondary school girls perceive Computational Thinking practices through collaborative programming with the micro:bit

Mojtaba Shahin, Christabel Gonsalvez, Jon Whittle, Chunyang Chen, Li Li, Xin Xia

Summary: This study investigates how secondary school girls perceive computational thinking practices when using the micro:bit device in a collaborative setting. It identifies challenges the girls face and best practices they adopt while working on computational solutions.

JOURNAL OF SYSTEMS AND SOFTWARE (2022)

Article Computer Science, Information Systems

Consistent or not? An investigation of using Pull Request Template in GitHub

Mengxi Zhang, Huaxiao Liu, Chunyang Chen, Yuzhou Liu, Shuotong Bai

Summary: The study found that only 1.2% of GitHub repositories contain a pull request template (PRT), mainly highly popular repositories with a large number of PRs. Contributors are willing to accept a PRT that requires key information, and it also helps to manage repositories, leading to shorter review time, fewer duplicated pull requests, and minimal invalid comments.

INFORMATION AND SOFTWARE TECHNOLOGY (2022)

Article Computer Science, Software Engineering

Context-Aware Personalized Crowdtesting Task Recommendation

Junjie Wang, Ye Yang, Song Wang, Chunyang Chen, Dandan Wang, Qing Wang

Summary: Crowdsourced software testing, also known as crowdtesting, is a specialized form of crowdsourcing that requires skilled and dedicated crowdworkers. This paper addresses the issue of inappropriate task selection in crowdtesting, which leads to unpaid and wasted effort. The authors propose a context-aware personalized task recommendation approach called PTRec, which leverages a testing context model and a learning-based recommendation model to help crowdworkers make informed decisions. The evaluation of PTRec on a large crowdtesting platform demonstrates its potential in improving bug detection efficiency and increasing crowdworkers' earnings.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2022)

Article Computer Science, Software Engineering

Accessible or Not? An Empirical Investigation of Android App Accessibility

Sen Chen, Chunyang Chen, Lingling Fan, Mingming Fan, Xian Zhan, Yang Liu

Summary: Mobile apps provide new opportunities for people with disabilities to act independently, but there are still accessibility issues that need to be addressed. This study proposes an automated app page exploration tool to collect a comprehensive dataset of accessibility issues and investigates the characteristics of these issues. The findings highlight the importance of maintaining mobile app accessibility for users, especially the elderly and disabled.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2022)

Proceedings Paper Computer Science, Information Systems

Guided Bug Crush: Assist Manual GUI Testing of Android Apps via Hint Moves

Zhe Liu, Chunyang Chen, Junjie Wang, Yuekai Huang, Jun Hu, Qing Wang

Summary: Mobile apps are essential for daily life, and manual testing plays a crucial role in ensuring app quality. However, manual testing can be time-consuming and inefficient due to repeated actions and missed functionalities. Inspired by the game Candy Crush, NaviDroid guides testers by highlighting the next operations, making testing more effective and efficient.

PROCEEDINGS OF THE 2022 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI '22) (2022)

Article Computer Science, Information Systems

An Empirical Study on How Well Do COVID-19 Information Dashboards Service Users' Information Needs

Xinyan Li, Han Wang, Chunyang Chen, John Grundy

Summary: The ongoing COVID-19 pandemic has highlighted the importance of dashboards in providing real-time information. However, there is often a gap between the information needs of the public and the supply provided by existing COVID-19 dashboards. Through an empirical study comparing people's needs on Twitter with existing information suppliers, we found that people are interested in various aspects beyond COVID-19, such as the relationship between COVID-19 and other viruses, its origin, vaccine development, fake news, and its impact on women, schools/universities, and businesses. We also identified common visualization and interaction patterns used in dashboards, which can help developers optimize their designs to meet people's needs and improve future crisis management dashboard development.

IEEE TRANSACTIONS ON SERVICES COMPUTING (2022)

Article Computer Science, Hardware & Architecture

GUI-Squatting Attack: Automated Generation of Android Phishing Apps

Sen Chen, Lingling Fan, Chunyang Chen, Minhui Xue, Yang Liu, Lihua Xu

Summary: Mobile phishing attacks using disguise techniques have raised security concerns, with current detection methods potentially vulnerable. A new attack technique, GUI-Squatting attack, can automatically generate phishing apps on the Android platform using deep learning algorithms. Experimental results suggest existing phishing defenses are ineffective against emergent attacks, stimulating the need for more efficient detection techniques.

IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING (2021)

Proceedings Paper Computer Science, Software Engineering

UIS-Hunter: Detecting UI Design Smells in Android Apps

Bo Yang, Zhenchang Xing, Xin Xia, Chunyang Chen, Deheng Ye, Shanping Li

Summary: Visual design smells in UI design indicate violations of good design guidelines. By following a design system, developers can avoid common design issues. An automated UI design smell detector helps identify and address UI design problems.

2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: COMPANION PROCEEDINGS (ICSE-COMPANION 2021) (2021)

Article Computer Science, Software Engineering

SEthesaurus: WordNet in Software Engineering

Xiang Chen, Chunyang Chen, Dun Zhang, Zhenchang Xing

Summary: This paper proposes an automatic unsupervised approach to build a thesaurus for software engineering text, utilizing software-specific and general corpora to identify terms, infer morphological forms, and perform graph analysis. Experimental results show high coverage and accuracy of the approach, confirmed through manual verification of abbreviations and synonyms in the thesaurus.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2021)

Proceedings Paper Computer Science, Software Engineering

Don't Do That! Hunting Down Visual Design Smells in Complex UIs against Design Guidelines

Bo Yang, Zhenchang Xing, Xin Xia, Chunyang Chen, Deheng Ye, Shanping Li

Summary: The study revealed that Material Design guidelines extend beyond UI aesthetics, covering seven general design dimensions and four component design aspects. Violating these guidelines leads to visual design smells in UIs. The automated UI design smell detector UIS-Hunter has high detection accuracy and helps developers learn best practices for Material Design.

2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2021) (2021)

Proceedings Paper Computer Science, Software Engineering

GUIGAN: Learning to Generate GUI Designs Using Generative Adversarial Networks

Tianming Zhao, Chunyang Chen, Yuanning Liu, Xiaodong Zhu

Summary: The Graphical User Interface (GUI) is essential in modern software, and a good GUI design is crucial for software success. Automatically generated GUIs can enhance design personalization, and a model called GUIGAN has been developed to generate GUI designs automatically, in a manner similar to natural language generation.

2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2021) (2021)

Proceedings Paper Computer Science, Software Engineering

DeepPayload: Black-box Backdoor Attack on Deep Learning Models through Neural Payload Injection

Yuanchun Li, Liayi Hua, Haoyu Wang, Chunyang Chen, Yunxin Liu

Summary: This paper introduces a highly practical backdoor attack achieved with reverse-engineering techniques over compiled deep learning models, showing its effectiveness and the vulnerability of real-world mobile deep learning apps.

2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2021) (2021)

Proceedings Paper Computer Science, Software Engineering

Automated Query Reformulation for Efficient Search based on Query Logs From Stack Overflow

Kaibo Cao, Chunyang Chen, Sebastian Baltes, Christoph Treude, Xiang Chen

Summary: The research found that the difficulty for developers to efficiently search for the information they need on Stack Overflow mainly stems from the gap between user intentions and text meanings, as well as the semantic gap between queries and post content. To address this issue, an automated software-specific query reformulation approach based on deep learning was proposed, which can generate candidate reformulated queries when given the user's original query. Experimental results demonstrated significant improvements in terms of ExactMatch and GLEU.

2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2021) (2021)

Proceedings Paper Computer Science, Software Engineering

Robustness of on-device Models: Adversarial Attack to Deep Learning Models on Android Apps

Yujin Huang, Han Hu, Chunyang Chen

Summary: The study shows that deep learning models embedded in mobile applications, such as Android apps, may be vulnerable to adversarial attacks. The experiment demonstrates that hackers can successfully attack real-world Android apps by identifying pre-trained models.

2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: SOFTWARE ENGINEERING IN PRACTICE (ICSE-SEIP 2021) (2021)