Article

File-level socio-technical congruence and its relationship with bug proneness in OSS projects

Journal

JOURNAL OF SYSTEMS AND SOFTWARE
Volume 156, Pages 21-40

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.jss.2019.05.030

Keywords

Socio-technical congruence; Coordination breakdown; Software quality; Open source software; Developer network

Funding

  1. National Natural Science Foundation of China [61772014, 61832009, 61802171]

Coordination is important in software development. Socio-Technical Congruence (STC) has been proposed to measure the match between coordination requirements and actual coordination activities. Previous work by Cataldo et al. computed STC in commercial projects and found it to be related to software failures. In this paper, we study the relationship between file-level STC and bug proneness in Open Source Software (OSS) projects. We apply the fundamental STC framework to the OSS data setting and present a method of computing file-level STC based on our available data. We also propose a derivative STC metric called Missing Developer Links (MDL), which measures the amount of coordination breakdown. In our empirical analysis of five OSS projects, we find that MDL is more strongly related to bug proneness than STC. Furthermore, STC and MDL can be computed from different types of file networks and developer networks, and we identify the best-performing file network and developer network through an empirical study. We also evaluate the usefulness of the STC and MDL metrics in bug prediction. This work shows promise for helping detect coordination issues in OSS projects. (C) 2019 Elsevier Inc. All rights reserved.
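
To make the two metrics in the abstract concrete, the following is a minimal sketch, not the paper's actual method. It assumes the common reading of congruence as the fraction of required developer links that are matched by actual coordination, and reads MDL as the count of required links with no actual counterpart. The function names, the input shapes (developer-to-files assignments, file dependency pairs, developer communication pairs), the rule that two developers touching the same file need to coordinate, and the convention of returning 1.0 when nothing is required are all assumptions made for illustration.

```python
from itertools import combinations


def coordination_requirements(assignments, dependencies):
    """Return the developer pairs that should coordinate (hypothetical helper).

    assignments: dict mapping developer -> set of files they changed.
    dependencies: set of frozenset({file_a, file_b}) dependency pairs.
    Assumption: two developers need to coordinate when they changed the
    same file or changed two files that depend on each other.
    """
    required = set()
    for dev_a, dev_b in combinations(sorted(assignments), 2):
        files_a, files_b = assignments[dev_a], assignments[dev_b]
        shared = bool(files_a & files_b)
        linked = any(
            frozenset((fa, fb)) in dependencies
            for fa in files_a
            for fb in files_b
        )
        if shared or linked:
            required.add(frozenset((dev_a, dev_b)))
    return required


def stc_and_mdl(required, actual):
    """Return (congruence ratio, number of missing developer links).

    required: developer pairs that should coordinate.
    actual: developer pairs that actually coordinated.
    Assumption: with no required links, congruence is taken to be 1.0.
    """
    satisfied = required & actual
    stc = len(satisfied) / len(required) if required else 1.0
    mdl = len(required - actual)
    return stc, mdl


# Toy data: three developers, a chain of dependent files, one real conversation.
assignments = {"alice": {"a.c"}, "bob": {"b.c"}, "carol": {"c.c"}}
dependencies = {frozenset(("a.c", "b.c")), frozenset(("b.c", "c.c"))}
actual = {frozenset(("alice", "bob"))}

required = coordination_requirements(assignments, dependencies)
stc, mdl = stc_and_mdl(required, actual)
print(stc, mdl)
```

In this toy setting two developer links are required (alice-bob and bob-carol) and only one exists, so the ratio is one half and one link is missing. A ratio hides how many links are missing in absolute terms, which is one plausible reason a count-based metric such as MDL could behave differently from STC.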

Recommended

Article Computer Science, Software Engineering

To what extent do DNN-based image classification models make unreliable inferences?

Yongqiang Tian, Shiqing Ma, Ming Wen, Yepang Liu, Shing-Chi Cheung, Xiangyu Zhang

Summary: This study proposes a metamorphic testing approach to assess unreliable inferences in deep neural network models, finding that these unreliable inferences significantly degrade the overall accuracy of the models. Recommendations are made for developers to pay more attention to this issue during model evaluations.

EMPIRICAL SOFTWARE ENGINEERING (2021)

Article Computer Science, Artificial Intelligence

Multi-Constraint Adversarial Networks for Unsupervised Image-to-Image Translation

Divya Saxena, Tarun Kulshrestha, Jiannong Cao, Shing-Chi Cheung

Summary: In this paper, a novel multi-constraint adversarial model (MCGAN) is proposed for unsupervised image-to-image translation. The model applies multiple adversarial constraints at the generator's multi-scale outputs to capture large discrepancies in appearance between two domains. Experimental results on public datasets (cat-to-dog, horse-to-zebra, and apple-to-orange) demonstrate that the proposed method significantly improves on the state of the art.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2022)

Article Computer Science, Software Engineering

Runtime Permission Issues in Android Apps: Taxonomy, Practices, and Ways Forward

Ying Wang, Yibo Wang, Sinan Wang, Yepang Liu, Chang Xu, Shing-Chi Cheung, Hai Yu, Zhiliang Zhu

Summary: Android introduces a new permission model for runtime permissions, which presents challenges for app developers. Existing studies on runtime permission issues are still limited, and there is a need for comprehensive understanding and effective detection techniques. This study analyzes the common types of Android runtime permission (ARP) issues in Android apps, their manifestations, pervasiveness, and fixes. The researchers also evaluate existing tools and identify their limitations for detecting ARP issues.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2023)

Article Computer Science, Software Engineering

Historical Spectrum Based Fault Localization

Ming Wen, Junjie Chen, Yongqiang Tian, Rongxin Wu, Dan Hao, Shi Han, Shing-Chi Cheung

Summary: SBFL techniques have been proven effective in fault localization, but are limited by unclear root causes and lack of differentiation between buggy and non-buggy entities. To address these issues, HSFL leverages version history information for fault localization, resulting in significant performance improvement.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2021)

Article Computer Science, Software Engineering

SemMT: A Semantic-Based Testing Approach for Machine Translation Systems

Jialun Cao, Meiziniu Li, Yeting Li, Ming Wen, Shing-Chi Cheung, Haiming Chen

Summary: Machine translation has wide applications in daily life, but incorrect translation can have serious consequences. To address the testing problem of machine translation systems, this article proposes an automatic testing approach based on semantic similarity checking. Experimental comparisons show that the proposed method outperforms existing techniques, and the possibility of further performance improvement is studied. A solution to locate problems is also discussed.

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY (2022)

Article Computer Science, Software Engineering

Plumber: Boosting the Propagation of Vulnerability Fixes in the npm Ecosystem

Ying Wang, Peng Sun, Lin Pei, Yue Yu, Chang Xu, Shing-Chi Cheung, Hai Yu, Zhiliang Zhu

Summary: There are vulnerabilities in the npm ecosystem, and 20% of the packages still have potential vulnerabilities even though the involved vulnerable packages have published fix versions. Previous studies showed that the propagation speed of fix versions is influenced by various factors, but how to design an effective technique to accelerate the propagation of vulnerability fixes remains an open question. Therefore, this paper conducted an empirical study to investigate the characteristics of packages that block the propagation of vulnerability fixes and proposed a technique called Plumber to boost the propagation of vulnerability fixes.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2023)

Article Computer Science, Software Engineering

Neural-FEBI: Accurate function identification in Ethereum Virtual Machine bytecode

Jiahao He, Shuangyin Li, Xinming Wang, Shing-Chi Cheung, Gansen Zhao, Jinji Yang

Summary: Millions of smart contracts have been deployed on the Ethereum platform, making them vulnerable to attacks. Analyzing contract binaries is crucial due to the lack of access to their source code, and it involves identifying function entries and detecting their boundaries. However, identifying functions from stripped contract binaries is challenging due to the absence of internal function call statements and compiler-induced instruction reshuffling.

JOURNAL OF SYSTEMS AND SOFTWARE (2023)

Article Computer Science, Software Engineering

Combatting Front-Running in Smart Contracts: Attack Mining, Benchmark Construction and Vulnerability Detector Evaluation

Wuqi Zhang, Lili Wei, Shing-Chi Cheung, Yepang Liu, Shuqing Li, Lu Liu, Michael R. Lyu

Summary: In this study, an effective algorithm is designed to mine real-world front-running attacks on the blockchain, and an automated and scalable vulnerability localization approach is proposed. A benchmark consisting of 513 real-world attacks with vulnerable code labeled in 235 smart contracts is built, and seven state-of-the-art vulnerability detection techniques are empirically evaluated. The evaluation reveals the inadequacy of existing techniques in detecting front-running vulnerabilities, with a low recall of 6.04%.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING (2023)

Proceedings Paper Computer Science, Software Engineering

Characterizing and Detecting Configuration Compatibility Issues in Android Apps

Huaxun Huang, Ming Wen, Lili Wei, Yepang Liu, Shing-Chi Cheung

Summary: The study found common patterns of Android framework code changes that can induce configuration compatibility issues. CONFDROID successfully extracts rules for detecting configuration compatibility issues, leading to the detection of numerous issues that cannot be found by current baselines.

2021 36TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING ASE 2021 (2021)

Proceedings Paper Computer Science, Software Engineering

Characterizing Transaction-Reverting Statements in Ethereum Smart Contracts

Lu Liu, Lili Wei, Wuqi Zhang, Ming Wen, Yepang Liu, Shing-Chi Cheung

Summary: Smart contracts, programs stored on blockchains for executing transactions, often utilize transaction-reverting statements for authority verifications and validity checks to ensure security. However, current smart contract security analyzers struggle to effectively handle such statements when detecting vulnerabilities. Further research is needed to understand the practical use and impact of transaction-reverting statements in smart contracts for improved quality assurance.

2021 36TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING ASE 2021 (2021)

Proceedings Paper Computer Science, Software Engineering

DArcher: Detecting On-Chain-Off-Chain Synchronization Bugs in Decentralized Applications

Wuqi Zhang, Lili Wei, Shuqing Li, Yepang Liu, Shing-Chi Cheung

Summary: Since the emergence of Ethereum, blockchain-based decentralized applications (DApps) have become increasingly popular and important. In this work, the challenges of synchronizing on-chain and off-chain data in Ethereum-based DApps are investigated. Two types of bugs that could result in inconsistencies between the on-chain and off-chain layers are presented. To help detect such bugs, a state transition model is introduced to guide the testing of DApps and two effective oracles are proposed for bug identification. The testing framework, DArcher, achieves high precision, recall, and accuracy in bug detection and has found and confirmed real bugs in popular DApps.

PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21) (2021)

Proceedings Paper Computer Science, Software Engineering

A Comprehensive Study of Deep Learning Compiler Bugs

Qingchao Shen, Haoyang Ma, Junjie Chen, Yongqiang Tian, Shing-Chi Cheung, Xiang Chen

Summary: DL compilers are being used more and more to optimize code performance, but they can also introduce bugs that may cause unexpected model behavior. Research shows that around 20% of DL compiler bugs are related to types, leading to the development of new mutation operators and valuable guidelines for detection and debugging. This systematic study provides insights into the characteristics of DL compiler bugs and offers practical solutions for improving future work in this area.

PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21) (2021)

Proceedings Paper Computer Science, Information Systems

ReDoSHunter: A Combined Static and Dynamic Approach for Regular Expression DoS Detection

Yeting Li, Zixuan Chen, Jialun Cao, Zhiwu Xu, Qiancheng Peng, Haiming Chen, Liyuan Chen, Shing-Chi Cheung

Summary: ReDoSHunter is a reliable framework for detecting ReDoS-vulnerable regexes, which can accurately pinpoint multiple vulnerabilities and generate attack-triggering string examples. The framework achieves 100% precision and recall on multiple large datasets, outperforming other techniques significantly.

PROCEEDINGS OF THE 30TH USENIX SECURITY SYMPOSIUM (2021)

Proceedings Paper Computer Science, Software Engineering

HERO: On the Chaos When PATH Meets Modules

Ying Wang, Liang Qiao, Chang Xu, Yepang Liu, Shing-Chi Cheung, Na Meng, Hai Yu, Zhiliang Zhu

Summary: The Go programming language (Golang) has been well received due to its library-based development support, but issues with dependency management arise from heterogeneous use of library-referencing modes. Through an empirical study and development of the HERO technique, detection and resolution of multiple dependency management issues have been achieved, improving software quality and stability.

2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2021) (2021)

Proceedings Paper Computer Science, Software Engineering

TRANSREGEX: Multi-modal Regular Expression Synthesis by Generate-and-Repair

Yeting Li, Shuaimin Li, Zhiwu Xu, Jialun Cao, Zixuan Chen, Yun Hu, Haiming Chen, Shing-Chi Cheung

Summary: TRANSREGEX is a tool that automatically constructs regexes from natural language descriptions and examples, achieving higher accuracy than traditional NLP-based methods and state-of-the-art multi-modal techniques. The evaluation results show that TRANSREGEX effectively utilizes natural language and examples.

2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2021) (2021)

Review Computer Science, Software Engineering

A Multi-vocal Literature Review on challenges and critical success factors of phishing education, training and awareness

Orvila Sarker, Asangi Jayatilaka, Sherif Haggag, Chelsea Liu, M. Ali Babar

Summary: This study provides a comprehensive view of the challenges and critical success factors in the design, implementation, and evaluation stages of phishing education, training, and awareness (PETA). The findings highlight the need to address human-centric issues, bridge users' knowledge gaps, and adopt personalized approaches to enhance defense against phishing attacks.

JOURNAL OF SYSTEMS AND SOFTWARE (2024)

Article Computer Science, Software Engineering

Performability evaluation of NoSQL-based storage systems

Carlos Araujo, Meuse Oliveira Jr., Bruno Nogueira, Paulo Maciel, Eduardo Tavares

Summary: This paper proposes a method based on stochastic Petri nets for evaluating the consistency levels of storage systems based on NoSQL DBMS. The method takes into account different consistency levels and redundant nodes, and estimates the system's availability, throughput, and the probability of accessing the newest data. Experimental results demonstrate the practical feasibility of this approach.

JOURNAL OF SYSTEMS AND SOFTWARE (2024)

Review Computer Science, Software Engineering

Monitoring tools for DevOps and microservices: A systematic grey literature review

L. Giamattei, A. Guerriero, R. Pietrantuono, S. Russo, I. Malavolta, T. Islam, M. Dinga, A. Koziolek, S. Singh, M. Armbruster, J. M. Gutierrez-Martinez, S. Caro-Alvaro, D. Rodriguez, S. Weber, J. Henss, E. Fernandez Vogelin, F. Simon Panojo

Summary: This article presents the results of a systematic study on the available monitoring tools for DevOps and microservices. It provides a classification and analysis of these tools, aiming to be a useful reference for researchers and practitioners in this field.

JOURNAL OF SYSTEMS AND SOFTWARE (2024)

Article Computer Science, Software Engineering

Harmonizing DevOps taxonomies - A grounded theory study

Jessica Diaz, Jorge Perez, Isaque Alves, Fabio Kon, Leonardo Leite, Paulo Meirelles, Carla Rocha

Summary: This paper presents empirical research on the structure of DevOps teams in software-producing organizations to better understand the organizational structure and characteristics of teams adopting DevOps. A theory of DevOps taxonomies is built through analysis, and its consistency with other taxonomies is tested.

JOURNAL OF SYSTEMS AND SOFTWARE (2024)

Article Computer Science, Software Engineering

Managing the changing understanding of benefits in software initiatives

Sinan Sigurd Tanilkan, Jo Erskine Hannay

Summary: When deciding to develop new software, it is important to have a clear understanding of the intended benefits. However, our research shows that stakeholders' understanding of benefits often fluctuates during the development process, leading to uncertainty. Therefore, we recommend focusing on helping practitioners embrace changes in their understanding of benefits.

JOURNAL OF SYSTEMS AND SOFTWARE (2024)

Article Computer Science, Software Engineering

Detecting security vulnerabilities with vulnerability nets

Pingyan Wang, Shaoying Liu, Ai Liu, Wen Jiang

Summary: This paper presents an approach that combines static analysis tools and manual audits to effectively detect various types of security vulnerabilities. By using a special Petri net representation, the proposed method is able to assist in the detection of taint-style vulnerabilities.

JOURNAL OF SYSTEMS AND SOFTWARE (2024)

Article Computer Science, Software Engineering

Early analysis of requirements using NLP and Petri-nets

Edgar Sarmiento-Calisaya, Julio Cesar Sampaio do Prado Leite

Summary: This research introduces an automated requirements analysis approach that combines natural language processing, Petri-nets, and visualization techniques to improve the quality of scenario-based specifications, identify defects, and anticipate inconsistencies.

JOURNAL OF SYSTEMS AND SOFTWARE (2024)

Article Computer Science, Software Engineering

Trace matrix optimization for fault localization

Jian Hu

Summary: This paper proposes a two-stage trace matrix optimization method for fault localization, which addresses the challenges of coincidental correctness and data imbalance in the current trace matrix. Through extensive experiments, significant improvements in fault localization effectiveness are demonstrated.

JOURNAL OF SYSTEMS AND SOFTWARE (2024)

Article Computer Science, Software Engineering

Hierarchical features extraction and data reorganization for code search

Fan Zhang, Manman Peng, Yuanyuan Shen, Qiang Wu

Summary: This study proposes a novel method called HFEDR that utilizes the hierarchical features of Transformer models and reorganizes training data to improve code search performance. Experimental results demonstrate the effectiveness and rationality of the proposed approach.

JOURNAL OF SYSTEMS AND SOFTWARE (2024)

Article Computer Science, Software Engineering

EsArCost: Estimating repair costs of software architecture erosion using slice technology

Tong Wang, Bixin Li

Summary: Software architecture erosion has a negative impact on software quality, performance, and evolution cost. This paper proposes an approach called EsArCost to locate the causes of architecture erosion and estimate the repair cost of each erosion problem. Experimental results show that EsArCost can effectively and efficiently estimate repair costs.

JOURNAL OF SYSTEMS AND SOFTWARE (2024)

Article Computer Science, Software Engineering

SYNTONY: Potential-aware fuzzing with particle swarm optimization

Xiajing Wang, Rui Ma, Wei Huo, Zheng Zhang, Jinyuan He, Chaonan Zhang, Donghai Tian

Summary: This paper proposes a new potential-aware fuzzing scheme called SYNTONY that measures seed potential using multiple objectives and prioritizes promising seeds to increase the number of unique crashes and coverage. Experimental results show that SYNTONY outperforms other fuzzing tools and has high compatibility and expansibility.

JOURNAL OF SYSTEMS AND SOFTWARE (2024)

Article Computer Science, Software Engineering

An Empirical Investigation Into the Influence of Software Communities' Cultural and Geographical Dispersion on Productivity

Stefano Lambiase, Gemma Catolino, Fabiano Pecorelli, Damian A. Tamburri, Fabio Palomba, Willem-Jan van den Heuvel, Filomena Ferrucci

Summary: This paper contributes to the existing body of knowledge on factors affecting productivity in software development by studying the cultural and geographical dispersion of a development community. The results show that cultural and geographical dispersion significantly impact productivity, suggesting that managers and practitioners should consider these aspects throughout the software development lifecycle.

JOURNAL OF SYSTEMS AND SOFTWARE (2024)

Article Computer Science, Software Engineering

The effects of required security on software development effort

Elaine Venson, Bradford Clark, Barry Boehm

Summary: The software industry has been under pressure to adopt security practices and reduce software vulnerabilities. This study quantifies the effort required to develop secure software in increasing levels of rigor and scope and provides validated cost multipliers for practitioners to estimate proper resources for adopting security practices.

JOURNAL OF SYSTEMS AND SOFTWARE (2024)

Article Computer Science, Software Engineering

Towards an understanding of intra-defect associations: Implications for defect prediction

Yangyang Zhao, Mingyue Jiang, Yibiao Yang, Yuming Zhou, Hanjie Ma, Zuohua Ding

Summary: Previous studies have ignored the potential associations between modules involved in the same defect, and this comprehensive study explores the implications of intra-defect associations for defect prediction. The majority of defects occur across functions, with implicit dependencies between the modules. By considering intra-defect associations and merging modules, the proposed data processing approach significantly improves defect prediction performance.

JOURNAL OF SYSTEMS AND SOFTWARE (2024)

Article Computer Science, Software Engineering

Learning to empathize with users through design thinking in hybrid mode: Insights from two educational case studies

Meira Levy, Irit Hadar

Summary: This research sheds new light on how students learn and practice hybrid work in educational settings through two educational studies. The findings show the benefits of new educational programs in fostering empathy and innovation among students, while also highlighting the challenges and opportunities in addressing real challenges.

JOURNAL OF SYSTEMS AND SOFTWARE (2024)