Article
Computer Science, Theory & Methods
Roni Mateless, Oren Tsur, Robert Moskovitch
Summary: This paper introduces Pkg2Vec, a novel approach to software package authorship attribution based on a hierarchical deep neural network architecture. Attribution at the package level better reflects real-world scenarios, where code is organized in packages and written by teams. Using a hierarchical neural network model and resilient features such as keywords and API calls, Pkg2Vec outperforms other approaches on a large dataset of public packages.
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE
(2021)
Article
Computer Science, Information Systems
Mingi Cho, Hoyong Jin, Dohyeon An, Taekyoung Kwon
Summary: The study aims to assess kernel fuzzers' system call-related code coverage, finding that fuzzers achieving higher code coverage in traditional metrics do not execute more basic blocks related to system calls. It is recommended that kernel fuzzers use both system call-related functions and regular basic blocks in coverage metrics to evaluate fuzzing performance or enhance coverage feedback.
Article
Computer Science, Software Engineering
Leonardo Passos, Rodrigo Queiroz, Mukelabai Mukelabai, Thorsten Berger, Sven Apel, Krzysztof Czarnecki, Jesus Alejandro Padilla
Summary: Feature code is often scattered across a software system, which can be both beneficial and challenging. Research on the Linux kernel shows that the ratio of scattered features remains stable, with scattering commonly addressing performance-maintenance tradeoffs.
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING
(2021)
Article
Computer Science, Software Engineering
Thong Hoang, Julia Lawall, Yuan Tian, Richard J. Oentaryo, David Lo
Summary: The quality of stable versions relies on the initiative of kernel developers to propagate bug fixing patches. This study investigates the use of deep learning for a more accurate solution and proposes PatchNet, which outperforms various state-of-the-art baselines in experiments.
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING
(2021)
Review
Computer Science, Information Systems
Eduardo Freitas, Assis T. de Oliveira Filho, Pedro R. X. do Carmo, Djamel Sadok, Judith Kelner
Summary: The path a packet takes in the Linux kernel has been established for a long time, but the introduction of new paradigms has increased its complexity. Fast Packet Processing Frameworks have emerged to meet the demands of low-delay, high-bandwidth services. However, each technology provides different methods and solutions, leading to different benefits and trade-offs. This work proposes a taxonomy that classifies these solutions into hardware, software, and virtualization categories, and evaluates their applicability in real-world scenarios based on four criteria.
COMPUTER COMMUNICATIONS
(2022)
Article
Computer Science, Hardware & Architecture
Jinmeng Zhou, Tong Zhang, Wenbo Shen, Dongyoon Lee, Changhee Jung, Ahmed Azab, Ruowen Wang, Peng Ning, Kui Ren
Summary: This paper presents PeX, a static permission check error detector for Linux. PeX automatically identifies and reports any missing, inconsistent, and redundant permission checks by utilizing the novel and scalable KIRIN technique. It evaluates the latest stable Linux kernel and successfully identifies 45 new permission check errors, with 17 confirmed by kernel developers.
IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING
(2023)
Article
Computer Science, Software Engineering
Jiahuei Lin, Haoxiang Zhang, Bram Adams, Ahmed E. Hassan
Summary: Vulnerabilities in software systems can result in financial losses, damage to reputation, and loss of trust. In open-source development, software providers rely on Linux distributions to distribute their software, making vulnerability management more challenging. This study investigates vulnerability management practices in Debian and Fedora, analyzing the lifecycle of vulnerabilities, common vulnerabilities across distributions, and the role of upstream projects in fixing vulnerabilities.
EMPIRICAL SOFTWARE ENGINEERING
(2023)
Article
Computer Science, Information Systems
Ahmed Abdelsalam, Pier Luigi Ventre, Carmine Scarpitta, Andrea Mayer, Stefano Salsano, Pablo Camarillo, Francois Clad, Clarence Filsfils
Summary: Segment Routing (SR) allows for loose source routing by including a list of instructions in the packet header. It has been implemented with both MPLS (SR-MPLS) and IPv6 (SRv6) data planes. SRv6 supports advanced services and can be used in various software and hardware forwarding engines.
IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT
(2021)
Article
Computer Science, Artificial Intelligence
Yucong Chen, Xianzhi Tang, Shuaixin Xu, Fangfang Zhu, Qingguo Zhou, Tien-Hsiung Weng
Summary: Safety-critical systems integrating AI applications are often built on Linux, which offers a perfect software ecosystem. However, the non-deterministic execution path of system calls in Linux kernel space poses challenges for test coverage-based verification. This research analyzes the influence of system state on Linux kernel path variability and introduces a data collection system to uniquely identify system call execution paths. The evaluations show that the function execution paths of system calls relevant to file systems increase with system load but eventually stabilize, and vary across different file systems.
CONNECTION SCIENCE
(2023)
Article
Computer Science, Information Systems
Tae-Hee Yoo, Jussi Kivilinna, Choong-Hee Cho
Summary: This study focuses on implementing the ARIA algorithm in the Linux kernel to enhance its practicality and address the lack of ARIA-specific instructions in existing CPUs. The study shows that the accelerated version of ARIA performs up to 10.6 times better than the generic version, and the optimization of the affine transformation reduces the required cycle count by 32.2%.
Article
Computer Science, Hardware & Architecture
Brian Belleville, Wenbo Shen, Stijn Volckaert, Ahmed M. Azab, Michael Franz
Summary: KASLR randomizes the base addresses of the kernel's code and data segments to mitigate control-flow hijacking attacks, but relative addresses remain known to adversaries. KALD is a tool that finds direct disclosure vulnerabilities by statically analyzing the kernel source code, and it can detect previously unreported leaks.
IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING
(2021)
Article
Computer Science, Software Engineering
Guoyun Duan, Yuanzhi Fu, Minjie Cai, Hao Chen, Jianhua Sun
Summary: This paper presents the first large-scale dataset specifically assembled for anomaly detection in the Linux kernel. The dataset contains 18,966 system call sequences labeled with normal and abnormal attributes, covering over 200 kernel versions and 3,600 bug-triggering programs from the past five years. The dataset enables superior generalization ability in trained models and provides benchmark results for anomaly detection in the Linux kernel.
JOURNAL OF SYSTEMS AND SOFTWARE
(2023)
Article
Mathematics, Interdisciplinary Applications
Linxiao Ma, Yuzhu Wang, Yue Wang, Ning Li, Sai-Fu Fung, Lu Zhang, Qian Zheng
Summary: In this study, the research hotspots in sports science were identified using bibliometric methods and the relationship between knowledge networks and scientific performance was explored using social network analysis. It was found that the hotspots covered various disciplines within sports science, with a particular need to accelerate research in sports human science. The study also showed that knowledge networks can be used to predict the scientific performance of knowledge elements.
Article
Chemistry, Multidisciplinary
Mei-Ling Chiang, Wei-Lun Su
Summary: This study focuses on improving inter-node load balancing for multithreaded applications. It proposes a thread-aware selection policy that considers how the threads of each thread group are distributed across nodes, and devises several enhancements to improve selection efficiency. The experimental results show a performance increase of 10.7% compared to the unmodified Linux kernel.
APPLIED SCIENCES-BASEL
(2021)
Article
Computer Science, Theory & Methods
Heyuan Shi, Guyu Wang, Ying Fu, Chao Hu, Houbing Song, Jian Dong, Kun Tang, Kai Liang
Summary: In this study, we propose abaci-finder, a deep-learning-based classification framework specifically designed for Linux kernel crashes. The framework utilizes stack trace preprocessing, a vectorization method called kstack2vec, and an attention-based BiLSTM neural network to accurately classify kernel crashes.
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING
(2022)
Article
Computer Science, Software Engineering
Aline Brito, Marco Tulio Valente, Laerte Xavier, Andre Hora
EMPIRICAL SOFTWARE ENGINEERING
(2020)
Article
Computer Science, Software Engineering
Caroline Lima, Andre Hora
SOFTWARE QUALITY JOURNAL
(2020)
Article
Computer Science, Software Engineering
Andre Hora, Romain Robbes
EMPIRICAL SOFTWARE ENGINEERING
(2020)
Article
Computer Science, Software Engineering
Andre Hora
Summary: Developers often search for code examples on the web, manually created or automatically mined from code repositories. Current solutions for automatic mining of API usage examples still have limitations, such as poor quality and duplication. In this article, a new approach called APISonar is proposed to provide readable and reusable API examples. Evaluation shows that APISonar outperforms popular programming websites in terms of quality and has attracted a significant user base globally within a short period of time.
SOFTWARE-PRACTICE & EXPERIENCE
(2021)
Article
Computer Science, Software Engineering
Aline Brito, Andre Hora, Marco Tulio Valente
Summary: Refactoring is an essential activity in software evolution to improve source code maintainability and quality. The study of refactoring graphs provides quantitative and qualitative investigation into the size, commits, age, composition, ownership, operations, and patterns of refactorings. It can be used to improve code comprehension, detect refactoring patterns, and support software evolution studies.
EMPIRICAL SOFTWARE ENGINEERING
(2021)
Article
Computer Science, Software Engineering
Andre Hora
Summary: This study finds that the Google search engine tends to rank pages with multiple code examples higher. However, single code examples that are ranked higher are not necessarily more readable and reusable. When predicting top-ranked examples, generic factors are more important than code quality factors.
JOURNAL OF SYSTEMS AND SOFTWARE
(2021)
Article
Computer Science, Software Engineering
Mateus Lopes, Andre Hora
Summary: As software systems become more complex and harder to maintain over time, it is important to understand the reasons behind the persistence of complex methods despite the known drawbacks. This paper provides a multi-language empirical study on the evolution of complex methods and developers' perceptions in JavaScript, Python, Java, C++, and C#. The study finds that programming language plays a significant role in code complexity, and developers' perception of complexity varies across languages. Additionally, the authors discuss insights for researchers and practitioners based on their findings.
EMPIRICAL SOFTWARE ENGINEERING
(2022)
Article
Computer Science, Software Engineering
Romulo Nascimento, Eduardo Figueiredo, Andre Hora
Summary: This article reports the results of a survey and mining study on JavaScript developers and projects, revealing several solutions for deprecating JavaScript APIs but no standard solution.
Article
Computer Science, Software Engineering
Gabriel Menezes, Bruno Cafeo, Andre Hora
Summary: This study analyzes the characteristics, maintenance, and usage of framework code samples in modern software systems. Most code samples are small and simple, providing a working environment for clients and relying on automated build tools. Clients commonly fork code samples, but rarely modify them.
JOURNAL OF SYSTEMS AND SOFTWARE
(2022)
Article
Computer Science, Software Engineering
Gabriel P. Oliveira, Ana Flavia C. Moura, Natercia A. Batista, Michele A. Brandao, Andre Hora, Mirella M. Moro
Summary: Assessing collaboration among GitHub developers through social networks, this study models three aspects: social collaboration, collaboration time in a repository, and technical features. The results indicate that the considered metrics are not correlated, providing new insights into collaboration. The information gathered is beneficial for social developer ranking.
SOFTWARE QUALITY JOURNAL
(2023)
Article
Computer Science, Software Engineering
Andre Hora
Summary: Test coverage measures the percentage of code covered by tests. Some code, such as non-runnable, debug-only, defensive, platform-specific, and conditional importing code, tends to be excluded from coverage analysis. Excluding code can decrease test coverage, but following code exclusion recommendations can improve coverage.
EMPIRICAL SOFTWARE ENGINEERING
(2023)
Proceedings Paper
Computer Science, Software Engineering
Romulo Nascimento, Andre Hora, Eduardo Figueiredo
Summary: This paper presents an empirical study on how API deprecation evolves in JavaScript, analyzing 1,918 releases of 50 popular packages. The results show that the majority of deprecated APIs have an increasing trend, and deprecation events usually occur in minor releases.
2022 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION AND REENGINEERING (SANER 2022)
(2022)
Proceedings Paper
Computer Science, Software Engineering
Livia Barbosa, Andre Hora
Summary: This paper presents the first empirical study on testing framework migration, specifically focusing on the migration from unittest to pytest in the Python ecosystem. The study analyzes the methods and reasons behind developers' migration to pytest, finding that Python projects are moving towards pytest but the migration process is not always straightforward. The study also reveals that the migrated test code is smaller than the original code, and developers migrate to pytest due to reasons such as easier syntax, interoperability, maintenance, and fixture flexibility/reuse, although concerns exist about pytest's implicit mechanics.
2022 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION AND REENGINEERING (SANER 2022)
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Victor Veloso, Andre Hora
Summary: This paper presents an empirical study that assesses the quality of test methods using mutation testing at the method level. The study finds no major differences between high-quality and low-quality test methods in terms of size, number of asserts, and modifications. However, high-quality test methods are less affected by critical test smells.
2022 MINING SOFTWARE REPOSITORIES CONFERENCE (MSR 2022)
(2022)
Proceedings Paper
Computer Science, Software Engineering
Gustavo Pereira, Andre Hora
2020 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION (ICSME 2020)
(2020)
Article
Computer Science, Software Engineering
Amel Mammar, Meriem Belguidoum, Saddam Hocine Hiba
Summary: This paper introduces a formal EVENT-B-based approach for modeling and verifying the deployment of component-based applications. By gradually refining an abstract model, a precise specification is built, and mathematical reasoning is used to prove its correctness. The presented approach validates the deployment in a cloud environment using PROB and ensures the construction of a correct system that meets the constraints.
SCIENCE OF COMPUTER PROGRAMMING
(2024)
Article
Computer Science, Software Engineering
Shuqi Liu, Yu Zhou, Longbing Ji, Tingting Han, Taolue Chen
Summary: In this paper, we propose a framework that combines GUI event deduplication with an adaptive semantic matching strategy to enhance the usability of reused tests. Experimental evaluation demonstrates that the framework improves widget mapping performance, significantly reduces event redundancy, and reduces the manual effort of creating tests for similar applications.
SCIENCE OF COMPUTER PROGRAMMING
(2024)
Article
Computer Science, Software Engineering
Xiangyu Mu, Lei Liu, Peng Zhang, Jingyao Li, Hui Li
Summary: The aim of this study is to reduce the size of the test case set required to detect the commutativity problem of the reduce function. By determining the pattern of the function and selecting corresponding test cases, the proposed test case generation strategy can achieve the same accuracy with a smaller test case set. It has been shown to be effective and has a high recall rate.
SCIENCE OF COMPUTER PROGRAMMING
(2024)
Article
Computer Science, Software Engineering
Padmalata Nistala, Asha Rajbhoj, Vinay Kulkarni, Sapphire Noronha, Ankit Joshi
Summary: This paper presents an automated proposal development approach using a combination of model-based and AI-enabled techniques, and discusses the successful deployment and user feedback of the system.
SCIENCE OF COMPUTER PROGRAMMING
(2024)
Article
Computer Science, Software Engineering
Jacco O. G. Krijnen, Manuel M. T. Chakravarty, Gabriele Keller, Wouter Swierstra
Summary: Compiler correctness is a long-standing problem, and it becomes more significant with the rise of smart contracts on blockchains. A translation certification framework can address the trust issue for low-level code on the blockchain, allowing users to have confidence in the compilation process of smart contracts.
SCIENCE OF COMPUTER PROGRAMMING
(2024)
Article
Computer Science, Software Engineering
Phillip James, Faron Moller, Filippos Pantekis
Summary: OnTrack is a tool that supports railway verification workflows using model-driven engineering frameworks. By encapsulating formal methods, it allows railway engineers to interact with verification procedures.
SCIENCE OF COMPUTER PROGRAMMING
(2024)
Article
Computer Science, Software Engineering
Oleg Kiselyov
Summary: Heterogeneous metaprogramming systems leverage higher-level host languages to generate lower-level object language code, enabling faster production of high-performance code with correctness guarantees. This paper presents two systems with OCaml as the host language and C as the object language, discussing their implementation and applications.
SCIENCE OF COMPUTER PROGRAMMING
(2024)
Article
Computer Science, Software Engineering
Conor Reynolds, Rosemary Monahan
Summary: This paper provides a detailed approach to formalize a fragment of the theory of institutions in the Coq proof assistant. The approach is illustrated and evaluated by instantiating the framework with specific institution examples.
SCIENCE OF COMPUTER PROGRAMMING
(2024)
Article
Computer Science, Software Engineering
Herbert Rausch Fernandes, Giovanni Freitas Gomes, Antonio Carlos Pinheiro de Oliveira, Sergio Vale Aguiar Campos
Summary: Alzheimer's disease is a common form of dementia with no effective drug treatment available. In this study, a statistical model checking approach was used to analyze protein and drug interactions and evaluate the effects of different drugs on the components contributing to Alzheimer's disease. The results showed that rapamycin could slow down the biological process causing neuronal death, while LY294002 and NVP-BEZ235 may increase tau phosphorylation. These findings provide important insights for the scientific community and raise awareness about potential side effects of PI3K inhibitor drugs.
SCIENCE OF COMPUTER PROGRAMMING
(2024)
Article
Computer Science, Software Engineering
Erwan Mahe, Christophe Gaston, Pascale Le Gall
Summary: This paper presents an Interaction Language to encode Sequence Diagrams (SD) and associates it with three different formal semantics. This allows for direct formal verification of SD, while preserving traceability of SD concepts and executed actions, and addressing the translation of problematic operators.
SCIENCE OF COMPUTER PROGRAMMING
(2024)
Article
Computer Science, Software Engineering
Joan Giner-Miguelez, Abel Gomez, Jordi Cabot
Summary: Datasets are crucial for training and evaluating machine learning models, but they can also lead to undesirable behaviors like biased predictions. To tackle this issue, the machine learning community suggests adopting consistent guidelines for dataset descriptions. However, these guidelines rely on natural language descriptions, which hinder automated computation and analysis. To overcome this, we present DescribeML, a language engineering tool that provides precise, structured descriptions of machine learning datasets, including their composition, provenance, and social concerns.
SCIENCE OF COMPUTER PROGRAMMING
(2024)
Article
Computer Science, Software Engineering
Andrey Sadovykh, Bilal Said, Dragos Truscan, Hugo Bruneliere
Summary: In this paper, the authors report on their 7 years of practical experience with an iterative Model-based Requirements Engineering (MBRE) approach and language in five large European collaborative projects. They demonstrate through significant data sets that this model-based approach provides interesting benefits in terms of scalability, heterogeneity, adaptability, traceability, automation, consistency and quality, and usefulness or usability. Concrete examples from these projects are provided to illustrate the application of the MBRE approach and language, and the authors discuss the general benefits and limitations of using such an approach, as well as the lessons learned over the years.
SCIENCE OF COMPUTER PROGRAMMING
(2024)
Article
Computer Science, Software Engineering
Alfa Yohannis, Dimitris Kolovos, Antonio Garcia-Dominguez
Summary: Picto Web is a multi-tenant web-based tool that allows exploration of complex models by transforming them into various transient web-based views using rule-based transformations. It uses a lazy view computation approach to efficiently support large models and complex transformations, and includes monitoring and push notification facilities for automatic recomputation of views and updated delivery to clients.
SCIENCE OF COMPUTER PROGRAMMING
(2024)
Article
Computer Science, Software Engineering
Enes Yigitbas, Maximilian Schmidt, Antonio Bucchiarone, Sebastian Gottschalk, Gregor Engels
Summary: UML has become a popular modeling language used in computer science courses, and various interactive learning applications have been developed to improve student engagement and learning outcomes. However, these applications have not successfully created immersive environments for students. Therefore, this study introduces GaMoVR, a VR-based and gamified learning environment, which provides an interactive and fun learning experience for students learning about UML modeling.
SCIENCE OF COMPUTER PROGRAMMING
(2024)
Article
Computer Science, Software Engineering
Yaxin Zhao, Lina Gong, Wenhua Yang, Yu Zhou
Summary: Accessible design aims to enable as many people as possible to access software products and services. This study investigates the interaction between accessibility issues and other factors affecting software performance. By analyzing a large number of accessibility issues, the study reveals the characteristics of these issues and their relationship with software quality attributes.
SCIENCE OF COMPUTER PROGRAMMING
(2024)