Article

Automatic generation of valid and invalid test data for string validation routines using web searches and regular expressions

Journal

SCIENCE OF COMPUTER PROGRAMMING
Volume 97, Pages 405-425

Publisher

ELSEVIER SCIENCE BV
DOI: 10.1016/j.scico.2014.04.008

Keywords

Test data generation; Web searches; Regular expressions

Funding

  1. EPSRC project RE-COST (REducing the Cost of Oracles for Software Testing) [EP/I010386/1]
  2. EPSRC [EP/I010386/1] Funding Source: UKRI
  3. Engineering and Physical Sciences Research Council [EP/I010386/1] Funding Source: researchfish

Classic approaches to automatic input data generation are usually driven by the goal of obtaining program coverage and the need to solve or find solutions to path constraints to achieve this. As inputs are generated with respect to the structure of the code, they can be ineffective, difficult for humans to read, and unsuitable for testing missing implementation. Furthermore, these approaches have known limitations when handling constraints that involve operations with string data types. This paper presents a novel approach for generating string test data for string validation routines, by harnessing the Internet. The technique uses program identifiers to construct web search queries for regular expressions that validate the format of a string type (such as an email address). It then performs further web searches for strings that match the regular expressions, producing examples of test cases that are both valid and realistic. Following this, our technique mutates the regular expressions to drive the search for invalid strings, and the production of test inputs that should be rejected by the validation routine. The paper presents the results of an empirical study evaluating our approach. The study was conducted on 24 string input validation routines collected from 10 open source projects. While dynamic symbolic execution and search-based testing approaches were only able to generate a very low number of values successfully, our approach generated values with an accuracy of 34% on average for the case of valid strings, and 99% on average for the case of invalid strings. Furthermore, whereas dynamic symbolic execution and search-based testing approaches were only capable of detecting faults in 8 routines, our approach detected faults in 17 out of the 19 validation routines known to contain implementation errors. (C) 2014 Elsevier B.V. All rights reserved.
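
The workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The web search steps are replaced by a hard-coded candidate list, the email pattern is a deliberately simplified assumption, and `mutate_regex`, `classify`, and `buggy_is_email` are hypothetical names introduced only for this example. The two mutation operators shown (dropping a required literal and shifting a repetition range) are likewise assumptions, not the operators used in the paper.

```python
import re

# Simplified email pattern (an assumption for illustration; in the paper the
# regular expressions are obtained through web searches).
EMAIL_REGEX = r"[a-z0-9.]+@[a-z0-9]+\.[a-z]{2,4}"


def mutate_regex(pattern):
    """Return mutated variants of the pattern (hypothetical operators)."""
    mutants = []
    # Operator 1: drop a required literal (the '@' separator).
    if "@" in pattern:
        mutants.append(pattern.replace("@", "", 1))
    # Operator 2: shift a repetition range outside its original bounds.
    if "{2,4}" in pattern:
        mutants.append(pattern.replace("{2,4}", "{5,8}", 1))
    return mutants


def classify(candidates, pattern):
    """Split candidates into valid strings (match the original pattern) and
    invalid strings (match some mutant but not the original)."""
    mutants = mutate_regex(pattern)
    valid = [s for s in candidates if re.fullmatch(pattern, s)]
    invalid = [
        s for s in candidates
        if not re.fullmatch(pattern, s)
        and any(re.fullmatch(m, s) for m in mutants)
    ]
    return valid, invalid


def buggy_is_email(s):
    """Hypothetical faulty validation routine: only checks for an '@'."""
    return "@" in s


# Stand-in for strings returned by a web search.
candidates = [
    "alice@example.com",
    "aliceexample.com",
    "bob@example.invalid",
    "hello world",
]

valid, invalid = classify(candidates, EMAIL_REGEX)
# Invalid strings that the routine wrongly accepts point to a fault.
faults = [s for s in invalid if buggy_is_email(s)]
print(valid, invalid, faults)
```

In this sketch, a string counts as invalid only relative to the simplified pattern. Strings that match neither the original pattern nor any mutant (such as `"hello world"`) are discarded, which mirrors the idea of using mutated expressions to steer the search toward near-miss inputs rather than arbitrary text.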

Recommended

Article Computer Science, Interdisciplinary Applications

Identifying safe intersection design through unsupervised feature extraction from satellite imagery

Jasper S. Wijnands, Haifeng Zhao, Kerry A. Nice, Jason Thompson, Katherine Scully, Jingqiu Guo, Mark Stevenson

Summary: This study systematically analyzed the design of all intersections in Australia and linked it to driving behaviors, identifying some relationships between intersection design and driving behavior and providing suggestions for safer driving.

COMPUTER-AIDED CIVIL AND INFRASTRUCTURE ENGINEERING (2021)

Article Public, Environmental & Occupational Health

Using machine learning to examine associations between the built environment and physical function: A feasibility study

Jerome N. Rachele, Jingcheng Wang, Jasper S. Wijnands, Haifeng Zhao, Rebecca Bentley, Mark Stevenson

Summary: This study investigates using Generative Adversarial Networks (GANs) to measure neighbourhood design characteristics using street view and aerial imagery and explores the differences in urban greenery and dwelling structure between neighbourhoods with high and low physical function. The study highlights the importance of unique, diverse, and abundant imagery for successful applications of deep learning methods in this field.

HEALTH & PLACE (2021)

Article Environmental Sciences

Particulate Matter and Premature Mortality: A Bayesian Meta-Analysis

Nilakshi T. Waidyatillake, Patricia T. Campbell, Don Vicendese, Shyamali C. Dharmage, Ariadna Curto, Mark Stevenson

Summary: This study systematically reviewed the association between ambient particulate matter (PM) and premature mortality, and conducted a Bayesian hierarchical meta-analysis to assess this association. The results indicated a significant association between PM2.5 and premature mortality, while the results for PM10 were found to be unstable.

INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH (2021)

Article Geriatrics & Gerontology

The Impact of Cognition and Gender on Speeding Behaviour in Older Drivers with and without Suspected Mild Cognitive Impairment

Ying Ru Feng, Lynn Meuleners, Mark Stevenson, Jane Heyworth, Kevin Murray, Michelle Fraser, Sean Maher

Summary: The study found that older male drivers with suspected mild cognitive impairment had a significantly higher rate of speeding events, while there was no significant association for older female drivers. Most speeding events occurred in 60 km/h and 70 km/h speed zones.

CLINICAL INTERVENTIONS IN AGING (2021)

Article Ergonomics

The effect of telematic based feedback and financial incentives on driving behaviour: A randomised trial

Mark Stevenson, Anthony Harris, Jasper S. Wijnands, Duncan Mortimer

Summary: The study found that feedback alone may not be sufficient to motivate behavior change, but combining feedback with financial incentives can lead to significant reductions in risky driving behaviors.

ACCIDENT ANALYSIS AND PREVENTION (2021)

Article Public, Environmental & Occupational Health

Active transport research priorities for Australia

Ben Beck, Amelia Thorpe, Anna Timperio, Billie Giles-Corti, Carmel William, Evelyne de Leeuw, Hayley Christian, Kirstan Corben, Mark Stevenson, Melissa Backhouse, Rebecca Ivers, Rema Hayek, Rob Raven, Sam Bolton, Shanthi Ameratunga, Trevor Shilton, Belen Zapata-Diomedi

Summary: This study aims to develop a research priority agenda for active transport in Australia. Through a priority setting exercise, 50 research priority questions were identified, including supporting policy changes, overcoming community resistance, and improving transportation infrastructure. These research priorities will contribute to the advancement of active transport in Australia.

JOURNAL OF TRANSPORT & HEALTH (2022)

Article Public, Environmental & Occupational Health

Modelling SARS-CoV-2 disease progression in Australia and New Zealand: an account of an agent-based approach to support public health decision-making

Jason Thompson, Rod McClure, Tony Blakely, Nick Wilson, Michael G. Baker, Jasper S. Wijnands, Thiago Herick De Sa, Kerry Nice, Camilo Cruz, Mark Stevenson

Summary: The study developed a public health decision support model for mitigating the spread of SARS-CoV-2 infections in Australia and New Zealand. Results indicated that sustained public adherence to social restrictions could eliminate community transmission, but a second wave of infections may occur if adherence decreases.

AUSTRALIAN AND NEW ZEALAND JOURNAL OF PUBLIC HEALTH (2022)

Article Environmental Sciences

The impact of the COVID-19 pandemic on air pollution: A global assessment using machine learning techniques

Jasper S. Wijnands, Kerry A. Nice, Sachith Seneviratne, Jason Thompson, Mark Stevenson

Summary: In response to the COVID-19 pandemic, countries implemented public health ordinances that resulted in restricted mobility and changed air quality. This study aimed to quantify the impact of carbon-based transport and industrial activity on air quality. Through city-level modeling and pollutant-specific models, the study found reductions in NO2 and PM2.5 pollution, especially in China, Europe, and India. The study also observed a subsequent reduction in O3 levels below what was expected based on meteorological conditions during summer months. These findings are important for developing effective strategies to improve health outcomes.

ATMOSPHERIC POLLUTION RESEARCH (2022)

Editorial Material Public, Environmental & Occupational Health

Creating healthy and sustainable cities: what gets measured, gets done

Billie Giles-Corti, Anne Vernez Moudon, Melanie Lowe, Deepti Adlakha, Ester Cerin, Geoff Boeing, Carl Higgs, Jonathan Arundel, Shiqin Liu, Erica Hinckson, Deborah Salvo, Marc A. Adams, Hannah Badland, Alex A. Florindo, Klaus Gebel, Ruth F. Hunter, Josef Mitas, Adewale L. Oyeyemi, Anna Puig-Ribera, Ana Queralt, Maria Paula Santos, Jasper Schipperijn, Mark Stevenson, Delfien Van Dyck, Guillem Vich, James F. Sallis

LANCET GLOBAL HEALTH (2022)

Article Environmental Studies

Developing urban biking typologies: Quantifying the complex interactions of bicycle ridership, bicycle network and built environment characteristics

Ben Beck, Meghan Winters, Trisalyn Nelson, Chris Pettit, Simone Z. Leao, Meead Saberi, Jason Thompson, Sachith Seneviratne, Kerry Nice, Mark Stevenson

Summary: This study developed a novel urban biking typology using unsupervised machine learning methods and analyzed biking patterns in the Greater Melbourne region, Australia. The findings revealed 5 clusters and highlighted areas with unique characteristics.

ENVIRONMENT AND PLANNING B-URBAN ANALYTICS AND CITY SCIENCE (2023)

Article Environmental Sciences

Extreme environmental temperatures and motorcycle crashes: a time-series analysis

Mohammad Javad Zare Sakhvidi, Jun Yang, Danial Mohammadi, Hussein FallahZadeh, Amirhooshang Mehrparvar, Mark Stevenson, Xavier Basagana, Antonio Gasparrini, Payam Dadvand

Summary: Extreme temperatures can affect the risk of traffic crashes, particularly motorcycle crashes. Exposure to extremely cold and hot temperatures increases the risk of seeking medical attention for motorcycle crashes, especially within 0 to 3 days after exposure. The study estimates that approximately 11.01% of motorcycle crash medical attendances are attributable to non-optimal temperatures.

ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH (2022)

Article Veterinary Sciences

Clinical investigation and management of Brucella suis seropositive dogs: A longitudinal case series

Catherine C. Kneipp, Ania T. Deutscher, Ronald Coilparampil, Anne Marie Rose, Jennifer Robson, Richard Malik, Mark A. Stevenson, Anke K. Wiethoelter, Siobhan M. Mor

Summary: This study aimed to investigate the clinical characteristics, serology, microbiology, and clinical response to treatment in B. suis-seropositive dogs. The results showed that most dogs with B. suis infections had subclinical infections, and serology was poorly associated with clinical disease. Antibiotic treatment is recommended for clinical management.

JOURNAL OF VETERINARY INTERNAL MEDICINE (2023)

Article Transportation

Assessing injury severity of secondary incidents using support vector machines

Jing Li, Jingqiu Guo, Jasper S. Wijnands, Rongjie Yu, Chengcheng Xu, Mark Stevenson

Summary: Compared to normal incidents, secondary incidents are more likely to result in severe injuries and fatalities. Limited efforts have been made to unveil the factors affecting the severity of secondary incidents. This study collected incident data from Interstate-5 in California over a five-year period and used Random Forest-based and Support Vector Machine models to investigate the contributing factors. The results showed that occupancy, duration, frequency of lane changes, and number of lanes contribute to the injury severity of secondary incidents.

JOURNAL OF TRANSPORTATION SAFETY & SECURITY (2022)

Article Environmental Sciences

Evaluation of Urban Design Qualities across Five Urban Typologies in Hanoi

Thanh Phuong Ho, Mark Stevenson, Jason Thompson, Tuan Quoc Nguyen

Summary: The study found variations in microscale urban design qualities across different urban typologies, with old and high-density urban areas showing higher design qualities. Compared to Western cities, Hanoi's urban design characteristics, particularly in terms of imageability and complexity, are substantially different.

URBAN SCIENCE (2021)

Article Ergonomics

Driving exposure, patterns and safety critical events for older drivers with and without mild cognitive impairment: Findings from a naturalistic driving study

Ying Ru Feng, Lynn Meuleners, Mark Stevenson, Jane Heyworth, Kevin Murray, Michelle Fraser, Sean Maher

Summary: The study aimed to compare driving exposure, patterns, and safety critical events between drivers with Mild Cognitive Impairment (MCI) and a comparison group without cognitive impairment. The results showed that there were no significant differences in the driving exposure, patterns, or safety critical events between the two groups, with only binocular contrast sensitivity being associated with the rate of safety critical events.

ACCIDENT ANALYSIS AND PREVENTION (2021)

Article Computer Science, Software Engineering

A formal approach for the correct deployment of cloud applications

Amel Mammar, Meriem Belguidoum, Saddam Hocine Hiba

Summary: This paper introduces a formal EVENT-B-based approach for modeling and verifying the deployment of component-based applications. By gradually refining an abstract model, a precise specification is built, and mathematical reasoning is used to prove its correctness. The presented approach validates the deployment in a cloud environment using PROB and ensures the construction of a correct system that meets the constraints.

SCIENCE OF COMPUTER PROGRAMMING (2024)

Article Computer Science, Software Engineering

Enhancing test reuse with GUI events deduplication and adaptive semantic matching

Shuqi Liu, Yu Zhou, Longbing Ji, Tingting Han, Taolue Chen

Summary: In this paper, we propose a framework that combines GUI events deduplication with an adaptive semantic matching strategy to enhance the usability of reused tests. Experimental evaluation demonstrates that the framework improves widget mapping performance, significantly reduces event redundancy, and reduces the manual effort of creating tests for similar applications.

SCIENCE OF COMPUTER PROGRAMMING (2024)

Article Computer Science, Software Engineering

A method of test case set generation in the commutativity test of reduce functions

Xiangyu Mu, Lei Liu, Peng Zhang, Jingyao Li, Hui Li

Summary: The aim of this study is to reduce the size of the test case set required to detect the commutativity problem of the reduce function. By determining the pattern of the function and selecting corresponding test cases, the proposed test case generation strategy can achieve the same accuracy with a smaller test case set. It has been shown to be effective and has a high recall rate.

SCIENCE OF COMPUTER PROGRAMMING (2024)

Article Computer Science, Software Engineering

An industrial experience report on model-based, AI-enabled proposal development for an RFP/RFI

Padmalata Nistala, Asha Rajbhoj, Vinay Kulkarni, Sapphire Noronha, Ankit Joshi

Summary: This paper presents an automated proposal development approach using a combination of model-based and AI-enabled techniques, and discusses the successful deployment and user feedback of the system.

SCIENCE OF COMPUTER PROGRAMMING (2024)

Article Computer Science, Software Engineering

Translation certification for smart contracts

Jacco O. G. Krijnen, Manuel M. T. Chakravarty, Gabriele Keller, Wouter Swierstra

Summary: Compiler correctness is a long-standing problem, and it becomes more significant with the rise of smart contracts on blockchains. A translation certification framework can address the trust issue for low-level code on the blockchain, allowing users to have confidence in the compilation process of smart contracts.

SCIENCE OF COMPUTER PROGRAMMING (2024)

Article Computer Science, Software Engineering

OnTrack: Reflecting on domain specific formal methods for railway designs

Phillip James, Faron Moller, Filippos Pantekis

Summary: OnTrack is a tool that supports railway verification workflows using model driven engineering frameworks, allowing railway engineers to interact with verification procedures through encapsulating formal methods.

SCIENCE OF COMPUTER PROGRAMMING (2024)

Article Computer Science, Software Engineering

Generating C: Heterogeneous metaprogramming system description

Oleg Kiselyov

Summary: Heterogeneous metaprogramming systems leverage higher-level host languages to generate lower-level object language code, enabling faster production of high-performance code with correctness guarantees. This paper presents two systems with OCaml as the host language and C as the object language, discussing their implementation and applications.

SCIENCE OF COMPUTER PROGRAMMING (2024)

Article Computer Science, Software Engineering

Reasoning about logical systems in the Coq proof assistant

Conor Reynolds, Rosemary Monahan

Summary: This paper provides a detailed approach to formalize a fragment of the theory of institutions in the Coq proof assistant. The approach is illustrated and evaluated by instantiating the framework with specific institution examples.

SCIENCE OF COMPUTER PROGRAMMING (2024)

Article Computer Science, Software Engineering

Stochastic formal model of PI3K/mTOR pathway in Alzheimer's disease for drug repurposing: An evaluation of rapamycin, LY294002, and NVP-BEZ235

Herbert Rausch Fernandes, Giovanni Freitas Gomes, Antonio Carlos Pinheiro de Oliveira, Sergio Vale Aguiar Campos

Summary: Alzheimer's disease is a common form of dementia with no effective drug treatment available. In this study, a statistical model checking approach was used to analyze protein and drug interactions and evaluate the effects of different drugs on the components contributing to Alzheimer's disease. The results showed that rapamycin could slow down the biological process causing neuronal death, while LY294002 and NVP-BEZ235 may increase tau phosphorylation. These findings provide important insights for the scientific community and raise awareness about potential side effects of PI3K inhibitor drugs.

SCIENCE OF COMPUTER PROGRAMMING (2024)

Article Computer Science, Software Engineering

Denotational and operational semantics for interaction languages: Application to trace analysis

Erwan Mahe, Christophe Gaston, Pascale Le Gall

Summary: This paper presents an Interaction Language to encode Sequence Diagrams (SD) and associates it with three different formal semantics. This allows for direct formal verification of SD, while preserving traceability of SD concepts and executed actions, and addressing the translation of problematic operators.

SCIENCE OF COMPUTER PROGRAMMING (2024)

Article Computer Science, Software Engineering

DescribeML: A dataset description tool for machine learning

Joan Giner-Miguelez, Abel Gomez, Jordi Cabot

Summary: Datasets are crucial for training and evaluating machine learning models, but they can also lead to undesirable behaviors like biased predictions. To tackle this issue, the machine learning community suggests adopting consistent guidelines for dataset descriptions. However, these guidelines rely on natural language descriptions, which hinder automated computation and analysis. To overcome this, we present DescribeML, a language engineering tool that provides precise, structured descriptions of machine learning datasets, including their composition, provenance, and social concerns.

SCIENCE OF COMPUTER PROGRAMMING (2024)

Article Computer Science, Software Engineering

An iterative approach for model-based requirements engineering in large collaborative projects: A detailed experience report

Andrey Sadovykh, Bilal Said, Dragos Truscan, Hugo Bruneliere

Summary: In this paper, the authors report on their 7 years of practical experience with an iterative Model-based Requirements Engineering (MBRE) approach and language in five large European collaborative projects. They demonstrate through significant data sets that this model-based approach provides interesting benefits in terms of scalability, heterogeneity, adaptability, traceability, automation, consistency and quality, and usefulness or usability. Concrete examples from these projects are provided to illustrate the application of the MBRE approach and language, and the authors discuss the general benefits and limitations of using such an approach, as well as the lessons learned over the years.

SCIENCE OF COMPUTER PROGRAMMING (2024)

Article Computer Science, Software Engineering

Exploring complex models with picto web

Alfa Yohannis, Dimitris Kolovos, Antonio Garcia-Dominguez

Summary: Picto Web is a multi-tenant web-based tool that allows exploration of complex models by transforming them into various transient web-based views using rule-based transformations. It uses a lazy view computation approach to efficiently support large models and complex transformations, and includes monitoring and push notification facilities for automatic recomputation of views and updated delivery to clients.

SCIENCE OF COMPUTER PROGRAMMING (2024)

Article Computer Science, Software Engineering

GaMoVR: Gamification-based UML learning environment in virtual reality

Enes Yigitbas, Maximilian Schmidt, Antonio Bucchiarone, Sebastian Gottschalk, Gregor Engels

Summary: UML has become a popular modeling language used in computer science courses, and various interactive learning applications have been developed to improve student engagement and learning outcomes. However, these applications have not successfully created immersive environments for students. Therefore, this study introduces GaMoVR, a VR-based and gamified learning environment, which provides an interactive and fun learning experience for students learning about UML modeling.

SCIENCE OF COMPUTER PROGRAMMING (2024)

Article Computer Science, Software Engineering

How accessibility affects other quality attributes of software? A case study of GitHub

Yaxin Zhao, Lina Gong, Wenhua Yang, Yu Zhou

Summary: Accessible design aims to enable as many people as possible to access software products and services. This study investigates the interaction between accessibility issues and other factors affecting software performance. By analyzing a large number of accessibility issues, the study reveals the characteristics of these issues and their relationship with software quality attributes.

SCIENCE OF COMPUTER PROGRAMMING (2024)