Article
Environmental Sciences
Paraskevi Nomikou, Paraskevi N. Polymenakou, Andrea Luca Rizzo, Sven Petersen, Mark Hannington, Stephanos Pantelis Kilias, Dimitris Papanikolaou, Javier Escartin, Konstantinos Karantzalos, Theodoros J. Mertzimekis, Varvara Antoniou, Mel Krokos, Lazaros Grammatikopoulos, Francesco Italiano, Cinzia Giuseppina Caruso, Gianluca Lazzaro, Manfredi Longo, Sergio Scire Scappuzzo, Walter D'Alessandro, Fausto Grassa, Konstantina Bejelou, Danai Lampridou, Anna Katsigera, Anne Dura
Summary: Submarine hydrothermal systems along volcanic ridges and arcs are dynamic and pose risks to the environment and society. Continuous monitoring with multidisciplinary instrumentation is necessary for better risk assessment and early warning.
FRONTIERS IN MARINE SCIENCE
(2022)
Article
Chemistry, Analytical
Emma Farago, Adrian D. C. Chan
Summary: This paper proposes an interpolation-based method for the detection and reconstruction of poor-quality channels in high-density electromyography (HD-EMG) arrays. The proposed method outperforms other rule-based methods in terms of precision and recall, and it can successfully reconstruct the poor-quality channels.
Article
Engineering, Marine
Jingyang Qiao, Wu Liu, Jingquan Liu, Jianping Zhou
Summary: A chained data sampling and transmission system based on the Zynq-7000 SoC and clock synchronization has been designed and built, achieving high-precision submarine data sampling and stable, reliable high-speed data transmission.
JOURNAL OF MARINE SCIENCE AND ENGINEERING
(2021)
Article
Engineering, Electrical & Electronic
Osman Salem, Khalid Alsubhi, Ahmed Mehaoua, Raouf Boutaba
Summary: Non-invasive sensors monitor physiological attributes and transmit them to a centralized processing unit for detecting health changes. This paper proposes a change point detection model using a Markov chain, which effectively distinguishes measurement faults from health emergencies.
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS
(2021)
Article
Engineering, Marine
Jie Chen, Hailin Liu, Bin Lv, Chao Liu, Xiaonan Zhang, Hui Li, Lin Cao, Junhe Wan
Summary: This paper introduces an extensible remote monitoring system for a seafloor observatory network in Laizhou Bay, which achieves long-term, continuous and online monitoring for a marine ranching environment. Through the control model, standardized communication protocol, and dynamic management method, the system can process data from a large number of devices and control them. An improved data quality control method is proposed to reduce the data error rate. Experimental results show that the monitoring system performs well and the proposed algorithms can be applied to other similar systems with adaptive requirements.
JOURNAL OF MARINE SCIENCE AND ENGINEERING
(2022)
Article
Environmental Sciences
Mafalda Marques Carapuco, Tanya Mendes Silveira, Zuzia Stroynowski, Jorge Miguel Miranda
Summary: This article introduces the EMSO-PT initiative, which aims to establish a network of multidisciplinary underwater observatories in the Atlantic, along with related laboratories and data processing support infrastructures. The priority of this initiative is to generate continuous scientific data on marine environmental processes and develop new sensors and platforms. Data will be disseminated through EMSO-ERIC channels, enabling integration and open access.
FRONTIERS IN MARINE SCIENCE
(2022)
Article
Food Science & Technology
Carolin Loerchner, Martin Horn, Felix Berger, Carsten Fauhl-Hassek, Marcus A. Glomb, Susanne Esslinger
Summary: In this study, a workflow based on the evaluation of a quality control sample in non-targeted analysis, used to detect outliers and time-related trends, is proposed for the first time. The novel concept was tested using Fourier transform mid-infrared spectroscopy with rapeseed oil as the quality control sample, achieving the best results with outlier score-based methods. Different data evaluation strategies were compared and the models were challenged by varying conditions to verify their applicability in identifying outliers.
Article
Engineering, Ocean
Bin Lv, Jie Chen, Hai-lin Liu, Chao Liu, Zhao-wen Zhang, Xiao-nan Zhang, Hao Gao, Yu-long Cai
Summary: This paper presents the design of a deep-sea chemical data collector for a seafloor observatory network, outlining its control system, functions, and sensors. A sea trial in Jiaozhou Bay demonstrated the reliability of the system and data quality for all chemical parameters.
MARINE GEORESOURCES & GEOTECHNOLOGY
(2022)
Article
Engineering, Civil
Marcela A. Meira, Emerson S. Freitas, Victor Hugo R. Coelho, Javier Tomasella, Hayley J. Fowler, Geraldo M. Ramos Filho, Abner L. Silva, Cristiano das N. Almeida
Summary: This study presents a new automatic quality control procedure for sub-hourly rainfall data in Brazil, which effectively identifies faulty rain gauges and provides a high-quality dataset. The method plays an important role in disaster assessment.
JOURNAL OF HYDROLOGY
(2022)
Article
Multidisciplinary Sciences
Keisuke Yoshihara, Kei Takahashi
Summary: A simple anomaly detection method for unlabeled time series data is proposed, using log-likelihood ratio estimation and density ratio estimation. The study suggests the importance of incorporating specific information into the model for time series anomaly detection.
Article
Computer Science, Information Systems
Yang Ma, Xujun Zhao, Chaowei Zhang, Jifu Zhang, Xiao Qin
Summary: The study proposes multi-source outlier detection techniques to reliably identify outliers in multiple datasets based on the unique characteristics of multi-source outliers. It attempts to classify multi-source outliers into three types and designs multiple algorithms to improve the efficiency and accuracy of outlier detection.
INFORMATION SCIENCES
(2021)
Article
Physics, Multidisciplinary
Michiel Nijhuis, Iman van Lelyveld
Summary: Outliers are commonly found in data, and various algorithms exist to detect them. The verification of these outliers can determine whether they are data errors or not. However, this verification process is time-consuming and the underlying issues leading to the data error can change over time. Therefore, using reinforcement learning on a statistical outlier detection approach can optimize the detection process by adjusting the coefficients of the ensemble model with every new piece of data.
Article
Energy & Fuels
Gustavo Felipe Martin Nascimento, Frederic Wurtz, Patrick Kuo-Peng, Benoit Delinchant, Nelson Jhoe Batistela
Summary: Buildings are crucial in energy transition, with 67.8% of electricity consumption in France attributed to buildings in 2017. Detecting anomalies in power consumption data is essential for identifying energy-saving opportunities and metering system malfunctions. The study shows that combining a regression method such as random forest with the adjusted boxplot outlier detection method is promising for detecting data quality issues in electricity consumption.
Article
Health Care Sciences & Services
Laura Viviani, Ian R. White, Elizabeth J. Williamson, James Carpenter, Jan van der Meulen, David A. Cromwell
Summary: This study evaluated the performance of the DetectDeviatingCells (DDC) algorithm in detecting data anomalies at the observation and variable level in continuous variables. The DDC algorithm showed promising results in improving error detection processes for observational data, particularly in detecting complex error patterns.
JOURNAL OF CLINICAL EPIDEMIOLOGY
(2023)
Article
Environmental Sciences
Jinhai Yu, Bang An, Huan Xu, Zhongmiao Sun, Yuwei Tian, Qiuyu Wang
Summary: This study established analytical observation equations between gravity anomaly and topography and, after linearizing the equations, obtained the corresponding iterative solving method based on the least squares method. The regularization method and piecewise bilinear interpolation function were introduced into the observation equations to effectively suppress the high-frequency effect of the boundary sea region and the low-frequency effect of the far sea region. Finally, the seafloor topography beneath a sea region in the South China Sea was predicted as an actual application, and the prediction results were compared with ship sounding data, showing a root-mean-square (RMS) error of 127.4 m and a relative error of approximately 3.4%.
Article
Computer Science, Interdisciplinary Applications
Ali Darvishi, Hassan Khosravi, Shazia Sadiq, Barbara Weber
Summary: The literature review examines the use of neurophysiological measurements in higher education, finding that electroencephalography and facial expression recognition are the dominant measurement types used, experiments mainly utilize pre-experimental designs, and the focus is on the impact of attention and emotion on learning outcomes.
INTERNATIONAL JOURNAL OF ARTIFICIAL INTELLIGENCE IN EDUCATION
(2022)
Article
Computer Science, Artificial Intelligence
Shaochen Yu, Tianwa Chen, Lei Han, Gianluca Demartini, Shazia Sadiq
Summary: Data preparation is a labor-intensive step in data analytics, and manual effort from experts is still required. This paper focuses on data quality discovery and introduces DataOps-4G, a platform that allows users to interact with data without coding. A user study evaluates the effectiveness and efficiency of the platform, showing the potential for non-experts to perform data quality discovery tasks.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
(2023)
Article
Computer Science, Information Systems
Wei Wang, Tianwa Chen, Marta Indulska, Shazia Sadiq, Barbara Weber
Summary: In this study, an experiment was conducted to investigate whether rule linking can improve understanding performance. The results show that rule linking outperforms separated modeling in terms of understanding effectiveness, efficiency, perceived mental effort, and visual attention. Further analysis reveals that rule linking decreases the occurrence of rule scanning and screening processes, leading to an increase in visual association and improved task performance.
INFORMATION SYSTEMS
(2022)
Article
Computer Science, Hardware & Architecture
Shazia Sadiq, Amir Aryani, Gianluca Demartini, Wen Hua, Marta Indulska, Andrew Burton-Jones, Hassan Khosravi, Diana Benavides-Prado, Timos Sellis, Ida Someh, Rhema Vaithianathan, Sen Wang, Xiaofang Zhou
Summary: The demand for effective use of information assets is increasing in both public and private sector organizations. However, there are complex socio-technical challenges in balancing regulatory compliance and data privacy, social expectations and ethical use, business process agility and value creation, and scarcity of data science talent. This paper presents a series of case studies to highlight these challenges and introduces Information Resilience as a framework for responsible and agile information use. It aims to develop a manifesto for Information Resilience to guide future research and development in responsible data management.
Article
Education & Educational Research
Ali Darvishi, Hassan Khosravi, Shazia Sadiq, Dragan Gasevic
Summary: This paper presents AI-assisted and analytical approaches to address common concerns in peer assessment systems and increase their trustworthiness.
BRITISH JOURNAL OF EDUCATIONAL TECHNOLOGY
(2022)
Article
Computer Science, Information Systems
Lei Han, Tianwa Chen, Gianluca Demartini, Marta Indulska, Shazia Sadiq
Summary: Understanding data worker behaviors during data preparation is crucial for designing systems that support their exploration of datasets. However, research on data workers' strategies in data preparation activities is lacking. In this study, we investigate the behavior of data workers in discovering data quality issues and explore factors that affect their behaviors and performance. Our experiment using eye-tracking technology reveals data workers' strategies, the role of coding proficiency, and the importance of external resource search. We also propose a systematic approach to improve data curation processes through collective intelligence.
ACM TRANSACTIONS ON INFORMATION SYSTEMS
(2023)
Article
Computer Science, Interdisciplinary Applications
Iris Beerepoot, Claudio Di Ciccio, Hajo A. Reijers, Stefanie Rinderle-Ma, Wasana Bandara, Andrea Burattin, Diego Calvanese, Tianwa Chen, Izack Cohen, Benoit Depaire, Gemma Di Federico, Marlon Dumas, Christopher van Dun, Tobias Fehrer, Dominik A. Fischer, Avigdor Gal, Marta Indulska, Vatche Isahagian, Christopher Klinkmueller, Wolfgang Kratsch, Henrik Leopold, Amy Van Looy, Hugo Lopez, Sanja Lukumbuzya, Jan Mendling, Lara Meyers, Linda Moder, Marco Montali, Vinod Muthusamy, Manfred Reichert, Yara Rizk, Michael Rosemann, Maximilian Roeglinger, Shazia Sadiq, Ronny Seiger, Tijs Slaats, Mantas Simkus, Ida Asadi Someh, Barbara Weber, Ingo Weber, Mathias Weske, Francesca Zerbato
Summary: This paper provides an overview of the major research problems in the field of Business Process Management. These challenges have been identified through an open call to the community, discussed and refined in a workshop, and described in detail in this paper with motivations for further investigation. This overview aims to inspire both novice and advanced scholars interested in innovative ideas for analyzing, designing, and managing work processes using information technology.
COMPUTERS IN INDUSTRY
(2023)
Article
Computer Science, Information Systems
Jiechen Xu, Lei Han, Shazia Sadiq, Gianluca Demartini
Summary: Collecting relevance judgments from human assessors is crucial to evaluate the effectiveness of Information Retrieval (IR) systems. Crowdsourcing has been successfully used to scale up the collection of manual relevance judgments, and previous studies have explored the impact of different judgment task design elements on judgment quality and efficiency. This research investigates the positive and negative effects of providing crowd assessors with additional metadata beyond the topic and document to be judged. It examines the impact of human and machine metadata on judgment quality, cost, and the influence of metadata quality on the collected judgments.
INFORMATION PROCESSING & MANAGEMENT
(2023)
Article
Computer Science, Interdisciplinary Applications
Ali Darvishi, Hassan Khosravi, Afshin Rahimi, Shazia Sadiq, Dragan Gasevic
Summary: Engaging students in creating learning resources has pedagogical benefits, but a selection process is needed to separate high-quality from low-quality student-generated content (SGC). Peer-review is commonly used, but it introduces the challenge of achieving consensus among multiple reviewers. In this study, 18 inference models were investigated for inferring the quality of SGC, with the findings suggesting the need for advanced probabilistic and text analysis methods, as well as instructor oversight and training of students for reliable reviews.
IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES
(2023)
Proceedings Paper
Computer Science, Information Systems
Gianluca Demartini, Jie Yang, Shazia Sadiq
Summary: Data quality has recently received attention due to the proliferation of data analytics and machine learning applications, whose success relies on both the quantity and quality of data. Data curation, which includes activities like annotation, cleaning, and integration, is crucial in ensuring the quality of analytics results. Mishandling data challenges can have negative effects, particularly in critical domains like healthcare and finance.
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Hui Zhou, Lei Han, Gianluca Demartini, Marta Indulska, Shazia Sadiq
Summary: Existing approaches for evaluating data quality are not applicable to new, unfamiliar and repurposed datasets, where users need to evaluate the quality of such data despite the lack of involvement in the data collection process. This paper investigates the role of metadata in evaluating the quality of repurposed datasets, gathering user behavior data through a lab experiment to explore when, how and why users use metadata in such tasks. The results highlight the critical role of metadata in evaluating repurposed data and provide insights into metadata usage patterns.
CONCEPTUAL MODELING (ER 2022)
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Solmaz Abdi, Hassan Khosravi, Shazia Sadiq, Ali Darvishi
Summary: In recent years, there has been an increasing trend in using student-centred approaches within educational systems, engaging students in various higher-order learning activities. This paper proposes an interpretable learner model called MA-Elo, which captures a student's knowledge state based on their engagement with multiple types of learning activities. Results show that MA-Elo outperforms baseline and some state-of-the-art learner models in predictive performance.
ARTIFICIAL INTELLIGENCE IN EDUCATION (AIED 2021), PT II
(2021)
Proceedings Paper
Computer Science, Interdisciplinary Applications
Hassan Khosravi, Gianluca Demartini, Shazia Sadiq, Dragan Gasevic
Summary: Learnersourcing is an effective learner-centered approach for harnessing students' creativity and evaluation power in education. This paper presents lessons learned from the development and deployment of a learnersourcing system, highlighting best practices for assessing student contributions, incentivizing high-quality work, and providing actionable insights for instructors to guide student learning. These findings contribute to the growing literature on effective learnersourcing systems and technological educational solutions for learner-centered learning at scale.
LAK21 CONFERENCE PROCEEDINGS: THE ELEVENTH INTERNATIONAL CONFERENCE ON LEARNING ANALYTICS & KNOWLEDGE
(2021)
Article
Education & Educational Research
Hassan Khosravi, Shiva Shabaninejad, Aneesha Bakharia, Shazia Sadiq, Marta Indulska, Dragan Gasevic
Summary: This paper introduces a human-in-the-loop AI approach to assist educators in conducting more comprehensive analysis of student data, aiming to identify subpopulations with deviations in performance or learning process and to take appropriate intervention measures for them.
JOURNAL OF LEARNING ANALYTICS
(2021)
Article
Computer Science, Interdisciplinary Applications
Solmaz Abdi, Hassan Khosravi, Shazia Sadiq, Gianluca Demartini
Summary: Learnersourcing is being considered as an alternative method for evaluating the quality of learning resources. Research shows that students' ratings strongly correlate with those of experts, and a consensus approach based on matrix factorization can improve the accuracy of aggregating learnersourced decisions. By incorporating information on student performance and domain experts' ratings, the accuracy of results can be further enhanced.
IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES
(2021)