Article

Advanced turbidity prediction for operational water supply planning

Journal

DECISION SUPPORT SYSTEMS
Volume 119, Pages 72-84

Publisher

ELSEVIER
DOI: 10.1016/j.dss.2019.02.009

Keywords

Analytics; Water quality; Turbidity prediction

Funding

  1. Economic and Social Research Council [ES/P000673/1]
  2. Alan Turing Institute under the EPSRC [EP/N510129/1]
  3. ESRC [1992301] Funding Source: UKRI


Turbidity is an optical property of water caused by suspended solids that give the appearance of 'cloudiness'. While turbidity itself does not directly present a hazard to human health, it can indicate poor water quality and mask the presence of parasites such as Cryptosporidium. The World Health Organisation (WHO) therefore recommends that turbidity should not exceed 1 Nephelometric Turbidity Unit (NTU) before chlorination. For a drinking water supplier, turbidity peaks can be highly disruptive, requiring the temporary shutdown of a water treatment works. Such events must be carefully managed to ensure continued supply; to recover the supply deficit, water stores must be depleted or alternative works utilised. Machine learning techniques have been shown to be effective for modelling complex environmental systems and are often used to help shape environmental policy. We contribute to the literature by adopting such techniques for operational purposes, developing a decision support tool that predicts > 1 NTU turbidity events up to seven days in advance, allowing water supply managers to make proactive interventions. We apply a Generalised Linear Model (GLM) and a Random Forest (RF) model to the prediction of > 1 NTU events. AUROC scores above 0.80 at five of six sites suggest that machine learning techniques are suitable for predicting turbidity peaking events. Furthermore, we find that the RF model provides a modest performance boost, owing to its stronger capacity to capture nonlinear interactions in the data.
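The modelling setup described in the abstract — a binary classifier for > 1 NTU events, with a GLM baseline compared against a Random Forest and both evaluated by AUROC — can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the feature construction, nonlinear interaction, and all parameter choices here are assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for site data: hypothetical hydrological features
# and a binary label for a > 1 NTU turbidity event seven days ahead.
n = 2000
X = rng.normal(size=(n, 4))
# Build the label from a linear term plus a nonlinear interaction,
# which the GLM cannot represent but the RF can.
signal = 1.5 * X[:, 0] + np.sin(2 * X[:, 1]) * X[:, 2]
y = (signal + rng.normal(scale=0.5, size=n) > 1.0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# GLM baseline: logistic regression is a GLM with a logit link.
glm = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
auc_glm = roc_auc_score(y_te, glm.predict_proba(X_te)[:, 1])

# Random Forest: captures the nonlinear interaction in the signal.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
auc_rf = roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1])

print(f"GLM AUROC: {auc_glm:.3f}  RF AUROC: {auc_rf:.3f}")
```

On this synthetic task both models score well above chance, and the RF's extra flexibility mirrors the paper's finding of a modest performance gain over the GLM where nonlinear structure is present.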



Recommended

Article Management

The value of text for small business default prediction: A Deep Learning approach

Matthew Stevenson, Christophe Mues, Cristian Bravo

Summary: Compared to consumer lending, mSME credit risk modelling is more challenging owing to limited data availability, and textual loan assessments are standard practice. Deep Learning and NLP techniques, including the BERT model, are used to extract information from these textual assessments and prove surprisingly effective at predicting default. However, combining text with traditional data does not enhance predictive capability, and performance varies with text length. The proposed Deep Learning model is robust to text quality and can partly automate the mSME lending process.

EUROPEAN JOURNAL OF OPERATIONAL RESEARCH (2021)

Article Computer Science, Artificial Intelligence

Super-app behavioral patterns in credit risk models: Financial, statistical and regulatory implications

Luisa Roa, Alejandro Correa-Bahnsen, Gabriel Suarez, Fernando Cortes-Tejada, Maria A. Luque, Cristian Bravo

Summary: This paper investigates the impact of alternative data from an app-based marketplace on credit scoring models, revealing that these new data sources are particularly effective for predicting financial behavior in low-wealth and young individuals. Additionally, the study shows interesting non-linear trends in the variables from the app, which are normally invisible to traditional banks.

EXPERT SYSTEMS WITH APPLICATIONS (2021)

Article Management

Multilayer network analysis for improved credit risk prediction

Maria Oskarsdottir, Cristian Bravo

Summary: This study presents a multilayer network model for credit risk assessment and finds that including multilayer network centrality information in the model yields significant predictive gains. The results suggest that default risk is highest when an individual is connected to many defaulters, but this risk can be mitigated by the size of the individual's neighborhood, showing that default risk and financial stability propagate throughout the network.

OMEGA-INTERNATIONAL JOURNAL OF MANAGEMENT SCIENCE (2021)

Article Geography, Physical

Deep residential representations: Using unsupervised learning to unlock elevation data for geo-demographic prediction

Matthew Stevenson, Christophe Mues, Cristian Bravo

Summary: LiDAR technology provides detailed three-dimensional elevation maps of urban and rural landscapes. This paper proposes a convenient task-agnostic tile elevation embedding method using unsupervised Deep Learning. The potential of the embeddings is tested by predicting deprivation indices, showing improved performance compared to using standard demographic features alone. The paper also demonstrates the coherent tile segments generated by the embedding pipeline using Deep Learning and K-means clustering.

ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING (2022)

Article Public, Environmental & Occupational Health

Subjective machines: Probabilistic risk assessment based on deep learning of soft information

Mario P. Brito, Matthew Stevenson, Cristian Bravo

Summary: This study explores the use of machine learning methods in simulating expert risk assessments, proposes a natural language-based probabilistic risk assessment model, and validates its feasibility.

RISK ANALYSIS (2022)

Article Management

A transformer-based model for default prediction in mid-cap corporate markets

Kamesh Korangi, Christophe Mues, Cristian Bravo

Summary: This study analyzes mid-cap companies using a large dataset of US firms observed over 30 years. The researchers use transformer models to predict the default probability term structure and to determine the most influential data sources for default risk. The results show that the proposed deep learning architecture outperforms traditional models and provides an importance ranking of the data sources via a Shapley approach.

EUROPEAN JOURNAL OF OPERATIONAL RESEARCH (2023)

Article Computer Science, Artificial Intelligence

On the combination of graph data for assessing thin-file borrowers' creditworthiness

Ricardo Munoz-Cancino, Cristian Bravo, Sebastian A. Rios, Manuel Grana

Summary: This paper introduces an information-processing framework that combines feature engineering, graph embeddings, and graph neural networks to improve credit scoring models. The results show that this approach outperforms traditional methods in assessing creditworthiness using social interaction data. Additionally, in the field of corporate lending, considering the relationships between companies and other entities is crucial for evaluating thin-file companies. The study also highlights the significant value of graph data in helping companies with little or no credit history enter the financial system.

EXPERT SYSTEMS WITH APPLICATIONS (2023)

Article Computer Science, Artificial Intelligence

On the dynamics of credit history and social interaction features, and their impact on creditworthiness assessment performance

Ricardo Munoz-Cancino, Cristian Bravo, Sebastian A. Rios, Manuel Grana

Summary: Credit risk management has been using credit scoring models at different stages for over half a century. Social network data has been shown to increase the predictive power of these models, especially when historical data is limited. This study analyzes the dynamics of creditworthiness assessment and finds that credit scoring based on borrowers' history improves performance initially and then stabilizes. The use of social network features adds value to credit scoring for loan applications and throughout the study period for business scoring.

EXPERT SYSTEMS WITH APPLICATIONS (2023)

Proceedings Paper Computer Science, Interdisciplinary Applications

Statistical Network Similarity

Pierre Miasnikof, Alexander Y. Shestopaloff, Cristian Bravo, Yuri Lawryshyn

Summary: Graph isomorphism is an intractable problem, and computing graph similarity metrics is NP-hard. However, assessing (dis)similarity between networks is crucial in various fields. This article proposes a statistical approach to answer questions about network similarity and difference, using distance matrices and probability distributions. The comparison focuses on connectivity and community structure rather than observable graph characteristics. Experimental results with synthetic and real-world graphs validate the effectiveness and accuracy of the proposed technique.

COMPLEX NETWORKS AND THEIR APPLICATIONS XI, COMPLEX NETWORKS 2022, VOL 2 (2023)

Proceedings Paper Computer Science, Artificial Intelligence

Assessment of Creditworthiness Models Privacy-Preserving Training with Synthetic Data

Ricardo Munoz-Cancino, Cristian Bravo, Sebastian A. Rios, Manuel Grana

Summary: Credit scoring models are the primary instrument used by financial institutions to manage credit risk. However, research on behavioral scoring is scarce due to difficulties in data access. This study presents a methodology for evaluating model performance when trained with synthetic data and applied to real-world data. Results show that the quality of synthetic data decreases as the number of attributes increases, and models trained with synthetic data show a slight reduction in performance compared to those trained with real data.

HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, HAIS 2022 (2022)

Article Computer Science, Artificial Intelligence

Improving healthcare access management by predicting patient no-show behaviour

David Barrera Ferro, Sally Brailsford, Cristian Bravo, Honora Smith

DECISION SUPPORT SYSTEMS (2020)
