4.7 Article

Using autoencoders to compress soil VNIR-SWIR spectra for more robust prediction of soil properties

Journal

GEODERMA
Volume 393

Publisher

ELSEVIER
DOI: 10.1016/j.geoderma.2021.114967

Keywords

VNIR-SWIR spectroscopy; SOC; Spectral chromophores; Explainable artificial intelligence; XAI; Vis-NIR


Over the past two decades, research has focused on near infrared diffuse reflectance spectroscopy as a rapid and cost-effective method for soil analysis, motivated by the importance of the soil ecosystem and the increasing pressures on it from climate change, degradation, and urbanization. Spectral pre-processing techniques and sophisticated machine learning models are used to cope with the complexity of the soil matrix. A novel methodology using stacked autoencoders is proposed to improve the prediction accuracy for soil properties.
In the past two decades, research efforts have focused on using near infrared diffuse reflectance spectroscopy (350-2500 nm) as a rapid and cost-effective method for soil analysis. Given the multi-faceted importance of the soil ecosystem, and considering the increasing pressures exerted upon it by climate change, degradation and urbanization, the advantages of soil spectroscopy are significant. Large soil spectral libraries have been developed to this end throughout the world. The soil matrix is, however, complex, which makes it challenging to determine key soil properties from the spectra. To tackle this challenge, two approaches are generally used: a) spectral pre-processing techniques that transfer the spectra into a new space in which the association between spectrum and soil property is expected to be clearer, and b) more sophisticated machine learning models (e.g. deep learning). In this paper, we propose a novel methodology using stacked autoencoders to transform the initially recorded spectra into a new compressed (i.e. latent) space which can help chemometric models improve their prediction accuracy. This is an unsupervised learning approach which depends only on the input data (i.e. the spectra). Following the significant results reported in the literature for combinations of different spectral pre-processing techniques and for the simultaneous prediction of multiple soil properties, the proposed methodology is extended to support both approaches. We demonstrate this capacity by applying it to the mineral samples of the LUCAS 2009 topsoil database and simultaneously predicting eight properties (the particle size distribution, pH, CEC, organic carbon, calcium carbonate, and total nitrogen) using an artificial neural network.
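To make the idea of compressing spectra into a latent space concrete, the following is a minimal sketch, not the authors' implementation. It trains a single-hidden-layer autoencoder (rather than the stacked architecture described in the paper) with plain NumPy on synthetic curves that stand in for soil spectra. All function names, layer sizes, learning rates, and the synthetic data generator are assumptions made for illustration only.

```python
import numpy as np


def make_synthetic_spectra(n_samples=200, n_bands=50, n_factors=3, seed=0):
    """Hypothetical stand-in for soil spectra: smooth curves built from a few
    Gaussian-shaped absorption features plus a small amount of noise."""
    rng = np.random.default_rng(seed)
    wavelengths = np.linspace(0.0, 1.0, n_bands)
    centers = np.linspace(0.2, 0.8, n_factors)
    basis = np.exp(-((wavelengths[None, :] - centers[:, None]) ** 2) / 0.02)
    weights = rng.uniform(0.0, 1.0, size=(n_samples, n_factors))
    noise = 0.01 * rng.standard_normal((n_samples, n_bands))
    return weights @ basis + noise


def train_autoencoder(X, n_latent=3, n_epochs=800, lr=0.5, seed=0):
    """Train a tanh encoder and linear decoder by full-batch gradient descent
    on the mean squared reconstruction error. Only the spectra X are used,
    so the training is unsupervised."""
    rng = np.random.default_rng(seed)
    mean = X.mean(axis=0)
    Xc = X - mean  # center each band so the network models deviations only
    n_bands = Xc.shape[1]
    W_enc = 0.1 * rng.standard_normal((n_bands, n_latent))
    b_enc = np.zeros(n_latent)
    W_dec = 0.1 * rng.standard_normal((n_latent, n_bands))
    b_dec = np.zeros(n_bands)
    losses = []
    for _ in range(n_epochs):
        # Forward pass: encode to the latent space, then reconstruct.
        Z = np.tanh(Xc @ W_enc + b_enc)
        X_hat = Z @ W_dec + b_dec
        err = X_hat - Xc
        losses.append(float(np.mean(err ** 2)))
        # Backward pass: gradients of the mean squared error.
        g_out = 2.0 * err / err.size
        g_W_dec = Z.T @ g_out
        g_b_dec = g_out.sum(axis=0)
        g_pre = (g_out @ W_dec.T) * (1.0 - Z ** 2)  # derivative of tanh
        g_W_enc = Xc.T @ g_pre
        g_b_enc = g_pre.sum(axis=0)
        # Gradient descent update.
        W_enc -= lr * g_W_enc
        b_enc -= lr * g_b_enc
        W_dec -= lr * g_W_dec
        b_dec -= lr * g_b_dec
    params = {"mean": mean, "W_enc": W_enc, "b_enc": b_enc,
              "W_dec": W_dec, "b_dec": b_dec}
    return params, losses


def encode(X, params):
    """Map spectra to their compressed (latent) representation."""
    return np.tanh((X - params["mean"]) @ params["W_enc"] + params["b_enc"])


if __name__ == "__main__":
    spectra = make_synthetic_spectra()
    params, losses = train_autoencoder(spectra)
    latent = encode(spectra, params)
    print("input shape:", spectra.shape, "latent shape:", latent.shape)
    print("first loss:", losses[0], "last loss:", losses[-1])
```

In a workflow like the one the abstract describes, the latent features returned by `encode` would replace the raw spectra as input to a downstream regression model. A stacked variant would chain several such encoders, each compressing the output of the previous one.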
Compared to standard pre-processing techniques and other transformations such as the principal components space, the proposed methodology using only one spectral source as input decreases the RMSE on average by 8.4% and by 3.5%, respectively. Compared with the current state of the art, in particular a recently proposed multi-input convolutional neural network that outperformed the methodologies it was compared against, our multi-input methodology exhibits an average RMSE decrease of 9.9%. The interpretability of the transformed feature space and of the compressed spectra was also examined, to identify how the compressed information encodes the input data and enables better associations between input and output.
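The average RMSE decreases quoted above can be read as relative improvements over a baseline. The sketch below shows one plausible way to compute such a figure, assuming the per-property relative decreases are averaged with equal weight. The abstract does not state the exact averaging procedure, so this is an assumption, and the function names are hypothetical.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def relative_rmse_decrease(rmse_baseline, rmse_proposed):
    """Percent decrease of the proposed model's RMSE relative to the baseline."""
    return 100.0 * (rmse_baseline - rmse_proposed) / rmse_baseline


def mean_relative_decrease(baseline_by_property, proposed_by_property):
    """Average the per-property percent decreases with equal weight
    (assumed averaging scheme, not confirmed by the abstract)."""
    decreases = [
        relative_rmse_decrease(baseline_by_property[name],
                               proposed_by_property[name])
        for name in baseline_by_property
    ]
    return float(np.mean(decreases))
```

Because each property has its own units and scale, averaging relative decreases rather than raw RMSE values keeps properties with large numeric ranges from dominating the summary.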
