4.7 Article

Estimating the total nitrogen content of Aquilaria sinensis leaves based on a hybrid feature selection algorithm and image data from a modified digital camera

Journal

BIOSYSTEMS ENGINEERING
Volume 213, Pages 89-104

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.biosystemseng.2021.11.021

Keywords

Total nitrogen content; Modified digital camera; Feature selection; Regression algorithm

Funding

  1. National Natural Science Foundation of China [32071761, 31670642]


A hybrid feature selection approach was developed for estimating total nitrogen content (TNC) in Aquilaria sinensis leaves. The approach combines several feature selection methods with regression algorithms to improve the accuracy of the estimation models.
With the development of imaging devices and image processing algorithms, numerous features have come to be used for the estimation of total nitrogen content (TNC) in plants. However, higher-dimensional inputs contain more correlated variables that can detrimentally affect model performance. In this study, a hybrid feature selection approach was developed for TNC estimation in Aquilaria sinensis. A low-cost modified digital camera with external filters was used to capture canopy images. Three feature selection methods, namely random forest (RF), Pearson correlation coefficient (PCC)-based feature selection, and sequential backward selection (SBS), were combined into two hybrid feature selection algorithms (RF_SBS and PCC_SBS). In addition, three regression algorithms were used in the hybrid feature selection process: random forest regression (RFR), support vector regression (SVR), and partial least squares regression (PLSR). The hybrid feature selection process consists of two steps. First, the lowest number of dimensions is sought based on the feature ranking. Then, SBS is used to find the best feature combination. Compared with the original models, the R² values of the RF_SBS-based models are improved by 0.094 (RF_SBS_RFR), 0.190 (RF_SBS_SVR), and 0.116 (RF_SBS_PLSR), while the R² values of the PCC_SBS-based models are improved by 0.055 (PCC_SBS_RFR), 0.092 (PCC_SBS_SVR), and 0.128 (PCC_SBS_PLSR). Finally, the two best TNC estimation models are found to be PCC_SBS_PLSR, with an R² of 0.863, and RF_SBS_SVR, with an R² of 0.872. The proposed hybrid feature selection approach not only has great capacity to improve estimation accuracy but can also reduce model complexity by choosing the best feature subset. © 2021 IAgrE. Published by Elsevier Ltd. All rights reserved.
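
The two-step process described in the abstract (rank the features, find the smallest useful dimension, then refine with SBS) can be illustrated with a minimal sketch. This is not the authors' implementation. It covers only the PCC-based ranking, and it reads "the lowest number of dimensions" as "the shortest top-k prefix of the ranking with the best score", which is an assumption. The feature names, the sample values, and `toy_score` are hypothetical; in practice `toy_score` would be replaced by a cross-validated R² from RFR, SVR, or PLSR.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def rank_features(columns, y):
    """Rank feature names by |PCC| with the target, strongest first."""
    return sorted(columns, key=lambda f: abs(pearson(columns[f], y)), reverse=True)


def smallest_good_prefix(ranked, score):
    """Step 1 (assumed reading): shortest top-k prefix with the best score."""
    ks = range(1, len(ranked) + 1)
    best_k = max(ks, key=lambda k: (score(ranked[:k]), -k))
    return ranked[:best_k]


def sbs(features, score):
    """Step 2: sequential backward selection.

    Repeatedly drop the feature whose removal scores highest and
    remember the best subset seen along the way.
    """
    current = list(features)
    best, best_score = list(current), score(current)
    while len(current) > 1:
        candidates = [[f for f in current if f != drop] for drop in current]
        current = max(candidates, key=score)
        s = score(current)
        if s >= best_score:  # ties favour the smaller subset
            best, best_score = list(current), s
    return best, best_score


# Hypothetical data: four image features and a TNC-like target.
y = [1, 2, 3, 4, 5]
columns = {
    "a": [1, 2, 3, 4, 5],
    "b": [1, 2, 3, 5, 4],
    "c": [2, 1, 3, 5, 4],
    "d": [3, 1, 5, 2, 4],
}

# Hypothetical stand-in for a model's validation score (in hundredths):
# "b" is redundant with "a" and slightly hurts, "d" is mostly noise.
TOY_GAIN = {"a": 47, "b": -1, "c": 17, "d": -13}


def toy_score(subset):
    return sum(TOY_GAIN[f] for f in subset)


ranked = rank_features(columns, y)
reduced = smallest_good_prefix(ranked, toy_score)
subset, subset_score = sbs(reduced, toy_score)
print(ranked, reduced, subset, subset_score)
```

In this toy setup the ranking alone retains the redundant feature "b" because it correlates strongly with the target, and the SBS pass then removes it. That is the kind of correlated-variable pruning the abstract attributes to the hybrid approach.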
