Journal
BIOSYSTEMS ENGINEERING
Volume 213, Pages 89-104
Publisher
ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.biosystemseng.2021.11.021
Keywords
Total nitrogen content; Modified digital camera; Feature selection; Regression algorithm
Funding
- National Natural Science Foundation of China [32071761, 31670642]
A hybrid feature selection approach was developed for estimating total nitrogen content (TNC) in Aquilaria sinensis from canopy images. The approach combines several feature selection methods with several regression algorithms to improve the accuracy of the estimation models.
With the development of imaging devices and image processing algorithms, numerous features have come to be used for the estimation of total nitrogen content (TNC) in plants. However, higher-dimensional inputs contain more correlated variables, which can degrade model performance. In this study, a hybrid feature selection approach was developed for TNC estimation in Aquilaria sinensis. A low-cost modified digital camera with external filters was used to capture canopy images. Three feature selection methods, namely random forest (RF), Pearson correlation coefficient (PCC)-based feature selection, and sequential backward selection (SBS), were combined into two hybrid feature selection algorithms (RF_SBS and PCC_SBS). In addition, three regression algorithms were used in the hybrid feature selection process: random forest regression (RFR), support vector regression (SVR), and partial least squares regression (PLSR). The hybrid feature selection process consists of two steps. First, the lowest number of dimensions is sought based on the feature ranking. Then, SBS is used to find the best feature combination. Compared with the original models, the R² values of the RF_SBS-based models are improved by 0.094 (RF_SBS_RFR), 0.190 (RF_SBS_SVR), and 0.116 (RF_SBS_PLSR), while the R² values of the PCC_SBS-based models are improved by 0.055 (PCC_SBS_RFR), 0.092 (PCC_SBS_SVR), and 0.128 (PCC_SBS_PLSR). Finally, the two best TNC estimation models are found to be PCC_SBS_PLSR, with an R² of 0.863, and RF_SBS_SVR, with an R² of 0.872. The proposed hybrid feature selection approach not only improves estimation accuracy but also reduces model complexity by choosing the best feature subset. © 2021 IAgrE. Published by Elsevier Ltd. All rights reserved.
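The two-step process described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a PCC-style ranking, interprets the first step as keeping the smallest top-n prefix of the ranking that achieves the best cross-validated R², and substitutes ordinary least squares for the RFR, SVR, and PLSR regressors so that only NumPy is needed. The function names (pcc_rank, cv_r2, hybrid_select) and the synthetic data are hypothetical.

```python
import numpy as np

def pcc_rank(X, y):
    """Rank feature indices by |Pearson r| with the target, strongest first."""
    r = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    return [int(j) for j in np.argsort(r)[::-1]]

def cv_r2(X, y, cols, k=5, seed=0):
    """Pooled k-fold cross-validated R^2 of an OLS fit on the chosen columns.

    OLS is an assumed stand-in for the regressors named in the abstract.
    """
    folds = np.array_split(np.random.default_rng(seed).permutation(len(y)), k)
    A = np.column_stack([np.ones(len(y)), X[:, cols]])  # intercept + features
    pred = np.empty(len(y))
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        pred[test] = A[test] @ coef
    return 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)

def hybrid_select(X, y):
    ranking = pcc_rank(X, y)
    # Step 1 (assumed reading): score each top-n prefix of the ranking and
    # keep the best one; argmax returns the first maximum, i.e. the smallest n.
    scores = [cv_r2(X, y, ranking[:n]) for n in range(1, len(ranking) + 1)]
    n_best = int(np.argmax(scores)) + 1
    current = ranking[:n_best]
    best, best_score = list(current), scores[n_best - 1]
    # Step 2: sequential backward selection. At each round, drop the feature
    # whose removal gives the highest score, and remember the best subset seen.
    while len(current) > 1:
        trials = [(cv_r2(X, y, [c for c in current if c != d]), d) for d in current]
        score, drop = max(trials)
        current = [c for c in current if c != drop]
        if score >= best_score:  # ties favour the smaller subset
            best, best_score = list(current), score
    return sorted(best), best_score

# Synthetic example: only features 0 and 3 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 10))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.1, size=120)
selected, score = hybrid_select(X, y)
print(selected, round(float(score), 3))
```

Swapping cv_r2 for a scorer built on a different regressor, or pcc_rank for an importance-based ranking, would give the other algorithm pairings named in the abstract; the selection loop itself would stay the same.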