Article

Predicting the potential habitat of oaks with data mining models and the R system

Journal

ENVIRONMENTAL MODELLING & SOFTWARE
Volume 25, Issue 7, Pages 826-836

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.envsoft.2010.01.004

Keywords

Habitat modelling; Supervised classification; R system; Data mining models; Ensemble models; Classification trees; Neural networks; Oaks; Support vector machines

Funding

  1. Spanish Ministry of Education and Science [MTM2004-01433]
  2. Institute of Statistics of Andalusia [OG-154/07]
  3. Andalusia Environment Government [OG-096/01]

Oak forests are essential to the ecosystems of many countries, particularly when they are used in vegetation restoration. Models for predicting the potential habitat of oaks can therefore be a valuable tool for environmental work. To this end, this paper builds and compares data mining models for predicting the potential habitat of the oak forest type in Mediterranean areas (southern Spain), with conclusions applicable to other regions. Thirty-one environmental input variables were measured, and six base models for supervised classification problems were selected: linear and quadratic discriminant analysis, logistic regression, classification trees, neural networks and support vector machines. Three ensemble methods, based on the combination of classification tree models fitted from samples and sets of variables generated from the original data set, were also evaluated: bagging, random forests and boosting. The available data set was randomly split into three parts: training set (50%), validation set (25%), and test set (25%). The analysis of accuracy, sensitivity and specificity, together with the area under the ROC curve for the test set, reveals that the best models for our oak data set are bagging and random forests. All of these models can be fitted with free R programs that use the libraries and functions described in this paper. Furthermore, the methodology used in this study will allow researchers to determine the potential distribution of oaks in other kinds of areas. (C) 2010 Elsevier Ltd. All rights reserved.
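
The evaluation protocol described in the abstract (a random 50%/25%/25% split, then accuracy, sensitivity, specificity and area under the ROC curve on the test set) can be illustrated with a minimal sketch. It is written in Python with the standard library only and is not the authors' R code. The function names `split_50_25_25`, `confusion_metrics` and `auc` are hypothetical, the labels are assumed to be coded 0/1 with both classes present, and the AUC is computed with the rank-based (Mann-Whitney) formulation, which may differ from the routine used in the paper.

```python
import random


def split_50_25_25(n, seed=0):
    """Shuffle indices 0..n-1 and split them into 50% / 25% / 25% parts.

    Returns (train, validation, test) index lists. The seed is an
    illustrative assumption; the paper only states that the split is random.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = n // 2
    n_val = n // 4
    return (
        idx[:n_train],
        idx[n_train:n_train + n_val],
        idx[n_train + n_val:],
    )


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for 0/1 labels.

    Assumes both classes occur in y_true, otherwise a division by zero
    is raised.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive case gets the higher score; ties count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

In such a workflow the classifier would be fitted on the training indices, tuned on the validation indices, and scored once on the test indices with `confusion_metrics` and `auc`.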
