Journal
APPLIED SOFT COMPUTING
Volume 62, Pages 807-816
Publisher
ELSEVIER
DOI: 10.1016/j.asoc.2017.09.010
Keywords
Decision tree; Differential privacy; Ensemble; Maximal vote
Funding
- Fundamental Research Funds for the Central Universities [30916015104]
- Chinese National Natural Science Foundation [91646116]
- National Key Research and Development Program [2016YFE0108000]
- Key Research and Development Program [SBE2017741114, SBE2017030519]
- Scientific and Technological Support Project (Society) of Jiangsu Province [BE2016776]
In decision tree classification with differential privacy, computing impurity metrics such as information gain and the Gini index is query intensive, and more queries imply more noise addition. A straightforward implementation of differential privacy therefore often yields poor accuracy and stability. This motivates us to adopt a better impurity metric for evaluating attributes when building the tree structure recursively. In this paper, we first give a detailed analysis of the statistical queries involved in decision tree induction. Second, we propose a private decision tree algorithm based on the noisy maximal vote, together with an effective privacy budget allocation strategy. Third, to boost accuracy and improve stability, we construct an ensemble model in which multiple private decision trees are built on bootstrapped samples. Extensive experiments on real datasets demonstrate that the proposed ensemble model provides accurate and reliable classification results. (C) 2017 Elsevier B.V. All rights reserved.
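The noisy-maximal-vote selection the abstract describes can be sketched as a report-noisy-max step: perturb each candidate attribute's vote count with Laplace noise, then pick the maximum. The attribute names, the counts, and the Laplace(1/epsilon) noise scale below are illustrative assumptions, not the paper's exact mechanism or budget allocation.

```python
import random

def noisy_max_vote(counts, epsilon):
    """Report-noisy-max: return the candidate whose Laplace-perturbed
    count is largest. For counting queries of sensitivity 1, adding
    Laplace(1/epsilon) noise makes this selection differentially private."""
    def laplace(scale):
        # A Laplace(0, scale) sample as the difference of two exponentials.
        return scale * (random.expovariate(1.0) - random.expovariate(1.0))
    noisy = {c: n + laplace(1.0 / epsilon) for c, n in counts.items()}
    return max(noisy, key=noisy.get)

# Hypothetical vote counts for candidate split attributes at one tree node.
votes = {"age": 120, "income": 95, "zipcode": 40}
print(noisy_max_vote(votes, epsilon=1.0))
```

Because only the identity of the (noisy) maximum is released, a single budget portion covers the whole split decision, which is why this style of selection needs fewer queries than computing a noisy impurity score per attribute.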