Article

Contrast trees and distribution boosting

Publisher

NATL ACAD SCIENCES
DOI: 10.1073/pnas.1921562117

Keywords

machine learning; prediction diagnostics; boosting; quantile regression; conditional distribution estimation

A method for decision tree induction is presented. Given a set of predictor variables x = (x_1, x_2, ..., x_p) and two outcome variables y and z associated with each x, the goal is to identify those values of x for which the respective distributions of y | x and z | x, or selected properties of those distributions such as means or quantiles, are most different. Contrast trees provide a lack-of-fit measure for statistical models of such statistics, or for the complete conditional distribution p_y(y | x), as a function of x. They are easily interpreted and can be used as diagnostic tools to reveal and then understand the inaccuracies of models produced by any learning method. A corresponding contrast-boosting strategy is described for remedying any uncovered errors, thereby producing potentially more accurate predictions. This leads to a distribution-boosting strategy for directly estimating the full conditional distribution of y at each x under no assumptions concerning its shape, form, or parametric representation.
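The core idea in the abstract, finding regions of x where y and z differ most, can be illustrated with a toy single-split search. This is only a minimal sketch under stated assumptions, not the paper's algorithm: the function names (`mean_discrepancy`, `best_contrast_split`), the `min_leaf` parameter, the use of a difference in means as the discrepancy, and the split score (product of the two region fractions times the larger of the two region discrepancies) are all choices made for this illustration.

```python
import numpy as np


def mean_discrepancy(y, z):
    """Absolute difference between the means of y and z within one region."""
    return abs(np.mean(y) - np.mean(z))


def best_contrast_split(x, y, z, min_leaf=20):
    """Scan a single predictor for the split that best isolates a region
    where y and z disagree.

    The score used here (an assumption of this sketch) is
        f_left * f_right * max(d_left, d_right)
    where f_* are the fractions of points in each region and d_* are the
    mean discrepancies inside them.

    Returns (threshold, score), or (None, -inf) if no valid split exists.
    """
    order = np.argsort(x)
    x, y, z = x[order], y[order], z[order]
    n = len(x)
    best_threshold, best_score = None, -np.inf
    for i in range(min_leaf, n - min_leaf + 1):
        if x[i - 1] == x[i]:
            continue  # cannot place a threshold between equal values
        d_left = mean_discrepancy(y[:i], z[:i])
        d_right = mean_discrepancy(y[i:], z[i:])
        score = (i / n) * ((n - i) / n) * max(d_left, d_right)
        if score > best_score:
            best_threshold = 0.5 * (x[i - 1] + x[i])
            best_score = score
    return best_threshold, best_score


# Synthetic data (hypothetical): z matches y except where x >= 140,
# where z is shifted upward by 2.
x = np.arange(200, dtype=float)
y = np.sin(x / 10.0)
z = y + np.where(x >= 140, 2.0, 0.0)

threshold, score = best_contrast_split(x, y, z)
print(threshold, score)
```

On this synthetic data the selected threshold falls at the boundary of the region where z was shifted, which is the kind of localized disagreement a contrast tree is meant to expose. A full tree would apply such a search recursively over all predictors, and the discrepancy could equally compare quantiles or whole distributions rather than means.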
