
Obtaining Calibrated Probabilities from Boosting

A. Niculescu-Mizil and R. Caruana, Proc. 21st Conf. on Uncertainty in AI, 2005.

We are all now familiar with the idea that Support Vector Machines and Boosting are state-of-the-art methods for pattern classification. However, a basic appreciation of how these algorithms work, particularly when they are compared on a theoretical basis with older methods such as Neural Networks and Nearest Neighbour classifiers, should motivate us to ask questions. These newer techniques generally have an advantage on large-scale problems, but that advantage is bought at the cost of compromising the underlying probabilistic theory: their raw outputs are scores and margins, not probabilities. They are quick and dirty.

Optimal algorithms are based on probabilistic principles: they must either estimate the class-membership probabilities directly, or position their decision boundaries consistently with them. Despite its rather restricted title, this paper contains all the evidence you need on how to obtain such meaningful probabilistic outputs from a wide variety of methods, including those already mentioned, plus Random Forests and Bagging.
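One of the calibration methods the paper evaluates for turning raw scores into probabilities is Platt scaling: fit a sigmoid P(y=1|s) = 1/(1+exp(As+B)) to held-out scores by maximum likelihood. The sketch below is a minimal, simplified version (plain gradient descent, raw 0/1 targets rather than Platt's regularised targets; the function name and learning-rate defaults are my own):

```python
import math

def platt_scale(scores, labels, lr=0.05, iters=10000):
    """Fit P(y=1|s) = 1 / (1 + exp(A*s + B)) by gradient descent on the
    negative log-likelihood. Simplified Platt scaling: uses raw 0/1 labels
    instead of Platt's smoothed targets, and fixed-step gradient descent."""
    A, B = 0.0, 0.0
    n = len(scores)
    for _ in range(iters):
        gA = gB = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(A * s + B))
            # For p = sigmoid(-(A*s + B)), d(NLL)/dA = (p - y) * (-s)
            # and d(NLL)/dB = (p - y) * (-1).
            gA += (p - y) * -s
            gB += (p - y) * -1.0
        A -= lr * gA / n
        B -= lr * gB / n
    return lambda s: 1.0 / (1.0 + math.exp(A * s + B))

# Toy usage: uncalibrated classifier scores with one noisy label.
scores = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
calibrate = platt_scale(scores, labels)
```

The fitted map is monotone in the score, so it changes the probability estimates without changing the ranking of examples; isotonic regression, the paper's other main calibration method, drops the sigmoid shape assumption at the cost of needing more calibration data.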

A critical assessment of why particular algorithms behave as shown here may help to refine your idea of the state of the art, and of the valuable role the older techniques still have to play in data processing.

NAT 1/2/2013

Page last modified on February 01, 2013, at 04:14 PM