COMBINING MULTIPLE CLASSIFIERS USING DEMPSTER'S RULE FOR TEXT CATEGORIZATION

Author: Bi Yaxin  

Publisher: Taylor & Francis Ltd

ISSN: 1087-6545

Source: Applied Artificial Intelligence, Vol. 21, Iss. 3, 2007-03, pp. 211-239

Abstract

In this paper we investigate the combination of four machine learning methods for text categorization using Dempster's rule of combination. These methods are Support Vector Machine (SVM), k-Nearest Neighbor (kNN), the kNN model-based approach (kNNM), and Rocchio. We first present a general representation of the outputs of different classifiers, modeling each output as a piece of evidence by means of a novel evidence structure called the focal element triplet. We then investigate an effective method for combining pieces of evidence derived from classifiers generated by 10-fold cross-validation. Finally, we evaluate our methods on the 20-newsgroup and Reuters-21578 benchmark data sets and compare them with majority voting for combining multiple classifiers, as well as with previous results. Our experimental results show that the best combined classifier can improve on the performance of the individual classifiers and that Dempster's rule of combination outperforms majority voting in combining multiple classifiers.
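
To make the combination step concrete, the following is a minimal sketch of Dempster's rule of combination applied to two classifier outputs. It is an illustration under stated assumptions, not the paper's implementation: the function name `dempster_combine`, the category labels, and the mass values are all hypothetical, and each classifier output is assumed here to assign mass to two single categories plus the whole frame of discernment, which may differ from how the paper defines its focal element triplet.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of categories
    (a focal element) to its mass; the masses in each dict sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence: the product supports the intersection.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal elements contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0 - 1e-12:
        raise ValueError("Sources are in total conflict; the rule is undefined.")
    # Normalize by the non-conflicting mass so the result sums to 1.
    norm = 1.0 - conflict
    return {focal: mass / norm for focal, mass in combined.items()}


# Hypothetical frame of discernment and classifier outputs (illustrative only).
FRAME = frozenset({"sport", "politics", "science"})
m_svm = {frozenset({"sport"}): 0.6, frozenset({"politics"}): 0.3, FRAME: 0.1}
m_knn = {frozenset({"sport"}): 0.5, frozenset({"science"}): 0.3, FRAME: 0.2}

fused = dempster_combine(m_svm, m_knn)
# The final decision is the focal element with the largest combined mass.
best = max(fused, key=fused.get)
print(sorted(best), round(fused[best], 4))
```

Because the rule is associative and commutative, more than two classifiers can be fused by applying `dempster_combine` repeatedly, for example with `functools.reduce`. Majority voting, by contrast, would use only each classifier's top-ranked category and discard the mass assigned to the alternatives.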