

Author: Bi Yaxin
Publisher: Taylor & Francis Ltd
ISSN: 1087-6545
Source: Applied Artificial Intelligence, Vol. 21, Iss. 3, 2007-03, pp. 211-239
Abstract
In this paper we investigate the combination of four machine learning methods for text categorization using Dempster's rule of combination. These methods are Support Vector Machine (SVM), k-Nearest Neighbor (kNN), a kNN model-based approach (kNNM), and Rocchio. We first present a general representation of the outputs of different classifiers, modeling each output as a piece of evidence by means of a novel evidence structure called the focal element triplet. We then investigate an effective method for combining pieces of evidence derived from classifiers generated by 10-fold cross-validation. Finally, we evaluate our methods on the 20-newsgroup and Reuters-21578 benchmark data sets and compare them with majority voting for combining multiple classifiers, as well as with previous results. Our experimental results show that the best combined classifier improves on the performance of the individual classifiers and that Dempster's rule of combination outperforms majority voting in combining multiple classifiers.
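As a rough illustration of the combination rule named in the abstract, the following minimal sketch applies the standard textbook form of Dempster's rule to two mass functions. It is not the paper's implementation: the function name `dempster_combine`, the category labels, the mass values, and the three-entry layout of each mass function (two single-category hypotheses plus the whole frame) are all assumptions made for this example, since the abstract does not define the focal element triplet in detail.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of categories
    (a focal element) to its mass; the masses are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence: the product supports the intersection.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Disjoint focal elements contribute to the conflict mass K.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Normalise by 1 - K so the combined masses sum to 1 again.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


# Hypothetical outputs of two classifiers for one document (illustrative values).
frame = frozenset({"sport", "politics", "science"})
m_svm = {frozenset({"sport"}): 0.6, frozenset({"politics"}): 0.3, frame: 0.1}
m_knn = {frozenset({"sport"}): 0.5, frozenset({"science"}): 0.3, frame: 0.2}

fused = dempster_combine(m_svm, m_knn)
# Pick the single category with the highest combined mass.
best = max((s for s in fused if len(s) == 1), key=fused.get)
```

In this toy case both sources favour the same category, so the combined mass on it is higher than either source assigned on its own. Majority voting, by contrast, would count only each classifier's top choice and discard the graded support for the other categories.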