Publisher: IGI Global
E-ISSN: 1942-9037
ISSN: 1942-9045
Source: International Journal of Software Science and Computational Intelligence (IJSSCI), Vol. 9, Iss. 2, 2017-04, pp. 1-13
Abstract
The rise of online P2P lending, a novel lending model, brings new opportunities and challenges for research on credit risk evaluation. This paper mines information from multiple data sources to improve the performance of credit risk evaluation models. Besides the personal financial and demographic data used in traditional models, the authors collect information from (1) text descriptions, (2) social networks, and (3) macroeconomic data, and they design methods to extract features from the unstructured data. To avoid the curse of dimensionality caused by too many features and to identify the key factors in credit risk, the authors remove irrelevant and redundant features through feature selection. Using data provided by Prosper.com, one of the biggest P2P lending platforms in the world, they show that (1) fusing information from different data sources yields better performance, measured by both AUC (area under the receiver operating characteristic curve) and classification accuracy; and (2) only ten features from different data sources are required to get better performance.
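As an illustration of the two evaluation measures named in the abstract, the following minimal Python sketch computes AUC and classification accuracy for a set of predicted default scores. It is not the authors' code: the function names, the 0.5 decision threshold, the convention that label 1 means default, and the toy labels and scores are all assumptions made for illustration and are not drawn from the Prosper.com data.

```python
def auc(labels, scores):
    """AUC computed as the fraction of (positive, negative) pairs in which
    the positive example receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of examples classified correctly when scores at or above
    the threshold (an assumed cutoff) are predicted as class 1."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical toy data: 1 = loan defaulted, 0 = loan repaid (assumed coding).
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8]

print(f"AUC = {auc(labels, scores):.3f}")
print(f"accuracy = {accuracy(labels, scores):.3f}")
```

AUC depends only on how the scores rank defaulters against non-defaulters, whereas accuracy depends on the chosen threshold, which is one reason to report both measures.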