

Author: Song Qiang
Publisher: Inderscience Publishers
ISSN: 1743-8187
Source: International Journal of Business Intelligence and Data Mining, Vol. 7, Iss. 3, 2012-10, pp. 152-171
Abstract
Now that the Internet is well established, the information available on the Web grows richer every day and has become increasingly useful in daily life. As a result, it is now very common to refer to information on the Web, particularly when writing documents or programs. When we later want to revisit the same Web pages in order to modify part of a file, it can be very hard to track down the pages originally referred to. In this paper, we propose methods for extracting relationships between files and Web pages based on the co-occurrence of data in Web-access logs and file-access logs. These relationships make it easier to revisit the Web pages related to a target file. The two types of access logs can be merged for co-occurrence analysis in two ways, which trade accuracy against execution time; we call them the Pre-Merge and Post-Merge methods. We evaluated both methods using actual access logs.
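To make the co-occurrence idea concrete, the following minimal sketch counts how often a file and a Web page are accessed close together in time. It is an illustration only and does not reproduce the Pre-Merge or Post-Merge methods, whose details the abstract does not give. The log format (time-sorted `(timestamp, target)` tuples), the function name `cooccurrence_counts`, and the 300-second window are all assumptions made for this example.

```python
import heapq
from collections import Counter, deque


def cooccurrence_counts(file_log, web_log, window=300.0):
    """Count (file, url) pairs accessed within `window` seconds of each other.

    Both logs are assumed (hypothetically) to be lists of
    (timestamp_in_seconds, target) tuples sorted by timestamp.
    """
    # Merge the two logs into one time-ordered stream, tagging each entry.
    merged = heapq.merge(
        ((t, "file", target) for t, target in file_log),
        ((t, "web", target) for t, target in web_log),
    )
    recent = deque()  # entries still inside the time window
    counts = Counter()
    for t, kind, target in merged:
        # Drop entries that are too old to co-occur with the current one.
        while recent and t - recent[0][0] > window:
            recent.popleft()
        # Pair the current entry with every recent entry of the other kind.
        for _, other_kind, other in recent:
            if other_kind != kind:
                pair = (target, other) if kind == "file" else (other, target)
                counts[pair] += 1
        recent.append((t, kind, target))
    return counts


if __name__ == "__main__":
    files = [(0, "report.txt"), (1000, "main.c")]
    pages = [(100, "http://a.example/"), (950, "http://b.example/")]
    for (path, url), n in cooccurrence_counts(files, pages).items():
        print(path, url, n)
```

Pages that score highly for a given file would be the candidates to suggest when that file is reopened; how the scores are weighted or thresholded is left open here.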