Web Mining: A Synergic Approach Resorting to Classifications and Clustering (River Publishers Series in Information Science and Technology)

Publication series: River Publishers Series in Information Science and Technology

Author: Kumbhar, V.S.

Publisher: River Publishers

Publication year: 2016

E-ISBN: 9788793379848

P-ISBN(Paperback): 9788793379831

Subject: TP274 Data processing, data processing systems

Keyword: Computing technology, computer technology; automation technology, computer technology

Language: ENG

Description

Web mining is the application of data mining techniques to extract knowledge from web information, i.e. web content, web structure, and web usage data. With the emergence of the web as the predominant and converging platform for communication, business and scholarly information dissemination, especially in the last five years, a growing number of research groups are working on different aspects of web mining, mainly in three directions: mining of web content, web structure and web usage. In this context there are a good number of frameworks and benchmarks related to website metrics, which are certainly important for B2B, B2C and e-commerce in general. Owing to the popularity of this topic there are a few books on the market, dealing mostly with such performance metrics and related issues. This book, however, omits all such routine topics and places more emphasis on the classification and clustering of websites, in order to arrive at a true picture of the websites in light of their usability. In a nutshell, Web Mining: A Synergic Approach Resorting to Classifications and Clustering showcases an effective methodology for classifying and clustering websites from the usability point of view. While the clustering and classification are accomplished using the open source tool WEKA, the basic dataset for the selected websites has been generated using a free tool, site-analyzer. As a case study, several commercial websites

Chapter

1.9.3 Science and Engineering

1.9.4 Human Rights

1.9.5 Medical Data Mining

1.9.6 Spatial Data Mining

1.9.7 Challenges in Spatial Mining

1.9.8 Temporal Data Mining

1.9.9 Sensor Data Mining

1.9.10 Visual Data Mining

1.9.11 Music Data Mining

1.9.12 Pattern Mining

1.9.13 Subject-based Data Mining

1.9.14 Knowledge Grid

1.10 Trends in Data Mining

1.10.1 Application Exploration

1.10.2 Scalable and Interactive Data Mining Methods

1.10.3 Integration of Data Mining with Database Systems, Data Warehouse Systems, and Web Database Systems

1.10.4 Standardization of Data Mining Query Language

1.10.5 Visual Data Mining

1.10.6 New Methods for Mining Complex Types of Data

1.10.7 Biological Data Mining

1.10.8 Data Mining and Software Engineering

1.10.9 Web Mining

1.10.10 Distributed Data Mining

1.10.11 Real-Time Data Mining

1.10.12 Multi-Database Data Mining

1.10.13 Privacy Protection and Information Security in Data Mining

1.11 Classification Techniques in Data Mining

1.11.1 Definition of the Classification

1.11.2 Issues Regarding Classification

1.11.3 Evaluation Methods for Classification

1.11.4 Classifications Techniques

1.11.4.1 Tree structure

1.11.4.2 Rule-based algorithm

1.11.4.3 Distance-based algorithms

1.11.4.4 Neural networks-based algorithms

1.11.4.5 Statistical-based algorithms

1.12 Applications of Classifications

1.12.1 Target Marketing

1.12.2 Disease Diagnosis

1.12.3 Supervised Event Detection

1.12.4 Multimedia Data Analysis

1.12.5 Biological Data Analysis

1.12.6 Document Categorization and Filtering

1.12.7 Social Network Analysis

1.13 WEKA: An Effective Tool for Data Mining

1.13.1 Main Features of the Weka

1.13.2 Weka Interface

1.13.3 Weka for Classification

1.13.3.1 Selecting a classifier

1.13.3.2 Test options

1.14 What We Aim to Cover Through the Present Book

Chapter 2 - Current Literature Assessment in Data and Web Mining

2.1 Big Data and Its Mining

2.2 Data-Processing Basics

2.3 Data Mining

2.4 Pioneering Work

2.5 Algorithms Used in Data Mining

2.6 Classification and Mining

2.7 Performance Metrics of Classification/Mining

2.8 Data Mining for Web

2.9 Categories of Web Data Mining

2.10 Radial Basis Function Networks

2.11 J48 Decision Tree

2.12 Naive Bayes

2.13 Support Vector Machine (SVM)

2.14 Conclusion and Way Forward

Chapter 3 - DataSet Creation for Web Mining

3.1 Introduction

3.2 Web Mining—Emerging Model of Business

3.2.1 Introduction to Web Mining

3.3 Tools Used for Acquisition of Parameters

3.3.1 Accessibility

3.3.2 Design

3.3.3 Texts

3.3.4 Multimedia

3.3.5 Networking

3.4 Difficulties Encountered

3.4.1 Internet Problem

3.4.2 Preparation and Selection of Websites

3.4.3 Difficulty in Selecting Analysis Tool

3.4.4 Unavailability of Data

3.5 Flowchart

3.6 Freezing Parameters

3.6.1 Data Preprocessing

3.6.1.1 Data Preprocessing Techniques

3.6.2 Preprocessing and Filtering

3.6.2.1 Preprocessed and Filtered Overall Data

3.6.2.2 Preprocessed and Filtered Web Accessibility Data

3.6.2.3 Preprocessed and Filtered Design Data

3.6.2.4 Preprocessed and Filtered Texts Data

3.6.2.5 Preprocessed and Filtered Multimedia Data

3.6.2.6 Preprocessed and Filtered Networking Data

3.7 Way Forward

Chapter 4 - Classification of Websites

4.1 Introduction

4.1.1 Accessibility

4.1.2 Design

4.1.3 Texts

4.1.4 Multimedia

4.1.5 Networking

4.2 Classification of Websites on Accessibility

4.2.1 Dataset

4.2.2 Clustering

4.2.3 Clustered Instances

4.2.4 Classification Via Clustering

4.2.4.1 Classification via clustering using J48 algorithm

4.2.4.2 Classification via clustering using RBFNetwork algorithm

4.2.4.3 Classification via clustering using NaiveBayes algorithm

4.2.4.4 Classification via clustering using SMO algorithm

4.2.4.5 Comparison of above classification algorithms

4.3 Classification Based on Website Design

4.3.1 Attribute Selection

4.3.2 Clustering

4.3.3 Cluster Analysis

4.3.4 Classification Through Clustering

4.3.4.1 Classification via clustering using J48 algorithm

4.3.4.2 Classification via clustering using RBFNetwork algorithm

4.3.4.3 Classification via clustering using NaiveBayes algorithm

4.3.4.4 Classification via clustering using SMO algorithm

4.3.4.5 Comparison of above classification algorithms

4.4 Classification Based on Text

4.4.1 Feature Selection

4.4.2 Clustering

4.4.3 Cluster Analysis

4.4.4 Classification Through Clustering

4.4.4.1 Classification via clustering using J48 algorithm

4.4.4.2 Classification via clustering using RBFNetwork algorithm

4.4.4.3 Classification via clustering using NaiveBayes algorithm

4.4.4.4 Classification via clustering using SMO algorithm

4.4.4.5 Comparison of above classification algorithms

4.5 Classification Based on Multimedia Content of Websites

4.5.1 Feature Selection

4.5.2 Clustering

4.5.3 Cluster Analysis

4.5.4 Classification Through Clustering

4.5.4.1 Classification via clustering using J48 algorithm

4.5.4.2 Classification via clustering using RBFNetwork algorithm

4.5.4.3 Classification via clustering using NaiveBayes algorithm

4.5.4.4 Classification via clustering using SMO algorithm

4.5.4.5 Comparison of above classification algorithm

4.6 Classification Based on Network Analysis of Webpage

4.6.1 Feature Selection

4.6.2 Clustering

4.6.3 Observations

4.6.4 Classification Through Clustering

4.6.4.1 Classification via clustering using J48 algorithm

4.6.4.2 Classification via clustering using RBFNetwork algorithm

4.6.4.3 Classification via clustering using NaiveBayes algorithm

4.6.4.4 Classification via clustering using SMO algorithm

4.6.4.5 Comparison of the above classification algorithm

4.7 Classification of Websites Using Overall Performance

4.7.1 Clustering

4.7.2 Cluster Analysis

4.7.3 Classification Via Clustering

4.7.3.1 Classification via clustering using J48 algorithm

4.7.3.2 Classification via clustering using RBFNetwork algorithm

4.7.3.3 Classification via clustering using NaiveBayes algorithm

4.7.3.4 Classification via clustering using SMO algorithm

4.7.3.5 Comparison of the above classification algorithms

4.8 Results at a Glance and Conclusion

4.9 Summary and Future Directions

Index

About the Authors

Back Cover
